FRACTAL IMAGE CODING
Po-kai Chen, Lizabeth Li
pokai@, lizli@
1. INTRODUCTION
Fractal coding employs an unconventional method of representing the original image with a series of transformations that map image blocks to smaller, similar blocks within the image. When recursively iterated on any initial image, the contractive transformations produce a sequence of images that will converge to an approximation of the original [1, 2].
The mappings considered in this paper are discrete, contractive block transformations involving either spatial contraction or the transformation of pixel values and locations, explored in section 2. The bulk of the encoding time is spent finding block transformations that produce the lowest root mean square error. To improve coding efficiency and time, in section 3 we classify certain types of blocks in order to reduce the number of allowable transformations performed on those blocks. After encoding, the transformations with the least root mean square error are transmitted using a method we describe in section 4. On the decoder end, the application of these transformations to an initial image will reach convergence in roughly 4 iterations. Section 5 contains sample images and results, including rate-distortion curves compared with JPEG and JPEG-2000. We present our concluding remarks in section 6.
2. BLOCK MAPPING AND PROCESSING
2.1. Image Partitions
The first step in encoding the image involves partitioning the image into non-overlapping B x B pixel blocks called range blocks. For each range block, we search a pool of 2B x 2B domain blocks from the original M x M image for the most similar domain block. These domain blocks are generated by sliding a 2B x 2B window across the original image with spacing δ, where 1 ≤ δ ≤ B. For an M x M image, there will be $\left(\frac{M - 2B}{\delta} + 1\right)^2$ domain blocks. We may utilize δ > 1 in order to speed up the algorithm.
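As a quick check on the size of the domain pool, the count above can be evaluated directly. This is a minimal sketch; the function name is illustrative, and the use of integer (floor) division when δ does not divide M − 2B evenly is an assumption, since the paper does not say how that case is handled.

```python
def domain_block_count(M: int, B: int, delta: int) -> int:
    """Number of 2B x 2B windows on an M x M image when sliding with step delta.

    Assumption: floor division is used when delta does not divide (M - 2B).
    """
    if delta < 1:
        raise ValueError("delta must be at least 1")
    if M < 2 * B:
        raise ValueError("image must be at least 2B pixels wide")
    positions_per_axis = (M - 2 * B) // delta + 1
    return positions_per_axis ** 2


if __name__ == "__main__":
    # Parameters taken from section 5 of the paper: M = 128, B = 4.
    for delta in (1, 2, 4):
        print(delta, domain_block_count(128, 4, delta))
```

With M = 128 and B = 4, moving from δ = 1 to δ = 4 shrinks the pool from 14641 to 961 domain blocks, which is where the speed-up comes from.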
Another parameter imposed by our algorithm on the search for domain blocks is the allowable search distance. In order to decrease encoding time, we limit the search distance for domain blocks so that the algorithm does not search through the entire image for a minimal-distortion domain block. If no domain block can be found that produces a root mean square error below an error threshold, we split the B x B parent range block into four B/2 x B/2 child blocks and rerun the search, now looking for domain blocks that are B x B in size. Child blocks are especially useful in capturing more detail in complex parts of an image, while parent blocks represent an efficient way of encoding areas of uniform pixel value. Fig. 1 represents the partitioning and mapping explained above.
Fig. 1. Parent and child block partitioning and mapping.
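The parent-to-child decision described above can be sketched as follows. This is only an illustration: `find_best_match` is a hypothetical callback standing in for the domain search (it is assumed to return the best transform parameters and their RMSE for a range block at a given position and size), and the returned tuple layout is likewise an assumption.

```python
from typing import Any, Callable, List, Tuple

# (top, left, size, params); the layout is hypothetical, chosen for illustration.
BlockCode = Tuple[int, int, int, Any]


def encode_range_block(
    top: int,
    left: int,
    B: int,
    find_best_match: Callable[[int, int, int], Tuple[Any, float]],
    rmse_threshold: float,
) -> List[BlockCode]:
    """Encode one B x B parent block, splitting into four children if needed."""
    params, rmse = find_best_match(top, left, B)
    if rmse < rmse_threshold:
        # A good enough match exists at the parent level: keep one code.
        return [(top, left, B, params)]
    # Otherwise rerun the search on the four B/2 x B/2 child blocks.
    half = B // 2
    codes: List[BlockCode] = []
    for dy in (0, half):
        for dx in (0, half):
            child_params, _ = find_best_match(top + dy, left + dx, half)
            codes.append((top + dy, left + dx, half, child_params))
    return codes
```

Only one level of splitting is shown, matching the parent and child blocks described in the text.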
2.2. Transformations
Two basic transformations exist in fractal image coding. The first, the geometric transformation, maps from the larger domain block to the range block. If no other operations occur, this transformation results in each range block pixel being the average of the 4 corresponding pixels in the domain block.
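The averaging step can be written compactly with NumPy. This is a minimal sketch, assuming the "4 corresponding pixels" are the non-overlapping 2 x 2 neighborhoods of the domain block; the function name is illustrative.

```python
import numpy as np


def shrink_domain_block(domain: np.ndarray) -> np.ndarray:
    """Map a 2B x 2B domain block to B x B by averaging each 2 x 2 group."""
    h, w = domain.shape
    if h % 2 or w % 2:
        raise ValueError("domain block sides must be even")
    # Split each axis into (block index, offset within the 2 x 2 group),
    # then average over the two offset axes.
    grouped = domain.astype(np.float64).reshape(h // 2, 2, w // 2, 2)
    return grouped.mean(axis=(1, 3))


if __name__ == "__main__":
    print(shrink_domain_block(np.arange(16).reshape(4, 4)))
```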
In addition to the geometric transformation performed for every range block, massic transformations may be employed. Massic transformations include transformations that change either pixel values or pixel locations, as listed below:
1. Change of Pixel Values
(a) Absorption to gray level β
(b) DC gray level shift by ∆c
(c) Contrast scaling by α
2. Change of Pixel Locations
(a) Identity
(b) Reflection about vertical axis
(c) Reflection about horizontal axis
(d) Reflection about first diagonal
(e) Reflection about second diagonal
(f) 90° rotation around center
(g) 180° rotation around center
(h) −90° rotation around center
The algorithm searches for the transformations that produce the lowest distortion, measured as the root mean square error between the transformed domain block and the target range block.
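The eight pixel-location transformations and the RMSE-based selection can be sketched with NumPy as below. The list order follows (a) through (h) above, but which diagonal counts as "first" and which rotation direction counts as positive are assumptions, and the names `ISOMETRIES`, `rmse`, and `best_isometry` are illustrative.

```python
import numpy as np

# Order follows items (a)-(h); diagonal naming and rotation sign are assumptions.
ISOMETRIES = [
    lambda b: b,                 # (a) identity
    lambda b: np.fliplr(b),      # (b) reflection about vertical axis
    lambda b: np.flipud(b),      # (c) reflection about horizontal axis
    lambda b: b.T,               # (d) reflection about main diagonal
    lambda b: np.rot90(b, 2).T,  # (e) reflection about anti-diagonal
    lambda b: np.rot90(b, 1),    # (f) 90 degree rotation (counterclockwise)
    lambda b: np.rot90(b, 2),    # (g) 180 degree rotation
    lambda b: np.rot90(b, -1),   # (h) -90 degree rotation
]


def rmse(a: np.ndarray, b: np.ndarray) -> float:
    """Root mean square error between two equally sized blocks."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))


def best_isometry(candidate: np.ndarray, range_block: np.ndarray):
    """Return (index, error) of the isometry of `candidate` closest to `range_block`."""
    errors = [rmse(f(candidate), range_block) for f in ISOMETRIES]
    index = int(np.argmin(errors))
    return index, errors[index]
```

Here `candidate` is assumed to be a domain block that has already been shrunk to the range block size.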
3. DOMAIN BLOCK POOLS
To improve the efficiency of the algorithm, we perform a directed search by classifying the source domain blocks into 3 different types: shade, midrange, and edge.
Shade blocks refer to blocks that are approximately uniform in pixel values. They are fundamentally different from other domain blocks because we can directly evaluate whether or not a range block belongs to this category rather than searching for a domain block to map from. Our algorithm calculates the difference between the maximum and minimum pixel value within a block, and classifies it as a shade block if the difference is under a certain threshold. For the sample images presented in this paper, the difference was required to be strictly less than 2, on a gray scale between 0 and 255. Once a block is classified as a shade block, the algorithm transmits the DC gray level β of that block, quantized to 6 bits within the range [0, 255]. This transformation is the quickest mapping of the algorithm since no search has to be made.
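The shade test and the 6-bit quantization of β might look like the sketch below. Two points are assumptions not stated in the paper: β is taken to be the block mean, and the quantizer is a plain uniform one over [0, 255]. Function names are illustrative.

```python
import numpy as np


def is_shade_block(block: np.ndarray, threshold: int = 2) -> bool:
    """True if (max - min) within the block is strictly below the threshold."""
    return int(block.max()) - int(block.min()) < threshold


def quantize_uniform(value: float, lo: float, hi: float, bits: int):
    """Uniform quantizer (an assumption): returns (index, reconstructed value)."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    index = int(round((value - lo) / step))
    index = min(max(index, 0), levels - 1)
    return index, lo + index * step


if __name__ == "__main__":
    block = np.full((4, 4), 100, dtype=np.uint8)
    if is_shade_block(block):
        # Assumption: the DC gray level is the mean of the block.
        print(quantize_uniform(float(block.mean()), 0.0, 255.0, 6))
```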
Midrange blocks refer to textured domain blocks that exhibit a slight gradient in pixel values but no defined edge. Edge blocks have a strong gradient in pixel values. The actual processing of midrange and edge blocks overlaps. Everything that is not a shade block is classified as a midrange or edge block. Both midrange and edge blocks undergo contrast scaling and DC shift operations. The contrast scaling factor, α, is chosen to equalize the dynamic range between the domain block and range block, where dynamic range is defined as the difference between maximum and minimum pixel values within the block. To maintain contractivity and lower entropy respectively, we choose α to be strictly less than 1, and we quantize its value to 2 bits. The DC shift level, ∆c, is chosen afterwards so that the mean pixel values of the domain and range blocks (after contrast scaling) are equal. This DC shift level is chosen to be in the range [-128, 128] and quantized to 6 bits. At this point, our algorithm chooses the one of the eight transformations outlined in section 2 that minimizes the root mean square error between the domain block and range block. If the identity transformation is chosen, the domain block has essentially been classified as a midrange block. Any other transformation marks the block as an edge block. Fig. 2 shows examples of operations performed on possible domain blocks.
Fig. 2. Example transformations performed on shade, midrange, and edge blocks respectively.
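The choice of α and ∆c for midrange and edge blocks can be sketched as follows. The four values in `ALPHA_LEVELS` are hypothetical (the paper only states that α is below 1 and takes 2 bits), the input domain block is assumed to be already shrunk to the range block size, and the 6-bit quantization of ∆c is omitted for brevity.

```python
import numpy as np

# Hypothetical 2-bit codebook for alpha; the paper does not list its values.
ALPHA_LEVELS = (0.25, 0.5, 0.75, 0.9)


def dynamic_range(block: np.ndarray) -> float:
    """Difference between maximum and minimum pixel values in the block."""
    return float(block.max()) - float(block.min())


def massic_parameters(domain_small: np.ndarray, range_block: np.ndarray,
                      alpha_levels=ALPHA_LEVELS):
    """Return (alpha, shift) matching dynamic range first, then the mean."""
    d_range = dynamic_range(domain_small)
    target = dynamic_range(range_block) / d_range if d_range > 0 else 0.0
    # Pick the allowed alpha (< 1) closest to the ratio of dynamic ranges.
    alpha = min(alpha_levels, key=lambda a: abs(a - target))
    # Shift so that the scaled domain mean equals the range mean.
    shift = float(range_block.mean()) - alpha * float(domain_small.mean())
    shift = max(-128.0, min(128.0, shift))  # keep within [-128, 128]
    return alpha, shift
```

For a range block that is exactly half the contrast of the domain block plus a constant offset of 10, this returns α = 0.5 and a shift of 10.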
4. TRANSMISSION
This section describes the data actually transmitted after the transformation and encoding process. For each shade block encoded, only the absorption gray level β is sent. For midrange and edge blocks, the algorithm sends the position of the referenced domain block, the DC shift level ∆c, the scaling factor α, and the index of the transform used. We can differentially encode the location of the domain block since we search through the image sequentially. The number of bits used to encode each of these parameters is indicated in Fig. 3.
Fig. 3. Information of each parameter in bits, where N_cd = number of child domain blocks and N_pd = number of parent domain blocks.
This tree indicates the algorithmic decisions made in encoding each image block. One bit is used to indicate the decision to split into child blocks. Another bit contains the classification between shade and midrange or edge. In our final encoder, absorption level β and DC shift level ∆c were quantized to 6 bits each. The 8 transformations can be encoded into 3 bits, and the domain block position is differentially encoded based on how many domain blocks were available to search from. To calculate our bit rates, we assumed ideal entropy coding of all the information in each of these parameters and divided by the number of pixels in the encoded image.
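For orientation, a fixed-length upper bound on the per-block cost can be tallied from the bit widths quoted in the text. This sketch does not reproduce Fig. 3 or the ideal entropy coding used for the reported rates: the position cost of ceil(log2 N) bits is an assumption standing in for the differential coding, and the split flag is left out because it is counted once per parent block.

```python
import math


def block_bits(is_shade: bool, n_domain_blocks: int) -> int:
    """Fixed-length bit count for one block, excluding the parent split flag."""
    class_bit = 1  # shade vs. midrange/edge
    if is_shade:
        return class_bit + 6  # 6-bit absorption level beta
    # Assumption: fixed-length position index instead of differential coding.
    position_bits = math.ceil(math.log2(n_domain_blocks)) if n_domain_blocks > 1 else 0
    # 6-bit DC shift + 2-bit alpha + 3-bit transform index + position.
    return class_bit + 6 + 2 + 3 + position_bits


def bits_per_pixel(total_bits: int, M: int) -> float:
    """Rate in bits per pixel for an M x M image."""
    return total_bits / (M * M)
```

Because entropy coding can only lower the count, figures from this sketch should be read as loose upper bounds rather than as the rates in section 5.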
5. RESULTS
We encoded several test images under the following specifications:
Block size B x B = 4 x 4
Image size M x M = 128 x 128
RMSE threshold = 1
Search distance = 16
Original image = 8 bpp
We chose a block size B = 4 because we found it capable of producing image reconstructions with high PSNR (larger block sizes may increase edge artifacts). Image size was chosen to be 128 x 128 primarily for encoding time reasons. The RMSE threshold for splitting parent blocks into child blocks was chosen to be 1, to ensure high PSNR in the resulting image. We limited our search distance to 16 pixels, also to reduce encoding time. We will point out in a later figure that search distance has a slight effect on PSNR but a considerable impact on encoding time. The images below were decoded by iterating from a uniformly gray initial image.
The first 4 iterations of "Lena" encoded with the above parameters are shown in Fig. 4.
Fig. 4. "Lena" decoded after 4 iterations, with PSNR = 33.583 dB, rate = 5.353 bpp, encoding time 3248 s.
We see that within the first iteration, the reconstruction has captured the essence of the original, with a few inaccurate pixel blocks scattered through the image. Visually, we see very little difference between the second and last iteration, although root mean square error decreases with each iteration. Our algorithm has the benefit of decoding very quickly and converging in very few iterations.
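The iterative decoding described above can be illustrated with a stripped-down decoder. This is a sketch under several assumptions: the dictionary layout of each block code is hypothetical, pixel-location transformations are omitted (identity only), and parameters are used unquantized.

```python
import numpy as np


def shrink(block: np.ndarray) -> np.ndarray:
    """Average each 2 x 2 group of a 2B x 2B block to obtain a B x B block."""
    h, w = block.shape
    return block.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def decode(codes, M: int, iterations: int = 4, start_gray: float = 128.0) -> np.ndarray:
    """Iterate the block transformations, starting from a uniform gray image."""
    image = np.full((M, M), start_gray, dtype=np.float64)
    for _ in range(iterations):
        nxt = image.copy()  # blocks without a code keep their previous value
        for c in codes:
            t, l, b = c["top"], c["left"], c["size"]
            if c["kind"] == "shade":
                nxt[t:t + b, l:l + b] = c["beta"]
            else:
                dt, dl = c["domain_top"], c["domain_left"]
                # Always read from the previous iterate, never from nxt.
                domain = image[dt:dt + 2 * b, dl:dl + 2 * b]
                nxt[t:t + b, l:l + b] = c["alpha"] * shrink(domain) + c["shift"]
        image = np.clip(nxt, 0.0, 255.0)
    return image
```

In a toy case where every range block maps from the whole image with α = 0.5 and a shift of 50, the fixed point is a uniform image of value 100, and the distance to it halves on every pass, which mirrors the fast convergence reported above.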
We wanted to try the coder on a fractal image as well. The first 4 iterations of "Flower" encoded with the above parameters are shown in Fig. 5. We found that the fractal encoder did not perform better on the fractal image, contrary to what one might expect. Several possible reasons exist for this finding. The encoder must still search through the possible domain blocks to find a match even though the image contains a lot of redundancy. This fractal also has a great level of detail: it contains few shade blocks, which would normally reduce encoding time.
Fig. 5. "Flower" decoded after 4 iterations, with PSNR = 31.536 dB, rate = 5.576 bpp, encoding time 4374 s.
To demonstrate the ease with which the fractal coder works with shade blocks, we encoded a generally uniform computer-generated image, "1up." For this encoding, we decreased our search distance to 0, increased our error threshold to 10, increased block size to 8 x 8, and still managed to obtain a reconstruction with PSNR = 32.53 dB and a bit rate of 0.032 bpp. The reconstruction converged in 1 iteration, and is shown compared with the original in Fig. 6.
Fig. 6. "1up" reconstruction (right) compared with original (left), with PSNR = 32.53 dB, rate = 0.032 bpp, encoding time 0.99 s.
The "1up" image was encoded in only 0.99 s. This very fast encoding is due to the decreased search distance, increased error threshold, and increased block size. Decreasing the search distance limits the number of domain blocks searched, and the increased error threshold lets the encoder accept a match early without searching for a lower distortion. Increased block size also reduces coding time because it reduces the number of range blocks. In general, making these parameter tweaks will lower PSNR in the final image while speeding up the encoding, but in the case of "1up," the fact that most blocks are shade blocks ensures almost perfect visual reconstruction.
We also measured the rate-distortion curve for our fractal encoder, specifically for "Lena." To obtain data points on the curve, we varied both the error threshold and the quantization of coding parameters such as β, ∆c, and α. Fig. 7 and Fig. 8 compare our rate-distortion results for "Lena" 64 x 64 and 128 x 128 to JPEG and JPEG-2000.
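Since the encoder works with RMSE while the results are reported as PSNR, the usual conversion for 8-bit images may be a helpful reference. This sketch assumes the standard definition with a peak value of 255; the paper does not state the exact formula it used.

```python
import math


def psnr_from_rmse(rmse: float, peak: float = 255.0) -> float:
    """PSNR in dB for a given RMSE, assuming the standard 8-bit peak of 255."""
    if rmse <= 0.0:
        return math.inf  # identical images
    return 20.0 * math.log10(peak / rmse)


if __name__ == "__main__":
    for rmse in (1.0, 5.0, 10.0):
        print(rmse, round(psnr_from_rmse(rmse), 2))
```

Under this definition, an RMSE of 1 corresponds to about 48 dB.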
Fig. 7. Rate-distortion curve for "Lena" 64 x 64.
Fig. 8. Rate-distortion curve for "Lena" 128 x 128.
We generated a curve for "Lena" 64 x 64 largely to save coding time. For this curve, we kept the search distance at 64 so as to search the entire image for a matching domain block (thereby increasing the final PSNR). In this case, we see that the fractal encoder approaches JPEG and JPEG-2000 in both PSNR and rate within the relevant range of PSNR over 28 dB. Especially for lower PSNR, we see the fractal coder performs within 1 bpp of JPEG.
In the curve for "Lena" 128 x 128, we graph the rate-distortion curves for our fractal coder using different search distances to demonstrate the effect search distance has on PSNR. We see that increasing search distance increases the PSNR obtained for a given rate. We used relatively small search distances due to the exponential dependence of time on search distance. For reference, a search distance of 16 took 1114 seconds to encode, as opposed to 540 s and 325 s for search distances of 10 and 6 respectively. Due to the prohibitively long encoding time, we did not produce a rate-distortion curve for 128 x 128 using maximum search distance, but the inclusion of the curve for 64 x 64 should be sufficient to indicate that as we increase search distance in the 128 x 128 image, PSNR will increase and approach JPEG results within a few dB as better domain-to-range block matches are found. The curve also indicates that the performance of our fractal encoder will surpass that of JPEG for low bit rates.
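As a rough check on the three timings quoted above, one can compute the growth rate of log encoding time per unit of search distance between consecutive measurements. This is only back-of-the-envelope arithmetic on the reported numbers, not part of the authors' method, and three data points cannot establish a trend.

```python
import math

# Search distance (pixels) -> encoding time (seconds), as quoted in the text.
TIMINGS = {6: 325.0, 10: 540.0, 16: 1114.0}


def log_growth_rates(timings):
    """Per-pixel growth rate of ln(time) between consecutive measurements."""
    points = sorted(timings.items())
    return [
        math.log(t1 / t0) / (d1 - d0)
        for (d0, t0), (d1, t1) in zip(points, points[1:])
    ]


if __name__ == "__main__":
    print(log_growth_rates(TIMINGS))
```

Both intervals give a rate of roughly 0.12 to 0.13 per pixel. A near-constant rate of this kind is what an exponential dependence would produce over this range.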
6. CONCLUSION
In this paper, we have demonstrated a successful implementation of a fractal coder. Fractal coders can perform very well in terms of bit rate and PSNR for images with large areas of uniform color, as demonstrated with the encoding and reconstruction of "1up." We did not find the coder to perform any better on fractal images of high complexity, due to the fact that the full search distance must still be traversed in the encoding process.
We have also illustrated the many trade-offs associated with fractal coding. The PSNR tops out at around 35 dB due to the lossy compression utilized by fractal coding. As we increase search distance, which increases the number of possible block matches and conceivably lowers the RMS error, encoding time increases exponentially. Other coding parameters, such as quantization levels and error threshold settings for parent-child block splitting, also influence the encoding time, bit rate, and PSNR. With fewer quantization levels, PSNR and bit rate generally both decrease, although not at the same rate, since the entropy of the transmitted parameters decreases. Changing the error threshold affects the percentage of parent and child blocks used, which factors into both encoding time and bit rate. We have also demonstrated PSNR comparable to JPEG at lower bit rates.
We conclude that although fractal encoding can produce reconstructed images with very good PSNR and decent bit rate, it is not attractive compared with other compression schemes like JPEG and JPEG-2000, which can achieve the same PSNR and rate in a fraction of the time. The fractal coder described in this paper already utilizes directed search to increase speed, but encoding time is still prohibitive, especially for common image sizes greater than 128 x 128. Unless further optimizations to the encoding process can be made, we recommend quicker compression algorithms over fractal encoding.
7. REFERENCES
[1] A. E. Jacquin, "A novel fractal block-coding technique for digital images," International Conference on Acoustics, Speech, and Signal Processing, 1990.
[2] A. E. Jacquin, "Fractal image coding: a review," Proceedings of the IEEE, vol. 81, no. 10, pp. 1451-1465, October 1993.
8. APPENDIX
Pokai worked on the base encoder and decoder, including the massic transformations and parent-child block splitting. Liz worked on coding the geometric transformations and domain block classifications. Both of us worked on fine-tuning the code, including adding various speed optimizations and tweaking search and coding parameters. Pokai worked on the transmission format of coded parameters and batch encoding. Both of us worked on generating animations and sample images, including collecting data for the rate-distortion curves. Liz worked on the presentation, and we shared work on the report.