ACCESS AND RETRIEVAL FROM IMAGE DATABASES USING ...

International Symposium on Signal Processing and its Applications, ISSPA, Gold Coast, Australia, 25-30 August 1996. Organised by the Signal Processing Research Centre, QUT, Brisbane, Australia

ACCESS AND RETRIEVAL FROM IMAGE DATABASES USING IMAGE THUMBNAILS Harvey A. Cohen Computer Science and Computer Engineering La Trobe University, Melbourne, Victoria Australia 3083 Email : [email protected]

ABSTRACT

The emerging role of thumbnail images in the selection of images for display from large image databases and via network access requires that the characteristics of thumbnails be seriously studied. We introduce a measure of effective compression Ceff(p) that is a function of the probability p that thumbnail access will be followed by the display of a full-scale image. For credible values of p, we deduce the need for the development of thumbnail-based image compression schemes, which efficiently and effectively use thumbnail data as part of the code for the full-scale image. We briefly discuss design features of thumbnail-based variants of the block-oriented coding schemes of vector quantization, fractal coding, and JPEG.

1. THUMBNAIL IMAGES

Thumbnail images are now commonly used as WYSIWYG directory icons for the selection of images from directories and local databases. Notable systems for managing queries of image databases, the QBIC system developed by Niblack and coworkers at IBM San Jose [2] and the PhotoBook system developed at the MIT Media Lab by Pentland et al [3], feature arrays of small images for displaying the images most similar in content to the user's query. A number of researchers, notably Picard [3], have stressed the important emerging role of very large image databases, as in collections of compressed images stored on a CD ROM, or the far larger compilations available via the Internet. The need for a user-friendly interface to these very large image albums and libraries requires the development of image compression schemes suitable and efficient for applications where the image thumbnails are more often required than full images, and full image display is invariably preceded by a data fetch of the thumbnail data.
This entails that image compression schemes suitable for very large database applications should be so designed that thumbnail data is separately accessible, and that the information delivered in the thumbnail should be used with other image code for economical image synthesis or reconstruction. In this paper these considerations lead us to introduce a measure of image compression that gives an effective compression for image data accessed via thumbnails. We examine how traditional block-oriented vector quantization (VQ) and fractal coding based on Jacquin's scheme should be modified. Finally we discuss JPEG coding in this regard.

1.1 Gray-scale Thumbnails

Although image thumbnails are now commonplace, a literature search failed to locate any previous formal description of thumbnails. The following general definition of a thumbnail image is proposed: An image thumbnail is produced by partitioning an image into rectangular (usually square) blocks of pixels, and constructing a thumbnail image comprising blocks, of single-pixel size or larger, of uniform pixel value. The blocks in the thumbnail are uniformly scaled with respect to those of the image.

In the simplest case of the uniform partitioning of an image into constant-sized blocks, each of R rows and C columns, the image block at (rR, cC) in the image may be denoted by its block row number r and block column number c as B[r][c]; the corresponding block in the thumbnail contains pixels of gray-scale or colour index b[r][c]. For gray-scale images, b[r][c] is usually chosen as the block mean

\[ b[r][c] = \frac{1}{RC} \sum_{i,j} B[r][c]_{ij} \]

The smallest thumbnail image, which contains just one pixel of gray-scale b[r][c] to represent the block B[r][c], is here simply called the image thumb. The thumbnail image with blocks the same size as the corresponding blocks in the original image is here called the zoomed thumbnail. Some display systems offer thumbnails based on sub-sampling the uncompressed image, e.g. using $b[r][c] = B[r][c]_{00}$. The MIT system PhotoBook [3] offers both block-mean and sub-sampled thumbnails. Where an image has been decomposed into regularly scaled blocks, as in quad-tree decomposition, the thumb is clearly the thumbnail scaled so that the smallest block is of single-pixel size.

1.2 Colour Thumbs

For true-vision images, with true-vision display, thumbnails could in principle be produced with each component of the thumb the average of the corresponding component (red, green, blue) of the pixels in each block. For blocked colour images, the mean RGB component in each block may likewise be computed from the actual red, green, and blue components of each pixel in an RxC pixel block B as per:

\[ m[r][c]_{colour} = \frac{1}{RC} \sum_{i,j} B[r][c]_{ij,\,colour} \]
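The thumb definitions of Section 1.1 can be illustrated with a minimal Python sketch. It assumes a gray-scale image held as a list of rows whose dimensions are exact multiples of R and C; the function names are illustrative and do not come from the paper. For colour images the same block mean would be applied to each of the red, green, and blue components separately.

```python
def block_mean_thumb(image, R, C):
    """Image thumb: one block-mean value per R x C block."""
    rows, cols = len(image), len(image[0])
    assert rows % R == 0 and cols % C == 0, "sketch assumes exact tiling"
    thumb = []
    for r in range(rows // R):
        thumb_row = []
        for c in range(cols // C):
            total = sum(image[r * R + i][c * C + j]
                        for i in range(R) for j in range(C))
            thumb_row.append(total / (R * C))
        thumb.append(thumb_row)
    return thumb


def subsampled_thumb(image, R, C):
    """Sub-sampled thumb: the top-left pixel B[r][c]_00 of every block."""
    return [row[::C] for row in image[::R]]


def zoomed_thumbnail(thumb, R, C):
    """Zoomed thumbnail: every thumb pixel repeated over an R x C block."""
    return [[thumb[i // R][j // C] for j in range(len(thumb[0]) * C)]
            for i in range(len(thumb) * R)]


image = [[0, 2, 10, 10],
         [4, 6, 10, 10],
         [1, 1, 8, 0],
         [1, 1, 0, 0]]
thumb = block_mean_thumb(image, 2, 2)    # 2 x 2 thumb of block means
sub = subsampled_thumb(image, 2, 2)      # 2 x 2 thumb of top-left pixels
zoomed = zoomed_thumbnail(thumb, 2, 2)   # back to 4 x 4, constant per block
```

The block-mean thumb costs one pass over every pixel, whereas the sub-sampled thumb touches only one pixel per block; the sketch makes no claim about which of the two any particular display system uses.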

For an indexed-colour environment, the most appropriate colour index to use in the thumb is the index p in the available palette that is closest to m:

\[ \| palette[p] - m[r][c] \| \le \| palette[q] - m[r][c] \| \quad \text{for any colour index } q. \]

There is no confusion in referring to the thumb computed using this approximate mean as the colour-mean thumb. See http:\[email protected]/~image/thumb.html

2. COMPRESSION EFFICIENCY WITH THUMBNAILS

The crudest measure of the efficiency of image compression is the compression ratio, let us call it C, of the entire image as it is transmitted. C measures the efficiency of retrieval of a whole image from a database. If the user selects such an image for full-scale display via a thumbnail-based interface, then the cost of access includes the transfer cost of all the thumbnails perused. We here define an effective compression as a function of the probability p that a whole-image selection will be made from a set of thumbnails. For simplicity suppose that one is dealing with a set of images all of the same size. Then, formally:

p = the probability that a viewed thumbnail image will be selected for full-image display
T = data to be transferred to generate the image thumbnail
F = data to be transferred for full-size image display
S = savings in using thumbnail data to generate the display

The data required at a workstation to display a particular image is just T + F, but from the system's perspective the average amount of image data to be delivered for the display of each full-sized image is

\[ \frac{1}{p}\,T + F - S, \qquad \text{where } 0 \le S \le T. \]

By definition of compression, the image data in the display is just C F. Hence the effective compression Ceff(p), in terms of bytes in the display versus transferred bytes, for probability p of full-image retrieval is

\[ C_{eff}(p) = C \, \frac{pF/T}{1 + pF/T - pS/T} \]

The following are indicative values:

C       60      60      45      60      45
p       0.1     0.1     0.1     0.1     0.1
F/T     3       3       4       1       5
S/T     0       1.0     1.0     1       1.0
Ceff    13.86   15.0    13.86   6       16.5

The table shows the dominant significance of the ratio F/T on the effective compression, and the significant impact of the savings S. One can conclude that it is most crucial to devise efficient thumbnail compression schemes, and that if the thumbnail information is utilised for the synthesis of the displayed image there will be significant advantages, which especially ought to be incorporated into network browsers.

3. THUMBNAIL-BASED IMAGE CODECS

In this section we show practical means of designing image codecs which are thumbnail based, making effective use of the data contained within an image thumbnail. Attention is confined to block-oriented codecs, where the image thumb provides a representative value for each block.

3.1 Vector quantization

Image coding via vector quantization requires the use of a code book whose contents are best called tiles, though often, with mathematical emphasis, called vectors. The coding algorithm involves the partition of an image into blocks, and the determination for each block in the image of the block-code, namely the number of the tile in the code-book that is closest, usually in the least squares sense. Classic references are [4][5]. What is introduced here is a thumb VQ code comprising four parts instead of the traditional three:

(a) Header information
(b) Code: the set of tile numbers for each block
(c) Thumb code: a table of mean gray-scale or colour-mean for each tile in the code book
(d) The code book: an album of tiles/vectors

To generate the image thumb requires the use of parts (a), (b) and (c); to generate the full-scale image requires in addition the codebook (d). Using the formula for effective compression above, one can readily estimate the considerable savings for typical situations. The thumb code (c) is relatively small, and adds little to overall coded image size. The extra storage costs for the thumb code can be made negligible, at computational cost: for gray-scale images, knowledge of the block mean enables the more significant bits of one pixel to be determined from the other pixel values in the block. In sum, for regular blocks, or regularly graded blocks as in quad-tree, VQ coding is readily augmented to facilitate thumb-based retrieval.

3.2 Thumbnail-based Fractal Coding

In block-oriented fractal coding, as introduced by Jacquin [6][7], the image is partitioned into non-overlapping blocks, called range blocks, precisely as in vector quantization. The fractal code for each image block is simply the location of a larger block within the image, termed a domain block, together with the parameters of the transformation by which the range block may be produced from the corresponding domain block.

The basic idea of thumbnail-based fractal coding is to use the zoomed-out thumbnail as a first approximation to the decoded image, and to iteratively improve this blocked image using (signed) tiles that have zero pixel sum. Because of this zero mean, the domain block code is smaller than in Jacquin-type coding, as the pixel gray-scale transformation is simply a scaling (homogeneous linear transform). Thus this variety of coding is truly thumbnail based, and also has the advantage of very rapid convergence [8].

3.3 JPEG Coding and Thumbs

In the hierarchical mode of JPEG [1], an image is stored at several increasing resolutions, e.g. an image at 16x16, then 128x128, then 1024x1024. This mode, more of a promise than a reality, is geared to the idea of progressive update of the image during display, and has intrinsic interest, as conceivably one of the smaller images could function as the thumb. There is, however, another feature of JPEG of interest.

In base-line JPEG [1] coding of gray-scale images, the mean of each 8x8 block, the DC component for the block, is stored separately, but is not readily accessible. It is clear that having this thumb material separately accessible would be advantageous for thumb-based retrieval. For colour images, the different scale for colour and chromaticity leads to some modest complications which will be discussed elsewhere. The basic point is that the path ahead is here. It is important to note that although there is provision for a thumb within the JFIF variant of JPEG [9], this refers to an RGB thumb, which is stored in addition to the usual JPEG data. The overall conclusion is that with some relatively modest alteration to the standard, thumbnail-based JPEG coding could be achieved, with the introduction of coders and decoders that would be backwards compatible.

4. CONCLUSIONS

Maximising the speed of image retrieval is essential to user satisfaction in network access, where image selection is (almost) invariably preceded by the viewing and selection of its thumbnail. The traditional measure of image data compression is not adequate to describe the efficiency of such thumbnail-based retrieval, so a new measure, Ceff, has been devised. Our analysis leads to proposals for a range of thumbnail-based codecs: vector quantization, fractal coding, and JPEG. Images illustrating these codecs and issues of thumb quality are on the WWW page: hppp://[email protected]/image/thumbs

References

[1] Gregory K. Wallace, The JPEG Still Picture Compression Standard, 1991, C.A.C.M., 34(4): 30-44.
[2] Wayne Niblack and Myron Flickner, Find Me the Pictures That Look Like This: IBM's Image Query Project, Advanced Imaging, April 1993, pp 32-35.
[3] A. Pentland, R.W. Picard and S. Sclaroff, Photobook: Tools for content-based manipulation of image databases, in Proc Society for Optical Engineers, SPIE Storage and Retrieval of Image and Video Databases II, San Jose, Calif, Feb 1993.
[4] S.P. Lloyd, Least squares quantization in PCM, Bell Lab Memo, July 1957; also in IEEE Trans Info Theory, IT-28, 129-137 (1982).
[5] J. Max, Quantizing for minimum distortion, IRE Trans Info Theory, IT-6(1), 7-12 (1960).
[6] A.E. Jacquin, A novel fractal block-coding technique for digital images, Proc Int'l Conf Acoustics, Speech and Signal Processing, ICASSP'90, pp 2225-2228.
[7] A.E. Jacquin, A novel fractal block-coding technique for digital images, IEEE Trans. Image Processing, 1992, Vol 1, No 1, pp 18-30.
[8] Harvey A. Cohen, Fractal Image Coding for Thumbnail Based Access, (in Press) Proceedings ISSPA 96.
[9] Eric Hamilton, JPEG File Interchange Format Version 1.01, Aug 20, 1991, C-Cube Microsystems, Inc, 399A West Trimble Road, San Jose, CA 95131.
[2] W. Niblack et al, The QBIC project: Querying images by content using color, texture, and shape, in Proc SPIE Storage and Retrieval of Image and Video Databases II, San Jose, California, Feb 1993.
[10] E. Shusterman and M. Feder, Image Compression Using Improved Quadtree Decomposition Algorithms, IEEE Trans Image Processing 3(2):207-215, March 1994.
[11] Pattern Recognition Letters, 1995
[13] Harvey A. Cohen, Thumbnail based fractal coding, Preprint. La Trobe University, Bundoora, 3083, Australia.


that in decoding, the starting point

The crux of the process is the production of the codebook. The global objective is to choose the vectors of the code book so as to minimise the total error

\[ \sum_{r,c} \| b[r][c] - B[r][c] \|^2 \]

The classical method to construct a VQ codebook for gray-scale images is the Linde-Buzo-Gray [7] algorithm, which is a natural generalization of the classic Lloyd-Max algorithm for scalar quantization [8][9]. However, other methods of determining codebooks have been developed, notably Fuzzy C-Means [10]. Much greater compression may be obtained in vector quantization by means of two-level codes, or quad-tree coding [11]. In block-oriented fractal coding, as introduced by Jacquin [12][13], the fractal code for each image block being coded is simply the location of a larger block within the image which, after rotations and/or reflections, is the closest (in the Euclidean sense). It is precisely such a transformed large block that is used in the VQ system described here.

2.2 Thumbnail augmented VQ

In this section a simple way of modifying standard VQ code is presented, which is termed augmented thumbnail. What is proposed is that the VQ code should consist of three parts instead of the traditional two:

Code thumb: a table of mean gray-scale for each coding tile
Code array: tile code for each block
Album: array of coding tiles

The first two components of the code are sufficient to produce the Thumb.

2.3 Thumbnail based VQ - Mathematical Analysis

The basic idea of thumbnail VQ is to use the zoomed-out thumbnail as a first approximation to the decoded image, and to use this image itself as a source of tiling blocks that are to supply corrections. The correction tiles can be determined either by sub-sampling or by averaging within the zoomed-out thumbnail, and subtracting from each pixel the block mean to yield a zero-sum tile, which may be rotated/reflected and/or contrast scaled. Note that this specification is compatible with multi-level partitioning schemes, such as the familiar quadtree, or the more efficient BSP. In this presentation, for clarity, we shall only detail thumbnail VQ where the image is partitioned only into blocks of the one size, 4x4.
For gray-scale images, the thumbnail used is based on the mean value of pixels within each such 4x4 block of the original image. The codebook is precisely the zoomed-out thumbnail image, for which the 'natural' thumbnail has been expanded to be the same size as the original image. That is, the codebook comprises 4x4 blocks, each block of the same location and size as in the partitioned original image, with a gray-scale equal to the mean gray-scale within the corresponding block. Using the notation above, where the original image block is denoted by B[r][c], the first approximation to each 4x4 image block is the zoomed-out thumbnail,

\[ b[r][c] \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{pmatrix}, \qquad \text{where} \quad b[r][c] = \frac{1}{16} \sum_{i,j} B[r][c]_{i,j} \]

The second approximation aims to determine the 4x4 correction

\[ C[r][c] = B[r][c] - b[r][c] \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{pmatrix} \]

which is a block with zero sum:

\[ \sum_{i,j} C_{ij} = 0 \]
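As a small numerical check of the two approximations above, the following Python sketch splits one 4x4 block into its block mean and its zero-sum correction. The helper names are illustrative only, and the sample block is made up for the example.

```python
def block_mean(block):
    """Mean gray-scale b of a rectangular block given as a list of rows."""
    return sum(sum(row) for row in block) / (len(block) * len(block[0]))


def correction_block(block):
    """Return (b, C) with C = B - b * ones, so that C sums to zero."""
    b = block_mean(block)
    return b, [[pixel - b for pixel in row] for row in block]


B = [[10, 12, 14, 16],
     [10, 12, 14, 16],
     [20, 22, 24, 26],
     [20, 22, 24, 26]]
b, C = correction_block(B)

# The thumb keeps only b; adding the correction back recovers the block.
restored = [[b + c for c in row] for row in C]
```

Only b needs to travel with the thumbnail; the correction C is what the remainder of the code has to approximate.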

This correction block is coded by searching through 16x16 blocks within the code-book, and compressing each to a 4x4 size, which is called Z. If the mean pixel within Z is called z, then the 4x4 block

\[ Y(0) = Z - z \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{pmatrix} \]

has zero mean and is a suitable tile, as are all tiles derived from Y(0) by any of the eight transformations T(k) which are the product of a rotation by a multiple of 90 degrees and a reflection about a diagonal:

\[ Y(k) = T(k)\, Y(0)\, T^{-1}(k) \qquad (k = 0, \ldots, 7) \]

Any linear multiple of the eight Y(k) is a potential tile. Hence the coding problem becomes the determination of the Y(k) and the multiplier a so as to minimise

\[ \| C[r][c] - a\,Y(k) \|^2 \]

For a fixed C[r][c] and Y(k), the minimisation of this error is precisely the determination of the slope of the line of best fit (least squares) of elementary regression theory, so that

\[ a = \frac{\sum_{ij} C_{ij}\, Y(k)_{ij}}{\sum_{ij} Y_{ij}^2} \]

where the summations are over the 16 pixels in the 4x4 blocks. (The index k is superfluous in the denominator.) Hence, choosing a in accord with this formula, the block error for Y(k) is, after elementary algebra, reduced to

\[ \sum_{ij} C_{ij}^2 - \frac{\left( \sum_{ij} C_{ij}\, Y(k)_{ij} \right)^2}{\sum_{ij} Y_{ij}^2} \]
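The least-squares step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's implementation: the eight transformations are generated as four rotations, each with and without a reflection about the main diagonal, and the numbering of the index k is a convention of this sketch only. The contrast is taken as the ordinary least-squares slope for zero-mean data, a = ΣCY / ΣY².

```python
def rotate90(t):
    """Rotate a square tile by 90 degrees clockwise."""
    return [list(row) for row in zip(*t[::-1])]


def transpose(t):
    """Reflect a square tile about its main diagonal."""
    return [list(row) for row in zip(*t)]


def isometries(tile):
    """The 8 tiles Y(k): four rotations, each plain and diagonally reflected."""
    out = []
    t = [list(row) for row in tile]
    for _ in range(4):
        out.append(t)
        out.append(transpose(t))
        t = rotate90(t)
    return out


def dot(a, b):
    """Sum of products of corresponding pixels of two equal-sized tiles."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_fit(C, Y0):
    """Return (error, k, a) minimising ||C - a * Y(k)||^2 over k and a."""
    cc, yy = dot(C, C), dot(Y0, Y0)
    if yy == 0:                      # flat tile: no correction possible
        return cc, 0, 0.0
    best = None
    for k, Y in enumerate(isometries(Y0)):
        cy = dot(C, Y)
        err = cc - cy * cy / yy      # residual after the optimal contrast
        if best is None or err < best[0]:
            best = (err, k, cy / yy)
    return best


# A zero-sum tile, and a correction block that is 2.5 times its rotation.
Y0 = [[3, -1, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, -2]]
C = [[2.5 * v for v in row] for row in rotate90(Y0)]
err, k, a = best_fit(C, Y0)
```

In this constructed case the search recovers the rotation with zero residual and contrast 2.5, which is the behaviour the error formula predicts when C is an exact multiple of one of the Y(k).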

Encoding of the block B[r][c] is thus a two-stage process:

(a) Computing the block mean b[r][c] to be used in the thumbnail.
(b) Determining the position (I,J) within the exploded thumbnail, and the transformation k, that give the minimum of

\[ \sum_{ij} C_{ij}^2 - \frac{\left( \sum_{ij} C_{ij}\, Y(k)_{ij} \right)^2}{\sum_{ij} Y_{ij}^2} \]

(the index k is deliberately omitted from the denominator). I, J together with k and the contrast a given by

\[ a = \frac{\sum_{ij} C_{ij}\, Y(k)_{ij}}{\sum_{ij} Y_{ij}^2} \]

give the VQ code for the image block rc.

References

[1] ISO/IEC CD 10918-1 Digital Compression and Coding of Continuous-tone Still Images, Part 1: Requirements and guidelines, New York, ANSI [JPEG Standard].
[2] W. Niblack et al, The QBIC project: Querying images by content using color, texture, and shape, in Proc SPIE Storage and Retrieval of Image and Video Databases II, San Jose, California, Feb 1993.
[3] Wayne Niblack and Myron Flickner, Find Me the Pictures That Look Like This: IBM's Image Query Project, Advanced Imaging, April 1993, pp 32-35.
[4] A. Pentland, R.W. Picard and S. Sclaroff, Photobook: Tools for content-based manipulation of image databases, in Proc Society for Optical Engineers, SPIE Storage and Retrieval of Image and Video Databases II, San Jose, California, Feb 1993.
[5] Rosalind W. Picard and T. Kabir, Finding Similar Patterns in Large Image Databases, IEEE ICASSP 93, Minneapolis, MN, Vol V, pp V-161-164.
[6] Rosalind W. Picard and Fang Liu, A New Wold Ordering for Image Similarity, IEEE ICASSP 94, Adelaide, SA, Vol V, pp V-129-132.
[7] Y. Linde, A. Buzo and R.M. Gray, An algorithm for vector quantizer design, IEEE Trans. Commun., COM-28(1), 84-95 (1980).
[8] S.P. Lloyd, Least squares quantization in PCM, Bell Lab Memo, July 1957; also in IEEE Trans Info Theory, IT-28, 129-137 (1982).
[9] J. Max, Quantizing for minimum distortion, IRE Trans Info Theory, IT-6(1), 7-12 (1960).
[10] E. Shusterman and M. Feder, Image Compression Using Improved Quadtree Decomposition Algorithms, IEEE Trans Image Processing 3(2):207-215, March 1994.
[11] Pattern Recognition Letters, 1995
[12] A.E. Jacquin, A novel fractal block-coding technique for digital images, Proc Int'l Conf Acoustics, Speech and Signal Processing, ICASSP'90, pp 2225-2228.
[13] A.E. Jacquin, A novel fractal block-coding technique for digital images, IEEE Trans. Image Processing, 1992, Vol 1, No 1, pp 18-30.
[14] Y. Fisher, E.W. Jacobs and R.D. Boss, Iterated Transform Image Compression, Technical Report, 1408 /full details later/


Conclusion

In the emerging environment where the display of a particular image at a workstation is invariably preceded by its selection from a set of thumbnail images, a measure of image compression is proposed to take into account the costs of multiple thumbs for each image accessed fully. Examples given indicate how to maximise the effective compression, using small thumbs that contain data required for whole image generation. For vector quantization, such a thumbnail-oriented approach leads to the use of the exploded thumbnail itself as a source of correction tiles for higher resolution, with VQ data actually in the same form as for fractal coding as per Jacquin.

C[r][c] is specified as

\[ \sum_{ij} C_{ij}^2 - \frac{\left( \sum_{ij} C_{ij}\, Y(k)_{ij} \right)^2}{\sum_{ij} Y_{ij}^2} \]

Computing (a) Subtracting the (a) Locating the indicated 16x16 block within the code-book image. (b) The set of block transforms is precisely those introduced by Fisher et al [14].

Colour thumbnails as for images using indexed RGB colour, can be handled by the same approach.

Thumbnail based VQ - Encoding Algorithm // Written up as an algorithm

Using the notation above, where the original image block is denoted by B[r][c], the first approximation to each 4x4 image block is the zoomed-out thumbnail,

\[ b[r][c] \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{pmatrix}, \qquad \text{where} \quad b[r][c] = \frac{1}{16} \sum_{i,j} B[r][c]_{i,j} \]

The second approximation determines the 4x4 correction C[r][c] so that the decoded block is just

\[ C[r][c] + b[r][c] \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{pmatrix} \]


me systems offering thumbnails simply use as for more economical will be more effectively

A major new criteria for the design and evaluation of any image compression scheme intended to be applied to images in a large image album, as on a CD ROM, or a larger image library has emerged: the

As textural description is such a poor index for image selection, small images termed thumbnails have become the primary means for image selection from image compilations and libraries. Such thumbnails, typically 32x32, 64x64, or 64x48 pixels in size, need to be usefully indicative to the human viewer as to the contents of the larger images they denote, typically 256x256, 512x512, or 640x480. The important role of thumbnails for image selection from image libraries implies that the practical merit of any image compression scheme depends on the easy availability of thumbnails, in addition to the traditional criteria of fast decompression and a high degree of data compression.

The major criteria for evaluation of any image compression scheme intended to be applied to images in a large image album, as on a CD ROM, or a larger image library.


Fig 1. Showing how the image thumbnail is used as a code-book. Using the thumbnail data, a full-sized image is produced, with thumbnail gray-scale occupying multi-pixel blocks. The code for a particular block consists of the address of a size 4x block, which may not necessarily lie along block boundaries, together with a block rotation and/or reflection. Within the larger block, the difference of each pixel from the block mean is a---

Discussion

The idea of permitting rapid production of the thumbnail has been considered with regard to the specification of the JPEG standard [1]. In hierarchical mode, an image is stored at several increasing resolutions, e.g. an image at 16x16, then 128x128, then 1024x1024. Preliminary analysis indicates very significant losses in the overall compression for hierarchical mode JPEG, coupled with an increased computational complexity. Hence this scheme is more of a promise than a reality, and is not available in commercial WWW browsers.

For the demonstration images where 4x4 blocks have been used, the thumbnail, composed of 4x4 blocks compressed into a single pixel, requires 0.5 bits per pixel of the original image, and the code described here requires 1.5 bits per pixel. The overall compression of just 25% is of course quite modest, but none of the artifices of arithmetic or Huffman coding has been used. These compression figures should not be directly compared with JPEG, for instance. Rather, the results should be compared to a similar simple block-oriented fractal code [2], based on the Jacquin [3] and Fisher [4] scheme, where about the same compression is achieved. Experience of block fractal coding makes one confident that much higher compressions can be achieved. And because of the clear separation of code-book from code, more elaborate versions of thumbnail VQ shall still offer ready and super-efficient thumbnails.


JPEG encoding, where the mean of each image block, termed the DC component, is stored separately from the AC components

Fractal Coding

In block-oriented fractal coding, as introduced by Jacquin [ ??], an image is partitioned into non-overlapping rectangular range blocks. Each block is related by a symmetry transform to a larger domain block within the image. In order to conveniently present this work, we require a compact notation to denote compression of a domain block: if A is a domain block within an image I, we denote by $\langle A \rangle$ a compressed version of the domain block, with columns compressed by factor α, and rows by factor β:

\[ \langle A \rangle_{i,j} = \frac{1}{\alpha\beta} \sum_{r=0}^{\alpha-1} \sum_{c=0}^{\beta-1} A_{\alpha i + r,\; \beta j + c} \]
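The compression of a domain block by averaging can be written directly from the double sum above. The Python sketch below follows the index convention of that sum, with the first index of A scaled by α and the second by β; the function name `compress` is illustrative, and the block dimensions are assumed to be exact multiples of the two factors.

```python
def compress(A, alpha, beta):
    """Average alpha x beta cells of A: out[i][j] = mean of A[alpha*i+r][beta*j+c]."""
    rows, cols = len(A) // alpha, len(A[0]) // beta
    return [[sum(A[alpha * i + r][beta * j + c]
                 for r in range(alpha) for c in range(beta)) / (alpha * beta)
             for j in range(cols)]
            for i in range(rows)]


A = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11],
     [12, 13, 14, 15]]
small = compress(A, 2, 2)        # 4x4 block reduced to 2x2

# The case used in the text: a 16x16 domain block reduced to 4x4.
domain = [[i + j for j in range(16)] for i in range(16)]
reduced = compress(domain, 4, 4)
```

Averaging, rather than sub-sampling, preserves the mean of the domain block, which matters once block means are carried separately as thumbnail data.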


In particular, the i,j th element of the 4x4 compressed block, relative to the block, is

\[ \langle A \rangle_{i,j} = \frac{1}{16} \sum_{r=0}^{3} \sum_{c=0}^{3} A_{4i+r,\; 4j+c} \]

Note that the row and column coordinates of the elements in this summation are relative to the uncompressed block.

The work presented here is based on the variant of Fisher et al [??], which we formulate as

\[ b_{i,j} = a[n] + b[n]\; S \circ B \circ \langle b \rangle_{i,j} \]

where Fisher terms a[n] the luminance of block number n, and b[n] is termed the contrast. Fisher uses this formula as an iterative algorithm:

\[ b_{i,j}(t+1) = a[n] + b[n]\; S \circ B \circ \langle b(t) \rangle_{i,j} \]

where the initial image is arbitrary.

The new algorithm is derived by starting from the same fixed point relation:

\[ b_{i,j} = a[n] + b[n]\; S \circ B \circ \langle b \rangle_{i,j} \]

and performing an average within the range block, and the parallel average within the corresponding domain block:

\[ \bar{b} = a[n] + b[n]\; S \circ B \circ \overline{\langle b \rangle} \]

Hence the departure of the element of a range block from the mean is given by the fixed point relation

\[ b_{i,j} - \bar{b} = b[n]\; S \circ B \circ \{ \langle b \rangle_{i,j} - \overline{\langle b \rangle} \} \]

The thumbnail algorithm here introduced is based on this formula. For the thumbnail algorithm, the fractal code consists of
a) the mean pixel value in each range block, i.e. precisely a set of thumbnail data
b) the value of the contrast b[n] for each range block n
For 2-level and quad-tree coding, the corresponding decomposition information is also required.

(a) Step 0: Initialise each range cell with the required mean value. This results in a uniformly magnified thumbnail image.
(b) Iterate according to the formula:

\[ b_{i,j}(t+1) - \bar{b} = b[n]\; S \circ B \circ \{ \langle b(t) \rangle_{i,j} - \overline{\langle b(t) \rangle} \} \]

Note that at the first iterate, each pixel in the (uncompressed) domain block has the unique value given by the mean value of the unique range block in which the pixel lies.

For sufficiently large
The use of what
The application of the thum
The speed-up advantage of using higher
Basic data


Here the Fisher alg applied from distinct starting points: Thumb LN0701.rms Mandril

Initial        Thumb          Thumb               Mandril
Decode style   2-generation   continuous update   2-generation
Iteration
Start          23.7975        23.7975             11.3867
1              27.9860        29.7351             15.0775
2              32.3008        33.6594             20.1028
3              33.8549        34.1468             26.2309
4              34.1305        34.1766             31.7188
5              34.1698        34.1772             33.7863
6              34.1772        34.1767             34.1196
7              34.1775        34.1772             34.1695
8              34.1763        34.1779             34.1765
9              34.1760        34.1780             34.1768
10             34.1779        34.1780             34.1762
11             34.1778        34.1780             34.1766
12             34.1782        34.1779             34.1768
13             34.1782        34.1779             34.1775
14             34.1783        34.1779             34.1781
15             34.1782        34.1779             34.1779
16             34.1778        34.1778             34.1775
17             34.1783        34.1779             34.1780
18             34.1779        34.1779             34.1779
19             34.1778        34.1779             34.1781

Using MAND0604.rms (?? interblock = 4 - what's 6?) June/July 1994

Iteration      THUMBSTART 2-gen   THUMBSTART Seidl   Lena start 2-gen
0              20.4084            20.4084            11.3867
1              21.8555            23.0857            14.3963
2              24.7054            24.7070            16.6894
3              24.7059            24.7058            19.0795
4              24.7071            24.7070            22.8817
5              24.7074            24.7067            24.3514
6              24.7071            24.7065            24.6477
7              24.7074            24.7067            24.6994
8              24.7070            24.7068            24.7050
9              24.7066            24.7069            24.7073
10             24.7066            24.7070            24.7076
11             24.7065            24.7070            24.7073
12             24.7065            24.7070            24.7074
13             24.7069            24.7069            24.7069
14             24.7070            24.7069            24.7075
15             24.7071            24.7069            24.7066
16             24.7070            24.7069            24.7065
17             24.7069            24.7069            24.7065
18             24.7068            24.7069            24.7067
19             24.7069            24.7069            24.7067

LN40401.HD0: Quad compression, 4x4 range blocks (16x16 domain), 1 interblock

Initial        Mandril             Thumb TRUE     Thumb TRUE
Decode style   continuous update   2-generation   continuous update
               11.3867             23.7975        23.7975
               16.8348             28.1536        29.9331
               25.1562             32.5913        33.8869
               32.3777             34.1069        34.3202
               34.0604             34.3433        34.3218
               34.3395             34.3376        34.1720
               34.1769             34.3362        34.3365
               34.1761             34.3350        34.3355
               34.1774             34.3384        34.3353
               34.1779             34.3385        34.3358
               34.1780             34.3367        34.3359
               34.1781             34.3356        34.3353
               34.1781             34.3355        34.3365
               34.1782             34.3344        34.3353
               34.1781             34.3362        34.3370
               34.1782             34.3363        34.3353

| contrast | 0) is

F pC (1 + T ) To recap: the formula just derived gives the effective compression for display in a system where there is probability p that a thumb, requiring T bytes of data, will be followed by the additional F bytes of data needed for consruction/decoding of the full displayed image. T is the data needed to be transferred for producing a thumb-print, whileThis last formula highlights the crucial importance of In the case of fractal coding, it is assumed that this compression refers to a particular scale used in decoding.

Consider the simple case of a 256x256 gray-scale image partitioned into 4096 4x4 blocks, and coded using 256 tiles. Data sizes are as follows (in bytes):

Whole image   2^16
Code Thumb    2^8
Code Array    2^12
Tile Album    2^12

• For standard VQ coding, where the code consists of just the Code Array and Tile Album, the compression is C = 8. The entire image code has to be transferred to generate the thumbnail, so that
C = 8
T = 2^13
F = 2^13
S = 2^13
For p = .1, the effective "compression" Ceff = 0.8 is in fact less than one.

• For Thumb-augmented VQ code, the code stored is of the same size + 2^8, with a minor reduction in the compression, but now
T = 2^8 + 2^12
F = 2^13 + 2^8
S = T
For p = 1, Ceff = 1.14
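The byte counts in this example follow directly from the image geometry and can be checked with a short calculation. The sketch below is illustrative only: the function name `vq_sizes` and its dictionary keys are not from the paper, and it assumes one byte per tile index, one byte per tile pixel, and a Code Thumb of one byte per tile (2^8 bytes for 256 tiles).

```python
def vq_sizes(image_side=256, block_side=4, n_tiles=256):
    """Byte counts for the VQ example (assumed layout: 1 byte per index and per pixel)."""
    n_blocks = (image_side // block_side) ** 2      # 4096 blocks for 256x256, 4x4
    whole_image = image_side * image_side           # 2^16 bytes
    code_array = n_blocks                           # one tile number per block
    tile_album = n_tiles * block_side * block_side  # 16 bytes per 4x4 tile
    code_thumb = n_tiles                            # assumption: one byte per tile
    standard = code_array + tile_album              # standard VQ code size
    return {
        "whole_image": whole_image,
        "code_array": code_array,
        "tile_album": tile_album,
        "code_thumb": code_thumb,
        "C_standard": whole_image / standard,
        "T_standard": standard,                     # whole code is needed for a thumb
        "T_augmented": code_thumb + code_array,
        "F_augmented": standard + code_thumb,
    }

sizes = vq_sizes()
print(sizes["C_standard"], sizes["T_augmented"], sizes["F_augmented"])
```

Calling `vq_sizes(n_tiles=128)` gives the corresponding counts for a smaller tile album under the same assumptions.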


fies the tile to be laid in each image block. In such a scheme, all image data is required before a thumbnail version of the image can be synthesised. A novel variant of vector quantization called Thumbnail based VQ is described in which the image thumbnail serves as a codebook, of overlapping tiling blocks for corrections to the zoomed-out thumbnail. The new scheme offers dataflow savings, especially in network environments. In the basic system described here, using 4x4 blocks, the codebook is the thumbnail, magnified by duplication to full image size, and tiling blocks are 16x16 blocks located anywhere within the codebook, compressed to 4x4 before tiling. Using only a single level of coding, using 4x4 blocks, gray-scale coding at good PSNR is achieved for a thumbnail of 0.5 bits per pixel, code 1.5 bits per pixel. Much greater compression is possible using more elaborate versions of the basic scheme.
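The two block operations named here, magnifying the thumbnail by pixel duplication and compressing a 16x16 block to 4x4, can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, images are lists of rows of gray values, and the 16x16 to 4x4 compression is taken to be averaging over 4x4 cells, which the text does not specify.

```python
def magnify_by_duplication(thumb, factor=4):
    """Expand each thumbnail pixel into a factor x factor square of equal values."""
    out = []
    for row in thumb:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def shrink_block(image, top, left, size=16, factor=4):
    """Cut a size x size block at (top, left) and average it down by `factor`.

    Averaging is an assumption; the source only says the block is compressed.
    """
    small = size // factor
    block = []
    for by in range(small):
        row = []
        for bx in range(small):
            cell = [image[top + by * factor + y][left + bx * factor + x]
                    for y in range(factor) for x in range(factor)]
            row.append(sum(cell) // len(cell))
        block.append(row)
    return block

thumb = [[(x + y) % 256 for x in range(8)] for y in range(8)]  # toy 8x8 thumbnail
codebook = magnify_by_duplication(thumb)                        # 32x32 codebook
tile = shrink_block(codebook, top=0, left=0)                    # 4x4 tiling block
```

A block taken at a position aligned with the duplication grid simply reproduces the thumbnail pixels; blocks at other offsets mix neighbouring thumbnail pixels, which is what makes the overlapping positions useful as distinct tiles.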

It is noted that a minor saving of just 2^8 bytes in code may be made by omitting just one byte in each tile in the album, as the missing value may be found from the other pixels using the tile mean.

We now repeat the calculation, but with quantisation using 128 tiles. In this case, for standard VQ,
C = 32/3
T = F
F = 2^12 + 2^11 = 3 x 2^11
S = F
For p = .1, the effective "compression" Ceff = 0.8 is in fact less than one.

For Thumb-augmented VQ code, with 128 tiles, the code stored is of the same size + 2^7, with a minor reduction in the compression, but now
C = 32/3
T = 2^7 + 2^12
F = 2^12 + 2^11 + 2^7 =
S = T
For p = 1, Ceff = ?????
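The saving from dropping one byte per tile and recovering it from the tile mean can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names are hypothetical, and it assumes the mean is stored as a rounded byte, in which case the recovered pixel of a 16-pixel tile may differ from the original by up to 8 gray levels.

```python
def rounded_mean(tile):
    """Mean gray value of a flattened tile, rounded to the nearest integer."""
    n = len(tile)
    return (sum(tile) + n // 2) // n

def drop_last_pixel(tile):
    """Return (stored_pixels, mean): the tile without its last pixel, plus its mean."""
    return tile[:-1], rounded_mean(tile)

def recover_tile(stored_pixels, mean):
    """Rebuild the tile; the missing pixel is estimated from the mean and the rest."""
    n = len(stored_pixels) + 1
    missing = n * mean - sum(stored_pixels)
    missing = max(0, min(255, missing))    # keep within the 8-bit gray range
    return stored_pixels + [missing]

tile = [10, 12, 14, 16] * 4                # flattened 4x4 tile, sum 208 = 16 * 13
stored, mean = drop_last_pixel(tile)
restored = recover_tile(stored, mean)      # exact here, since the mean is an integer
```

When the true mean is not an integer, the rounding error is multiplied by the tile size in the recovered pixel, so a scheme that needs exact recovery would have to store the mean at higher precision.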


Fig 2 From top: 320x200 8-bit indexed colour Clown; zoomed-out colour mean thumbnail with (inset) thumb; zoomed-out corner value thumbnail with (inset) thumb.

Fig 1 From top: 256x256 gray-scale Lena; zoomed-out and zeroth thumbnail based on mean gray-scale in 4x4 blocks; zoomed-out and zeroth thumbnail based on corner value of 4x4 blocks.
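The two gray-scale thumbnail constructions named in the Fig 1 caption, block mean and corner value, can be sketched as follows. The function names are hypothetical, the image is assumed to be a list of rows of gray values with sides divisible by the block size, and "corner value" is assumed here to mean the top-left pixel of each block.

```python
def mean_thumbnail(image, block=4):
    """One pixel per block: the integer mean gray value of the block."""
    h, w = len(image), len(image[0])
    area = block * block
    return [[sum(image[y + dy][x + dx]
                 for dy in range(block) for dx in range(block)) // area
             for x in range(0, w, block)]
            for y in range(0, h, block)]

def corner_thumbnail(image, block=4):
    """One pixel per block: the top-left pixel (assumed meaning of 'corner value')."""
    return [row[::block] for row in image[::block]]

image = [[y * 8 + x for x in range(8)] for y in range(8)]  # toy 8x8 gray ramp
mean_thumb = mean_thumbnail(image)
corner_thumb = corner_thumbnail(image)
```

The corner-value version needs no arithmetic and its pixels are already part of the full image data, whereas the mean version averages out detail within each block.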


1.0 Introduction

and that image compression schemes should involve a role for thumb producti

In the emerging environment of massive image databases, where the display of a particular image at a workstation is mostly if not invariably preceded by its selection from a set of thumbnail images, a measure of image compression is proposed that takes into account the costs of multiple thumbs for each image accessed fully. Analysis suggests that maximising the effective compression requires efficiently transferred thumbs that contain data required for whole image generation. We show that for vector quantization of gray-scale images, a simple re-organization of coding data will usefully meet the needs of thumbs. For block-oriented fractal coding as by Jacquin, thumbnail capability is likewise notably improved. We discuss some of the limitations of JPEG coding, whose ISO standard was not keyed to the needs of thumb-based retrieval.

Thus the VQ code of an image comprises
(a) Header information
(b) Code - the set of tile numbers for each block
(c) The code book - an album of tiles/vectors
For a regularly blocked image, for which the code is an array, the header holds just the few bytes that specify image and block sizes, together with the display palette specification. High-efficiency VQ coding involves compression schemes to further reduce the overall data size, and the construction tree may be scattered through the code. However, the point to be made here is that there is no separate facility for thumbnails within the code, and the full VQ code needs to be received before an image thumbnail can be constructed.

3.21 Block oriented fractal coding

of gray-scale is the closest available (using Euclidean mean). The linear transform of gray-scale involves the two parameters a and b of a transform of the form p = aq + b. To decode a fractal coded image from an arbitrary starting image, each range block is replaced by a tile produced by the corresponding transformation. Rather than starting the iteration from an arbitrary initial image, use of an exploded thumbnail as the initial image will accelerate convergence, but offers no data savings.
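The iterative decoding described here can be sketched in a few lines. This is a toy illustration, not Jacquin's full scheme: the function names and the code layout (one tuple of domain row, domain column, a and b per range block, in raster order) are assumptions, domain blocks are shrunk by averaging, and block isometries are omitted. Each pass reads the previous image and writes a new one.

```python
def decode_step(image, code, r=4, d=8):
    """One decoding pass: every range pixel becomes p = a*q + b,
    where q is the matching pixel of the shrunk domain block."""
    n = len(image)
    f = d // r                                   # domain-to-range shrink factor
    out = [[0.0] * n for _ in range(n)]
    k = 0
    for ry in range(0, n, r):
        for rx in range(0, n, r):
            dy, dx, a, b = code[k]               # assumed code entry layout
            k += 1
            for y in range(r):
                for x in range(r):
                    cell = [image[dy + y * f + j][dx + x * f + i]
                            for j in range(f) for i in range(f)]
                    q = sum(cell) / len(cell)    # averaged domain pixel
                    out[ry + y][rx + x] = a * q + b
    return out

def decode(code, start, iterations=40, r=4, d=8):
    """Iterate the block transforms from a chosen starting image."""
    image = start
    for _ in range(iterations):
        image = decode_step(image, code, r, d)
    return image

# Toy 8x8 image: four 4x4 range blocks, each mapped from the whole 8x8 domain.
code = [(0, 0, 0.5, 10.0), (0, 0, 0.5, 50.0), (0, 0, -0.5, 100.0), (0, 0, 0.25, 200.0)]
from_black = decode(code, [[0.0] * 8 for _ in range(8)])
from_white = decode(code, [[255.0] * 8 for _ in range(8)])
```

With every |a| below 1, each pass shrinks the largest pixel difference between two candidate images, so both starts reach the same result; a start that is already close to it, such as an exploded thumbnail, simply needs fewer passes.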
