Image Indexing and Spatial Similarity-Based Retrievals

Imran Ahmad and William I. Grosky
Multimedia Information Systems Laboratory
Department of Computer Science
Wayne State University
Detroit, Michigan 48202, USA
[email protected], [email protected]

Abstract

Similarity-based retrieval is an essential requirement for retrieving images by content. In this paper, we introduce a new symbolic image representation technique and an indexing scheme for spatial similarity-based retrieval of images. In this technique, an image is recursively decomposed into a spatial arrangement of distinct features, preserving the spatial information of the image objects. The scheme is independent of image size, translation, and rotation, and is essentially domain independent. Quadtrees are used to manage the decomposition hierarchy and help in establishing a measure of similarity for retrieval. The scheme is incremental in nature and allows matching at various levels of detail, from coarse to fine. A two-phase indexing scheme based on the concepts of signature files and quadtree matching is constructed. The use of signature files prunes the search space by discriminating against non-matching entities, which are further eliminated during the coarser tree-matching process. For a given query, a facility is provided to rank-order the retrieved spatially similar images from the image database for subsequent browsing and selection by the user.


1 Introduction

Like traditional databases, an image database (IDB) is a collection of unique images in which each image represents some real-world information. Each of these images is composed of a set of different objects which, in turn, possess certain distinguishable features and spatial positions. Thus, the composition of these objects, their spatial arrangements, and their characteristics make each image unique. All image management systems, in one respect or another, exploit these identifiable characteristics and spatial positions of image objects to perform search and retrieval operations for user-provided queries. To take full advantage of the nature of the information embedded in images, it is essential to access images on the basis of their contents. Such search and retrieval operations are termed content-based retrievals [11]. Several schemes have been proposed to support such retrievals [10, 11, 14, 19, 23, 26, 27], using both relational and object-oriented database designs [16, 28]. Based on the nature and type of retrieval, content-based retrievals are further classified into different categories [12, 14, 29]. Prominent among these are:

   

- Retrieval by browsing
- Retrieval by attributes
- Retrieval by textual descriptions and keywords
- Retrieval by similarity measures
  - Retrieval by color and texture information
  - Retrieval by drawing a rough sketch
  - Retrieval by shape similarity
  - Retrieval by spatial similarity

Most widely discussed among these are similarity-based retrievals. However, to improve the overall performance of the search and retrieval process, some of the proposed data modeling and retrieval schemes combine more than one of the above-mentioned retrieval types. For example, QBIC [10] utilizes not only information about the shapes of the image objects but also takes advantage of their color and texture. Similarly, Chabot [28] combines the contextual and color information of images to perform content-based retrievals. Prominent and most widely discussed among the similarity-based retrievals are shape similarity-based retrievals and spatial similarity-based retrievals [12]. In shape similarity-based retrievals, the shapes of the image objects are important for retrieval; a facial image database is an example of such an application. In spatial similarity-based retrievals, the spatial orientation

of image objects and their mutual relationships are considered significant for retrieval. Such systems are used in applications such as geographic and medical information systems. A similarity-based search reduces the search space to a limited number of matched instances. The number of instances thus retrieved depends on the nature and measure of similarity specified in the query. In all such cases, a facility is needed to navigate or browse through the set of retrieved images to select the desired image(s) through some manual operations. Actual, or physical, images are large in size, are structured according to one of the available image formats such as JPEG, GIF, or TIFF, and may involve image compression standards as well, resulting in limited retrieval and computational efficiency. To overcome such problems, symbolic image representations are used [3], where a symbolic image is an abstraction of a physical image, providing physical and logical data independence. The use of symbolic images is limited to comparison purposes or to the development of index structures. The term indexing in image database systems primarily refers to the process of filtering unwanted images from the set of potential matching candidates, thus reducing the search space [2, 31]. Only once the measure of similarity is determined are the corresponding actual images retrieved from the database. Existing data modeling and symbolic image representation schemes, such as [1, 3, 7, 13, 18], either depend on extensive use of image processing techniques or are, in one respect or another, dependent on image orientation, scaling, or translation [1, 2], and they lack any indexing mechanism, resulting in non-polynomial time complexities. In this paper, we propose a new symbolic image representation and indexing scheme to support domain-independent spatial similarity-based retrievals. Our scheme is based on a hierarchical decomposition of an image space into a spatial arrangement of distinct features. Our scheme differs from the existing ones in several aspects:

1. It is independent of the original or query image size, orientation, and translation.
2. Its computational efficiency is independent of the number of pictorial objects in an image.
3. The computational time, and hence the retrieval efficiency, is improved by pruning the search space at the earlier stages through spatial filtering with image signatures.
4. It is incremental in nature and can be adapted to find a match at various levels of detail, from coarse to fine.
5. It allows rank-ordering of the retrieved images on the basis of their degree of similarity to the query image for subsequent browsing.

The remainder of this paper is organized as follows. Section 2 contains a review of some of the proposed similarity-based retrieval techniques. Section 3 describes our image representation scheme. Section 4 discusses our indexing and retrieval strategies. An analysis of the disk block accesses and storage requirements is presented in Section 5. Experimental results are presented in Section 6. Finally, Section 7 offers our conclusions and future research directions.

2 Related Work

One of the earliest and most widely discussed ideas for symbolic image representation, using 2D strings, was presented in [6]. In this scheme, a picture is considered as a matrix of symbols, where each symbol corresponds to an image object. The 2D string is a symbolic projection of the image symbols along the x and y axes. Since a query image can also be transformed into a 2D string, the problem of pictorial information retrieval then becomes an instance of the longest common subsequence problem. However, this is equivalent to finding the maximal clique of a graph [4, 13], which is NP-complete. Therefore, any such solution is not computationally feasible when dealing with images containing a large number of pictorial objects. The mechanics and encoding process of spatial information in a 2D string make it unfit for the complete description and effective representation of the spatial relations among the constituent objects of an arbitrarily complex image [22]. Chang and Jungert [5] extended the idea of 2D strings to generalized 2D strings (2D G-strings) by introducing additional local and derived binary operators and special cutting mechanisms to preserve the relative spatial positions of image components. By introducing a set of spatial relations among the objects to preserve the segmentation and shorten the length of the derived string, Lee et al. [20] proposed the idea of 2D C-strings. To further reduce the ambiguity in the derived string and provide a better description of the spatial relationships among image objects, by taking into account the relative size, location, and mutual distance among objects, Huang and Jean [18] introduced the 2D C+-string representation. Gudivada [13] introduced further concepts for the description of the spatial relationships of image objects.

Given four quadtrees T1, T2, T3, and T4 whose root nodes have occupancies m1, m2, m3, and m4, respectively, with m1 + m2 + m3 + m4 > 0, the tree whose root node has occupancy m1 + m2 + m3 + m4 and whose four subtrees are T1, T2, T3, and T4 is a quadtree. The node with occupancy m1 + m2 + m3 + m4 is the root of the resulting quadtree. The leaves of T1, T2, T3, and T4 are the leaves of the resulting quadtree. The coordinate sequence of the root node is the empty sequence. If seq is the coordinate sequence of a node in Tj, for 1 ≤ j ≤ 4, then j·seq is the coordinate sequence of this node in the resulting quadtree, where · is the sequence concatenation operator.
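To make the construction concrete, the following is a minimal sketch in Python, assuming a hypothetical Node class (not taken from the paper's implementation) in which occupancy is the number of feature points a node covers and children is either empty (a leaf) or a list of exactly four subtrees. It composes four quadtrees as described above and enumerates coordinate sequences by prefixing the quadrant index at each level.

```python
# A sketch only; Node, compose, and coordinate_sequences are illustrative names.

class Node:
    def __init__(self, occupancy, children=None):
        self.occupancy = occupancy      # number of feature points covered
        self.children = children or []  # [] for a leaf, else exactly 4 subtrees

def compose(t1, t2, t3, t4):
    """Build the quadtree whose root occupancy is m1 + m2 + m3 + m4."""
    total = t1.occupancy + t2.occupancy + t3.occupancy + t4.occupancy
    return Node(total, [t1, t2, t3, t4])

def coordinate_sequences(t, prefix=()):
    """Yield (coordinate sequence, node) pairs; the root's sequence is empty."""
    yield prefix, t
    for j, child in enumerate(t.children, start=1):
        # a node of the j-th subtree gets j prepended to its sequence
        yield from coordinate_sequences(child, prefix + (j,))
```

For example, composing four single-node trees with occupancies 1, 0, 2, and 0 yields a root with occupancy 3 whose four children have the coordinate sequences (1), (2), (3), and (4).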

The root node of quadtree T is denoted by root(T ) and the occupancy of node n of a quadtree is denoted by occupancy(n).

Definition 4.2 The level of a node, n, in a quadtree is the length of the coordinate sequence of that node and is denoted by level(n).

Definition 4.3 The height of a quadtree T, denoted by height(T), is one more than the maximum level of any node in the quadtree.

Definition 4.4 The ith approximation of a quadtree T, for i ≥ 0, is the quadtree which results by removing all nodes on levels j > i. It is denoted by T^(i).

Definition 4.5 A quadtree is complete iff each leaf node has an occupancy of 0 or 1.

Since a quadtree is recursive in nature, a recursive function is needed to compute the distance d between two quadtrees T and U. It must satisfy the following properties:

1. d(T, T) = 0
2. d(T, U) > 0 iff T ≠ U
3. d(T, U) = d(U, T)
4. If d(T^(i), U^(i)) > ε then d(T^(i+1), U^(i+1)) > ε

The first three of the above properties are self-explanatory. Property 4 says that if, at any approximation, the distance between the two trees up to a certain level is greater than the threshold, then no matter how many more levels of the two trees are compared, the distance between them will never fall below it. Thus, the two trees need not be examined at higher levels of detail. We now define a distance function d(T, U) which satisfies all of the above properties and computes the distance between two quadtrees T and U. It is given as follows:

Definition 4.6 Consider the following cases.

Case 1: Suppose height(T) = height(U) = 1 and occupancy(root(T)) + occupancy(root(U)) = 0. We then define

d(T, U) = 0.

Case 2: Suppose height(T) = height(U) = 1 and occupancy(root(T)) + occupancy(root(U)) > 0. For M = occupancy(root(T)) and N = occupancy(root(U)), we then define

d(T, U) = |M − N| / max(M, N).

Case 3: Suppose height(T) = 1, occupancy(root(T)) = 0, and height(U) > 1. We then define

d(T, U) = 1.

Case 4: Suppose height(T) = 1, occupancy(root(T)) = 1, height(U) > 1, and each child of the root node of U has an occupancy greater than 0. For N = occupancy(root(U)), we then define

d(T, U) = |N − 1| / N.

Case 5: Suppose height(T) = 1, occupancy(root(T)) = 1, height(U) > 1, and at least one child of the root node of U has an occupancy equal to 0. We then define

d(T, U) = 1.

Case 6: Suppose height(T) > 1 and height(U) > 1. For 1 ≤ j ≤ 4, let the subtrees of T and U determined by the nodes having coordinate sequence j be called Tj and Uj, respectively. Let occupancy(root(T)) = M and occupancy(root(U)) = N. For 1 ≤ j ≤ 4, let occupancy(root(Tj)) = mj and occupancy(root(Uj)) = nj. We then define

d(T, U) = max( Σ_{j=1}^{4} (mj/M) d(Tj, Uj), Σ_{j=1}^{4} (nj/N) d(Tj, Uj) ).
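The following is a minimal sketch of the distance of Definition 4.6, reusing the same hypothetical Node shape as in the earlier sketch (an occupancy plus a list of zero or four children); it is an illustration of the case analysis, not the paper's implementation. The symmetric cases with height(U) = 1 and height(T) > 1, which the definition leaves to the symmetry property, are handled by swapping the arguments.

```python
class Node:
    def __init__(self, occupancy, children=None):
        self.occupancy = occupancy      # number of feature points covered
        self.children = children or []  # [] for a leaf, else exactly 4 subtrees

def height(t):
    return 1 if not t.children else 1 + max(height(c) for c in t.children)

def distance(t, u):
    """Distance d(T, U) of Definition 4.6 (Cases 1-6)."""
    ht, hu = height(t), height(u)
    if ht == 1 and hu == 1:
        m, n = t.occupancy, u.occupancy
        if m + n == 0:                                  # Case 1
            return 0.0
        return abs(m - n) / max(m, n)                   # Case 2
    if ht == 1 and hu > 1:
        if t.occupancy == 0:                            # Case 3
            return 1.0
        if all(c.occupancy > 0 for c in u.children):    # Case 4
            return (u.occupancy - 1) / u.occupancy
        return 1.0                                      # Case 5
    if hu == 1 and ht > 1:                              # symmetric cases via d(T, U) = d(U, T)
        return distance(u, t)
    # Case 6: both heights > 1; assumes internal nodes have positive occupancy
    m, n = t.occupancy, u.occupancy
    subs = [distance(tj, uj) for tj, uj in zip(t.children, u.children)]
    left = sum(tj.occupancy / m * dj for tj, dj in zip(t.children, subs))
    right = sum(uj.occupancy / n * dj for uj, dj in zip(u.children, subs))
    return max(left, right)
```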

We now have the following theorem:

Theorem 4.1 Let T and U be two complete quadtrees. Then, for i ≥ 0,

d(T^(i), U^(i)) ≤ d(T^(i+1), U^(i+1)).

Proof: Base case, i = 0.

Case 1: height(T) = height(U) = 1. Then we have T^(0) = T^(1) and U^(0) = U^(1), and our result follows.

Case 2: height(T) = 1, occupancy(root(T)) = 0, and height(U) > 1. It follows that T^(0) = T^(1) = T and height(T^(1)) = 1. Suppose that occupancy(root(U^(0))) = N. We then have that N > 0, from which it follows by Case 2 of Definition 4.6 that d(T^(0), U^(0)) = 1. Since height(U^(1)) > 1, Case 3 of Definition 4.6 gives d(T^(1), U^(1)) = 1, and our result follows.

Case 3: height(T) = 1, occupancy(root(T)) = 1, and height(U) > 1. It follows that T^(0) = T^(1) = T and height(T^(1)) = 1. Suppose that occupancy(root(U^(0))) = N. We then have that N > 0, from which it follows by Case 2 of Definition 4.6 that d(T^(0), U^(0)) = (N − 1)/N. Now, height(U^(1)) = 2. Thus, from Cases 4 and 5 of Definition 4.6, we have that d(T^(1), U^(1)) ≥ (N − 1)/N, from which our result follows.

Case 4: height(T) > 1 and height(U) > 1. For 1 ≤ j ≤ 4, let the subtrees of T and U determined by the nodes having coordinate sequence j be called Tj and Uj, respectively. Let occupancy(root(T)) = M and occupancy(root(U)) = N. For 1 ≤ j ≤ 4, let occupancy(root(Tj)) = mj and occupancy(root(Uj)) = nj. Without loss of generality, assume that M ≥ N. Then, by Case 2 of Definition 4.6, we have that d(T^(0), U^(0)) = (M − N)/M.

Consider the quantity Q = Σ_{j=1}^{4} (mj/M) · |mj − nj| / max(mj, nj), where, if mj = nj = 0, the jth summand is equal to 0. For 1 ≤ j ≤ 4, if mj ≥ nj, then (mj/M) · |mj − nj| / max(mj, nj) = (mj − nj)/M, while if mj < nj, then (mj/M) · |mj − nj| / max(mj, nj) ≥ 0 > (mj − nj)/M. Thus,

Q ≥ Σ_{j=1}^{4} (mj − nj)/M = (M − N)/M.

Now, d(T^(1), U^(1)) ≥ Q. Thus, d(T^(1), U^(1)) ≥ Q ≥ (M − N)/M = d(T^(0), U^(0)), and our result follows.

Induction hypothesis: Assume that the theorem is true for i = 0, ..., k, where k ≥ 1. We will show that the theorem is true for i = k + 1. For 1 ≤ j ≤ 4, let the subtrees of T and U determined by the nodes having coordinate sequence j be called Tj and Uj, respectively. Let occupancy(root(T)) = M and occupancy(root(U)) = N. For 1 ≤ j ≤ 4, let occupancy(root(Tj)) = mj and occupancy(root(Uj)) = nj.

We then have that the subtrees of T^(k) and U^(k) determined by the nodes having coordinate sequence j are Tj^(k−1) and Uj^(k−1), respectively. Also, the subtrees of T^(k+1) and U^(k+1) determined by the nodes having coordinate sequence j are Tj^(k) and Uj^(k), respectively. We also have that

d(T^(k), U^(k)) = max( Σ_{j=1}^{4} (mj/M) d(Tj^(k−1), Uj^(k−1)), Σ_{j=1}^{4} (nj/N) d(Tj^(k−1), Uj^(k−1)) )

and

d(T^(k+1), U^(k+1)) = max( Σ_{j=1}^{4} (mj/M) d(Tj^(k), Uj^(k)), Σ_{j=1}^{4} (nj/N) d(Tj^(k), Uj^(k)) ).

But, by the induction hypothesis, d(Tj^(k−1), Uj^(k−1)) ≤ d(Tj^(k), Uj^(k)), and our result follows. QED.

Theorem 4.1 is quite important, as it implies that for two trees T and U, if d(T^(i), U^(i)) > ε then d(T^(i+1), U^(i+1)) > ε, where ε ≥ 0 is the allowable deviation in similarity, or threshold, and i ≥ 0. This means that we can walk down both the database tree and the query tree and observe whether the distance between the given approximations of these trees is larger than our threshold. If this is the case, we may eliminate the database tree from further consideration, since all subsequent approximations, and indeed the original trees themselves, will have a distance from each other larger than the given threshold.

In order to analyze the efficiency of computing the distance d, we introduce the following additional concepts. For two quadtrees T and U, we say T ⊑ U iff for every node of T, there is a node of U having the same coordinate sequence. Now, for any two quadtrees T and U, there is a greatest lower bound of T and U, denoted by glb(T, U). This tree has the following properties:

1. glb(T, U) ⊑ T

2. glb(T, U) ⊑ U
3. If W ⊑ T and W ⊑ U, then W ⊑ glb(T, U)

Let T and U be two quadtrees. We define the path tree of T and U, path(T, U), to be glb(T, U) with every edge e labeled by two quantities, T(e) and U(e). These labels are defined as follows. Let e be an edge of path(T, U). There is then a corresponding edge eT of tree T as well as a corresponding edge eU of tree U. Let edge eT join a parent node pT and a child node cT, while eU joins a parent node pU and a child node cU. We define

T(e) = occupancy(cT) / occupancy(pT)

and

U(e) = occupancy(cU) / occupancy(pU).

Let n be a node in path(T, U). Now, n determines an edge path from the root of path(T, U) to the node itself. Let this edge path be (e1, e2, ..., eh). We define a path product of n to be a product of the form

P(e1) · P(e2) · ... · P(eh), where, for 1 ≤ i ≤ h, either P(ei) = T(ei) or P(ei) = U(ei).

Now, consider two quadtrees T and U. Suppose we want to calculate d(T^(i), U^(i)). We first find path(T^(i), U^(i)) and let L be the set of leaf nodes of this tree. Then, from the definition of d, it follows that d(T^(i), U^(i)) is of the form

Σ_{n∈L} c_n^(i) · d_n^(i),

where c_n^(i) is a path product of node n and d_n^(i) is d(T_n^(i), U_n^(i)), for T_n^(i) the subtree of T^(i) determined by the node corresponding to node n and U_n^(i) the subtree of U^(i) determined by the node corresponding to node n.
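As a rough illustration of the path tree and its edge labels, the sketch below pairs the corresponding nodes of T and U (the structure of glb(T, U)) and records the two labels T(e) and U(e) on each edge. It assumes the same hypothetical Node shape as in the earlier sketches and that internal nodes have positive occupancy.

```python
def path_tree(t, u):
    """Return (t_node, u_node, edges), where edges is a list of
    (label_T, label_U, child_path_tree) triples mirroring glb(T, U)."""
    if not t.children or not u.children:
        return (t, u, [])                      # a leaf of the path tree
    edges = []
    for tj, uj in zip(t.children, u.children):
        label_t = tj.occupancy / t.occupancy   # T(e) for this edge
        label_u = uj.occupancy / u.occupancy   # U(e) for this edge
        edges.append((label_t, label_u, path_tree(tj, uj)))
    return (t, u, edges)
```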





d(T_n^(i), U_n^(i)) for increasing i would be easy to calculate if c_n^(i), for a leaf node n of path(T^(i), U^(i)), depended only on c_{n'}^(i−1), for n' the parent of n, since the latter would already have been calculated during the computation of d(T^(i−1), U^(i−1)) via the tree path(T^(i−1), U^(i−1)). From the previous definitions of c_n^(i) and c_{n'}^(i−1), it would be quite nice if either c_n^(i) = c_{n'}^(i−1) · T(e) or c_n^(i) = c_{n'}^(i−1) · U(e) for the edge e joining the nodes n' and n. However, this is not the case in general. We illustrate these concepts with the following quadtrees. Let T be a quadtree whose root node has occupancy 100 and whose four children have occupancies 20, 50, 10, and 20, each of which is decomposed one level further,

and let U be a quadtree whose root node has occupancy 90 and whose four children have occupancies 10, 50, 10, and 20, again with each child decomposed one level further.

In order to compute d(T^(1), U^(1)), we consult the path tree path(T^(1), U^(1)), which consists of a root with four edges e1, e2, e3, and e4 leading to its leaves.

We have that T(e1) = 20/100, T(e2) = 50/100, T(e3) = 10/100, T(e4) = 20/100, U(e1) = 10/90, U(e2) = 50/90, U(e3) = 10/90, and U(e4) = 20/90. We then get that

d(T^(1), U^(1)) = T(e1) · 1/2 + T(e2) · 0 + T(e3) · 0 + T(e4) · 0.

In order to compute d(T^(2), U^(2)), we consult the path tree path(T^(2), U^(2)), which extends the previous path tree with four leaf edges under each of e1 through e4, labeled e5 through e20. For 5 ≤ i ≤ 20, T(ei) and U(ei) are found as above, by consulting the quadtrees T and U. We then get that

d(T^(2), U^(2)) = U(e1)[U(e5) · 1/2 + U(e6) · 1/2 + U(e7) · 1/2 + U(e8) · 1/2] +
                 U(e2)[U(e9) · 1 + U(e10) · 1 + U(e11) · 1 + U(e12) · 1] +
                 U(e3)[U(e13) · 0 + U(e14) · 0 + U(e15) · 0 + U(e16) · 0] +
                 U(e4)[U(e17) · 0 + U(e18) · 0 + U(e19) · 0 + U(e20) · 0],

rather than having

d(T^(2), U^(2)) = T(e1)[U(e5) · 1/2 + U(e6) · 1/2 + U(e7) · 1/2 + U(e8) · 1/2] +
                 T(e2)[U(e9) · 1 + U(e10) · 1 + U(e11) · 1 + U(e12) · 1] +
                 T(e3)[U(e13) · 0 + U(e14) · 0 + U(e15) · 0 + U(e16) · 0] +
                 T(e4)[U(e17) · 0 + U(e18) · 0 + U(e19) · 0 + U(e20) · 0].

However, computing d(T, U) for two quadtrees T and U is still relatively efficient.

Theorem 4.2 Let T and U be two quadtrees such that both T^(i) and U^(i), for some i ≥ 0, are complete trees. The time complexity of computing d(T^(0), U^(0)), ..., d(T^(ℓ−1), U^(ℓ−1)) is O(|path(T, U)|), where ℓ is the number of levels in the path tree path(T, U) and |path(T, U)| is its cardinality (number of nodes).

Proof: In order to compute d(T^(0), U^(0)), only the root of the path tree path(T, U) is significant. However, it has an out-degree of 4, and all of its children contribute in determining the distance d(T^(1), U^(1)) at the following approximation. In order to compute the distance between the two trees for each subsequently increasing approximation, we need to consider the incremental contribution of the 4 children of each of the nodes at the current approximation of the path tree. Therefore, the total number of nodes that need to be traversed to find the distance for an approximation h, 0 ≤ h < ℓ, is

1 + 4^1 + 4^2 + 4^3 + ... + 4^h = Σ_{i=0}^{h} 4^i = (4^(h+1) − 1)/3.

Thus, to compute the sequence d(T^(0), U^(0)), ..., d(T^(ℓ−1), U^(ℓ−1)), it takes

Σ_{h=0}^{ℓ−1} (4^(h+1) − 1)/3 = (4^(ℓ+1) − 3ℓ − 4)/9

node traversals. Therefore, the time complexity of computing the overall distance is O(|path(T, U)|). QED.

Though application dependent, a suitable value of ε is critical to the filtering process and offers a trade-off between the number of candidate trees retained as potential matches and the number that must be examined to make that determination. A value close to 1 ensures that none of the desirable trees are left out, but it results in a large selection of trees as potential matches, including trees which are not really similar to the query image tree at all. Similarly, a small value reduces the search space, but it also introduces the danger of leaving out some of the potential candidates.
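As a sketch of how Theorem 4.1 supports this filtering, the loop below walks the query tree and the candidate database trees level by level and discards a candidate as soon as the distance between the current approximations exceeds the threshold ε, since later approximations can only be farther apart. It reuses the hypothetical Node class and the distance function from the sketch following Definition 4.6; a real implementation would compute successive approximations incrementally, as in the proof of Theorem 4.2, rather than rebuilding them.

```python
def approximation(t, i):
    """Return T^(i): the quadtree with all nodes below level i removed."""
    if i == 0 or not t.children:
        return Node(t.occupancy)
    return Node(t.occupancy, [approximation(c, i - 1) for c in t.children])

def filter_candidates(query, candidates, epsilon, max_level):
    """Keep only candidates whose approximation distances stay within epsilon."""
    survivors = list(candidates)
    for i in range(max_level + 1):
        q_i = approximation(query, i)
        survivors = [t for t in survivors
                     if distance(approximation(t, i), q_i) <= epsilon]
    return survivors
```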

5 Expected Disk Block Access and Storage Requirements

In this section, we briefly analyze the storage requirements for both the signature files and the quadtree structures, together with a brief analysis of the number of disk block accesses needed to find matching signatures in the signature file. In order to estimate the number of unsuccessful block matches [32], let A be the number of records containing the terms S1 and S2. The probability that a particular block contains, among the signatures stored on it, one or more records containing both S1 and S2 is estimated as

p(A) = 1 − (1 − 1/N)^(A·Nb).

Since, in general, A ≪ N, p(A) can be approximated as

p(A) ≈ A·Nb/N = A/Ns.

The probability of a false match can be made small by a careful selection of the signature size (W) and the number of bits (b) set in a signature, as these directly affect the number of block accesses and the I/O time. In order to calculate the disk space overhead associated with the signature files, we define:

p = disk page capacity (bits)
Wi = size of an image id (bits)
Ws = size of the image signature (bits)
Wb = size of the block signature (bits)
q = size of a disk page pointer (bits)
Nb = number of signature blocks
Ns = number of signatures per block
Ns(disk) = number of disk blocks for signatures
Ntot = number of signatures in the signature file
Ndisk = total number of disk blocks
nf = number of feature points
n = total number of nodes in a quadtree
N = normalized max(height, width) of an image
h = log4 N, the height of the quadtree
L = |seq|, the length of the path sequence
b = number of bits for direction encoding

Therefore:

Number of signatures per disk page: Ns = ⌊p / (Ws + Wi)⌋

Total number of disk blocks for image signatures: Ns(disk) = ⌈N / ⌊p / (Ws + Wi)⌋⌉

Total number of disk blocks for block signatures: ⌈N / ⌊p / (Wb + q)⌋⌉, with Nb = Ns(disk)

Total number of disk blocks: Ndisk = Nb + Ns(disk)

To find matching signatures in a single-level indexing scheme, a total of Ns(disk) blocks are accessed. However, with two-level indexing, the number of disk blocks accessed, Ndisk(2), is given as

Ndisk(2) = ⌈p(A) · Ndisk⌉ = A(1 + Nb/Ns).

For a large N, Ndisk(2) ≪ Ns(disk), and hence the number of disk block accesses is significantly reduced.
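As a rough numerical illustration, the snippet below evaluates the expressions above with hypothetical parameter values (a 4 KB page, 512-bit image signatures, 32-bit image ids, 256-bit block signatures, 32-bit page pointers, and 10,000 image signatures); none of these figures come from the paper.

```python
import math

p = 4096 * 8            # disk page capacity in bits
Ws, Wi = 512, 32        # image signature and image id sizes (bits)
Wb, q = 256, 32         # block signature and disk page pointer sizes (bits)
N = 10_000              # image signatures in the signature file
A = 25                  # signatures matching both query terms

Ns = p // (Ws + Wi)                  # signatures per disk page
Ns_disk = math.ceil(N / Ns)          # disk blocks holding image signatures
Nb = math.ceil(N / (p // (Wb + q)))  # disk blocks holding block signatures

accesses_two_level = A * (1 + Nb / Ns)   # approximate accesses with two levels
print(Ns, Ns_disk, Nb, round(accesses_two_level, 1))
```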

Table 2: Characteristics of the image data set

Number of original images: 200
Images and their variants: 1200
Feature points: 2 - 25
Rotational variants per image: 1
Scale variants per image: 2
Translation variants per image: 2
Image size (min): 100 x 100
Image size (max): 1024 x 1024

The storage scheme for the quadtree of a symbolic image offers a trade-off between comparison performance and space requirements. Storing a complete quadtree results in optimum computational speed, but it requires a large amount of disk space; moreover, since typically nf ≪ n, most of this space would be occupied unnecessarily. On the other hand, the path sequence seq of each feature point completely describes its spatial orientation and does not result in wasteful memory requirements, at the expense of extra processing time to reconstruct the quadtrees on an as-needed basis. For our discussion, we assume that only the path sequences of each of the quadtrees are stored in memory. We further assume that all of the quadtrees are full, and hence that all of the path sequences are of the same length. Since we are dealing with only 4 directional relations, this information can be encoded in two bits (b = 2 bits). For a quadtree of height h, the maximum length of the path sequence for any feature point is also limited to h, and consequently only 2h bits are needed to encode the complete sequence for each feature point. Hence, all of the feature points in a quadtree require only 2·nf·h bits.
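A small worked example of this estimate, using the data set limits from Table 2 as illustrative values: for the largest images, N = 1024, so the quadtree height is h = log4 1024 = 5, and an image with the maximum of nf = 25 feature points needs at most 2·nf·h = 250 bits for all of its path sequences.

```python
import math

N = 1024                         # normalized max(height, width) of an image
h = round(math.log(N, 4))        # quadtree height: log4(1024) = 5
nf = 25                          # feature points in the image
b = 2                            # bits per direction (4 quadrants)

bits_per_point = b * h           # at most 10 bits per feature point
bits_per_image = nf * b * h      # 250 bits for the whole image
print(h, bits_per_point, bits_per_image)
```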

6 Experimental Results

For our experiments, the image database consists of 1200 random images of arbitrary size and complexity. Each image contains between 2 and 25 feature points. There are 200 basic images, the rest consisting of a known number of geometric variants. These images are essentially independent of one another. The general characteristics of the image data set are summarized in Table 2.

Each of the 200 basic images in the database has 5 different variants in terms of the three essential geometric transformations: scaling, translation, and rotation. The data set consists of 2 scale variants and 2 translation variants of each image, along with a single rotation variant. Our experiments are based on a range of tolerance factor (TF) values to determine successful retrievals. We have tested our approach with TF ∈ {0.0, 0.1, 0.2} to create signatures for the database images. When TF = 0.0, we must have an exact match. The results are collected for different approximations and averaged over the 200 original images in the database. It is important to note that since each of the 200 images has 5 variants, each image in the database is known to have exactly 6 instances. Each query is composed of one of the known images in the database; therefore, for each instance, we can predict the minimum number of retrievals. For TF = 0.0, an exact match, there should be only 6 matched instances for each image. However, since the number of feature points in an image ranges between 2 and 25 and the process of recursive decomposition in our current implementation stops when all of the feature points are in distinct quadrants, the chances of getting a false match are, in general, much higher for images with only 2 or 3 feature points. Moreover, based on the distribution of feature points, it is also possible to have only one decomposition of the feature image and hence a maximum of two approximations. In such cases, however, the accuracy can be improved significantly by extending the decomposition hierarchy. With higher approximations, the chances of getting false matches are reduced due to the increase in the decomposition depth of the feature image. A comparison of the results for the three values of TF is given in Figure 7. These results are in accordance with our proposed theory. Initially, at the root level, or 0th approximation, we start with a large number of successful matches. This is due to the fact that a number of different trees may have the same root occupancy. However, as proved earlier, this number is reduced significantly for subsequently increasing approximations. The best results are obtained by matching the entire trees; the entire tree, or nth approximation, is indicated by "Tree" in Figure 7. To find the matching distance at the leaf level, we adopted two different approaches. In the first approach, the two trees are compared as is, without any consideration of the relative position of the feature point within the enclosing quadrants. However, this approach may not be suitable in cases where a match at finer levels is sought or when the size of the quadrants is quite large. The results of these experiments for successful matches are shown by the vertical bars in Figure 7. In a second set of experiments, we computed the matching distance by taking into account the size of the quadrant as well.

[Figure 7: ith approximation vs. average number of matched instances for TF = (a) 0.0, (b) 0.1, and (c) 0.2. Each panel plots the average number of matched signatures and of successful matches (with and without quadrant size matching) against approximation levels 0 through 5 and the full tree.]

The newly computed distance between two subtrees T and U, for the feature points F_T = (x_T, y_T) and F_U = (x_U, y_U) respectively, is given as

d(T, U) = sqrt((x_T − x_U)^2 + (y_T − y_U)^2) / diagonal(T, U),

where diagonal(T, U) is the length of the diagonal of the larger of the two quadrants containing the feature points F_T and F_U, and occupancy(root(T)) = occupancy(root(U)) = 1. This is a change to Case 2 of Definition 4.6 of d(T, U). The results obtained for successful matches in this case are shown by the lines crossing the vertical bars in Figure 7. It is worth noting that in both cases, we started with an identical number of matching signatures for each of the approximations. The number of successful matches also remained the same for the 0th approximation but starts to differ for subsequent approximations. The number of successfully matched instances in all cases where the size of the sub-quadrant is considered is smaller than in those cases where no such comparison is made, and this holds consistently across the increasing approximations. For the last approximation, where an entire tree is matched against another tree, a comparison of the average number of retrieved signatures and successful matches for different values of the tolerance factor is given in Figure 8. Note that the average number of successful matches in the experiments where the computed distance is independent of the position of a feature point in its enclosing sub-quadrant is 6.71 for TF = 0.0, 7.79 for TF = 0.1, and 8.87 for TF = 0.2, whereas in the experiments where the overall distance is computed by considering the relative position of each feature point in its enclosing sub-quadrant, it is 6.31 for TF = 0.0, 7.00 for TF = 0.1, and 7.54 for TF = 0.2. These results are obtained by choosing a known image from the database and finding the images similar to it with a threshold ε = 0.5. As expected, TF = 0.0 resulted in fewer comparisons, and this number increased significantly for the other TFs. In such cases, although we started with a large number of images, due to the characteristics of the distance function only those trees qualified which were similar to the query image within the threshold. We also collected the average CPU time needed to determine the suitability of a match for both of the above-mentioned cases. In each of these cases, the CPU time is lowest for TF = 0.0, since we start with a smaller number of matching signatures, whereas for the other values of TF the CPU time increases due to the increased number of matched signatures. As expected, in all experiments the CPU time is lowest for the 0th approximation. However, because of the more involved computations, it increases significantly for higher approximations, with the highest being for the finest approximation, when an entire query tree is compared against the potential candidates.
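A minimal sketch of this modified leaf-level distance: the Euclidean distance between the two feature points normalized by the diagonal of the larger of their enclosing quadrants. Passing the quadrants as (width, height) pairs is an illustrative choice, not the paper's interface.

```python
import math

def leaf_distance(f_t, f_u, quad_t, quad_u):
    """f_t, f_u: (x, y) feature points; quad_t, quad_u: (width, height) of
    the quadrants enclosing them (both subtrees have occupancy 1)."""
    dx, dy = f_t[0] - f_u[0], f_t[1] - f_u[1]
    diagonal = max(math.hypot(*quad_t), math.hypot(*quad_u))
    return math.hypot(dx, dy) / diagonal
```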

[Figure 8: A comparison of the average number of retrieved signatures, successful matches, and non-matches in the case of complete tree matches for different TFs, (a) when quadrant size is not considered and (b) when quadrant size is considered.]

The average CPU time, for all three values of TF and all approximations, is consistently higher when the final distance is based on the relative position of each individual feature point in its enclosing sub-quadrant. A comparison of the average CPU times for both of these cases, for the different approximations and all three values of TF, is given in Figure 9. These times were obtained on a 100 MHz Pentium PC running Linux 1.3.18.

[Figure 9: A comparison of average CPU time (in tenths of a second) for the various approximations and TFs, with and without quadrant size matching.]

7 Conclusion and Research Directions

In this paper, we have presented a symbolic image representation and indexing scheme which differs from earlier proposed techniques in many respects. As it stands, the scheme is independent of the size, translation, and orientation of the query images. The symbolic representation does not involve any string

comparisons, the search space is reduced significantly in the earlier phases of comparison to improve retrieval times, and non-matches are eliminated early, owing to its incremental nature. We are working on schemes to find a match at finer levels of detail by extending the decomposition hierarchy. This will allow us to locate a feature point at more precise positions for applications requiring greater levels of detail, such as medical systems. We are also working on extensions of this scheme to incorporate domain-dependent spatial similarity-based retrievals, as well as on devising schemes to increase the flexibility of the querying procedures. This extension will enable us to formulate queries by simply describing the spatial relationships among the objects in terms of their relative positions. As a future extension of this work, the methodology can be generalized to 3 dimensions in order to retrieve similar video segments based on the motion of individual objects.

References

[1] Y. Alp Aslandogan, Chuck Thier, Clement T. Yu, and Chengwen Liu. Design, Implementation and Evaluation of SCORE (A System for COntent based REtrieval of Pictures). In Proceedings of the 11th IEEE International Conference on Data Engineering, pages 280-287, March 1995.

[2] A. Del Bimbo, P. Pala, and S. Santini. Visual Image Retrieval by Elastic Deformation of Object Shapes. In Proceedings of the IEEE Symposium on Visual Languages, pages 216-223, October 1994.

[3] Chin-Chen Chang and Tzong-Chen Wu. Retrieving the Most Similar Symbolic Pictures from Pictorial Databases. Information Processing and Management, 28(5):581-588, 1992.

[4] Chin-Chen Chang and Tzong-Chen Wu. An Exact Match Retrieval Scheme Based Upon Principal Component Analysis. Pattern Recognition Letters, 16(5):465-470, May 1995.

[5] S. K. Chang, E. Jungert, and Y. Li. Representation and Retrieval of Symbolic Pictures using Generalized 2D Strings. In Visual Processing and Image Processing, pages 1360-1372. SPIE, 1989.

[6] S. K. Chang and S. H. Liu. Iconic Indexing by 2-D Strings. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 9(3):413-428, May 1987.

[7] Shi-Kuo Chang, Qing-Yun Shi, and Cheng-Wen Yan. Iconic Indexing by 2D Strings. In 1986 IEEE Computer Society Workshop on Visual Languages, pages 12-21, Dallas, Texas, June 1986.

[8] Chris Faloutsos and Stavros Christodoulakis. Signature Files: An Access Method for Documents and Its Analytical Performance Evaluation. ACM Transactions on Office Information Systems, 2(4):267-288, October 1984.

[9] Christos Faloutsos. Access Methods for Text. ACM Computing Surveys, 17(1):49-74, March 1985.

[10] Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, and Peter Yanker. Query by Image and Video Content: The QBIC System. IEEE Computer, 28(9):23-32, September 1995.

[11] William I. Grosky and Zhaowei Jiang. Hierarchical Approach to Feature Indexing. Image and Vision Computing, 12(5):275-283, June 1994.

[12] William I. Grosky and Rajiv Mehrotra. Image Database Management. In Advances in Computers, pages 237-291. Academic Press, N.Y., 1992.

[13] Venkat Gudivada.