
TDA - A Method for Topologically Indexing Spatial Attributes

Maurício Riguette Mediano, Marco Antônio Casanova, Marcelo Gattass
Computer Science Department, Pontifícia Universidade Católica do Rio de Janeiro, Rua Marquês de São Vicente, 225, 22453-900, Rio de Janeiro, RJ, Brazil
mediano,casanova,[email protected]

Abstract

This work presents a method for topologically indexing a set of spatial attributes, called TDA (Topological Data Access Method). The TDA method maintains a topological data structure for each set of spatial attributes for which one wishes to optimize the calculation of topological operators. The topological data structure, which is internally manipulated by the geographic database management system, is responsible for generating the faces, vertices, and edges that compose the topological representation of each spatial attribute. In certain circumstances, the topological representation is several times more compact than the exact geometric representation of the spatial attributes, thus offering a significant gain in the processing of topological operators. This work first presents the TDA method in an abstract way, including the formalism used to define the geometric representation of spatial attributes. Then it details the topological data structures used. Finally, it discusses issues related to the persistent storage of the data structures underlying the method.

1 Problem and Contribution

Geographic Information Systems (GIS) are information systems built especially to store, analyze, and manipulate geographic objects, that is, objects that represent artifacts or phenomena for which the geographic location is an inherent and essential characteristic [CCH+96]. GIS typically offer spatial query languages that explicitly incorporate spatial operators describing relationships (or operations) among geographic objects. Among the various spatial operators are the topological operators, such as adjacent to and intersects, which define strictly topological relationships among geographic objects. Although such operators are very useful for geographic analysis, they are very costly to evaluate when the geometric representation of the objects’ geographic locations is used directly.

Therefore, this work proposes to store a representation of the topological relationships among the geographic objects and to use this information, instead of the geometric representation of the objects’ geographic locations, to optimize the processing of the topological operators. In certain circumstances, this additional representation, called the topological representation, is several times more compact than the exact geometric representation, thus providing a considerable gain in query processing cost. The method that generates and maintains this representation is called TDA (Topological Data Access Method).

Almost all efforts to optimize the performance of spatial queries center on the development of data structures adequate to store and retrieve the geometric representation of the objects’ geographic location. The R-tree proposed by Guttman [Gut84] was one of the most outstanding access methods due to its simplicity and efficiency. Its variant, the R*-tree proposed by Beckmann and others [BKSS90], has been widely adopted as a spatial access method for optimizing queries involving spatial attributes. Other variants were proposed, such as the grouping techniques using R*-trees presented by Brinkhoff and Kriegel [BK94]. Gaede [GG98] presents a chronology with the history of 54 access methods of this kind. Nievergelt and Widmayer [NW98] propose some criteria for classifying multidimensional access methods. Saalfeld [Saa98] presents methods to sort spatial entities according to proximity in space, called tree-orderings, which can be constructed from topological data structures. According to Martha [Mar89], “The great contribution of the use of topological data structures is that (adjacency) queries to the database are executed locally using algorithms whose complexity is, at worse, linear on the number of topological elements of the result.”

Deciding which topological operators to adopt is a topic in its own right. Egenhofer and Franzosa [EF91] proposed a framework for defining topological representations. From a 4-intersection matrix $M_{4IM}$, 16 possible combinations of intersections between the boundary ($\partial$) and the interior ($^\circ$) of two spatial attributes of dimension 2, $s_{o_i}$ and $s_{o_j}$, are generated. Egenhofer and Herring [EH90] enlarged the method’s matrix from 4 intersections to 9 intersections. Matrix $M_{9IM}$ of the 9-intersection method generates a total of 512 intersection combinations between the boundary, the interior, and the exterior ($^-$) of two spatial attributes of dimensions 0, 1, or 2. Egenhofer [Ege93] has proven that the 9-intersection method is more adequate for treating spatial attributes of dimension 1.

Clementini and others [CFvO93] proposed a set of 8 operators called the Calculus-Based Method (CBM). In the same work, the authors defined yet another method, called the dimension-extended 4-intersection method (DE+4IM), in which the dimension of the results of the intersections in the matrix $M_{4IM}$ is taken into account. Finally, the authors proved that the operators of the CBM method are more expressive than the ones of the DE+4IM method. Clementini and Di Felice [CF95] defined another method, called the dimension-extended 9-intersection method (DE+9IM), in which the dimension of the results of the intersections in the $M_{9IM}$ matrix is taken into account.

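\[
M_{DE+9IM} = \begin{pmatrix}
\dim(\partial s_{o_i} \cap \partial s_{o_j}) & \dim(s^\circ_{o_i} \cap \partial s_{o_j}) & \dim(s^-_{o_i} \cap \partial s_{o_j}) \\
\dim(\partial s_{o_i} \cap s^\circ_{o_j}) & \dim(s^\circ_{o_i} \cap s^\circ_{o_j}) & \dim(s^-_{o_i} \cap s^\circ_{o_j}) \\
\dim(\partial s_{o_i} \cap s^-_{o_j}) & \dim(s^\circ_{o_i} \cap s^-_{o_j}) & \dim(s^-_{o_i} \cap s^-_{o_j})
\end{pmatrix}
\]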

The authors have also shown that the CBM method is equivalent to the DE+9IM method.

In the present work, we present two types of topological representations: the Complete Topological Representation (CTR) and the Reduced Topological Representation (RTR). Topological representations are proposed to calculate detailed topological operators of complex attributes (Clementini and Di Felice [CF96]) in the refinement step of the storage and access architecture proposed by Brinkhoff and others [BHKS93]. We also propose algorithms to insert, delete, and calculate the topological operators of the $M_{DE+9IM}$ matrix. The work also includes tests with vegetation data in which the topological representations generated were magnitudes smaller than the respective geometric representations.

This work is organized as follows. Section 2 contains a summary of the concepts used throughout this work, including the formalism used in the definition of complex spatial attributes, an intuitive definition of planar subdivision, and the geometric representation of the objects’ geographic location. Section 3 describes the TDA topological indexing method. Section 4 discusses persistent storage of the data structure underlying the method. Section 5 presents the insertion and removal algorithms of the TDA method. Section 6 presents the algorithms for the operators of the $M_{DE+9IM}$ matrix in the TDA method. Section 7 shows test results with geographic data. Finally, Section 8 presents conclusions and future work.


2 Preliminary Concepts

2.1 Basic Concepts

A spatial attribute $s_{o_i}$ of a geographic object $o_i$ is any attribute of $o_i$ which contains a definition of the geographic location of $o_i$. A geographic object may naturally have more than one spatial attribute. In the context of this work, we are interested in spatial attributes whose values represent, in the general case, sets of complex points, complex lines, or complex areas, defined according to the point-set topology of Clementini and Di Felice [CF96], summarized in Section 2.2. We call the internal representation of the value of a spatial attribute $s_{o_i}$ the geometric representation $g(s_{o_i})$ of that spatial attribute. The remainder of this section summarizes the main definitions concerning point-set topology, briefly discusses the concept of planar subdivision, and defines the geometric representations of complex points, complex lines, and complex areas adopted here.

2.2 Concepts in Point-Set Topology

Following Egenhofer and Franzosa [EF91], given a complex point, line, or area $T$, we use the notation $T^\circ$, $\partial T$, $T^-$, $\overline{T}$, and $\dim(T)$ to denote the interior, boundary, exterior, closure, and dimension of $T$, respectively. A simple point is a non-empty point-set of dimension zero consisting of only one element. A complex point is a non-empty point-set of dimension zero consisting of a finite number of distinct elements $P_1, \ldots, P_n$. The boundary $\partial P$ of a complex point is empty. As a consequence, the interior $P^\circ$ of a complex point $P$ equals the union of all elements in $P$. A simple line is a closed point-set $L$ such that $\dim(L) = 1$, defined as the image of a continuous mapping $f : [0, 1] \to$ [...]

\[
t_{o_i} = \begin{cases}
\{F_{o_i}, E_{o_i}, V_{o_i}, \partial E_{o_i}, \partial V_{o_i}\} & \text{if } \dim(s_{o_i}) = 2 \\
\{E_{o_i}, V_{o_i}, \partial V_{o_i}\} & \text{if } \dim(s_{o_i}) = 1 \\
\{V_{o_i}\} & \text{if } \dim(s_{o_i}) = 0
\end{cases}
\tag{17}
\]

Notice that, to store $t_{o_i}$, we can also omit some of the subsets from the CTR. Basically, the smaller the number of $T_{S_O}$ components in $t_{o_i}$, the smaller the space occupied in secondary memory by the topological representations (which is advisable in indexing techniques). Nevertheless, as we will see, storing the CTR makes the calculation of the operators more efficient. The basic criterion for selecting the subsets of $T_{S_O}$ components to be stored in $t_{o_i}$ is the following: the selected subsets must suffice so that it is not necessary to resort to the geometric representation $g(s_{o_i})$ of $s_{o_i}$, or to the geometric representation of the $T_{S_O}$ edges, in order to calculate topological operators. The second way of storing the topological representation $t_{o_i}$ of a spatial attribute $s_{o_i}$ is by means of the Reduced Topological Representation, or RTR, defined as follows:

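\[
t_{o_i} = \begin{cases}
\{F_{o_i}, \partial E_{o_i}, \partial V_{o_i}\} & \text{if } \dim(s_{o_i}) = 2 \\
\{E_{o_i}, V_{o_i}, \partial V_{o_i}\} & \text{if } \dim(s_{o_i}) = 1 \\
\{V_{o_i}\} & \text{if } \dim(s_{o_i}) = 0
\end{cases}
\tag{18}
\]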

We may conclude that the RTR is more compact than the CTR. However, when using the RTR, it is necessary to navigate through the topological data structure to calculate some topological operators. To show the completeness of the TDA method, we present in Section 6 algorithms for each one of the $M_{DE+9IM}$ matrix operators, which are easily calculated from both topological representations introduced by the TDA method.
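As a rough illustration of the difference in size between the two representations, the Python sketch below lists the subsets each one stores per dimension and counts them. The string names (`"dE"` and `"dV"` for the boundary edge and vertex sets) are chosen only for this sketch, and the dimension-2 rows assume the five CTR subsets and three RTR subsets mentioned in Section 4.3.

```python
# Subsets stored by each representation, keyed by dim(s_oi).
# "dE" / "dV" stand for the boundary edge / vertex sets (illustrative names).
CTR = {2: ("F", "E", "V", "dE", "dV"), 1: ("E", "V", "dV"), 0: ("V",)}
RTR = {2: ("F", "dE", "dV"), 1: ("E", "V", "dV"), 0: ("V",)}


def subset_count(representation, dim):
    """Number of stored subsets (one B-tree each, as described in Section 4.3)."""
    return len(representation[dim])


def omitted_by_rtr(dim):
    """Subsets the CTR stores for this dimension but the RTR leaves out."""
    return tuple(s for s in CTR[dim] if s not in RTR[dim])
```

Under these assumptions the two representations differ only for attributes of dimension 2, where the RTR leaves out the interior edge and vertex sets.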

4 Storage

In order to store a set $S_O$ of spatial attributes, topologically indexed according to the TDA method, it is necessary to store the topological representations of the spatial attributes in $S_O$ and the topological data structure $T_{S_O}$ associated with $S_O$. In this section we describe how $S_O$ and $T_{S_O}$ are stored. We can say in advance that all descriptors of the data structures used to store $S_O$ and $T_{S_O}$ have a fixed size and are persistently allocated.

4.1 The Multi-Level Architecture of the TDA Method

Figure 1 illustrates the TDA method’s level architecture. It is worth noting that the descriptor of the spatial attribute in the geographic database management system (GDBMS) references the descriptor of the spatial attribute in the topological index, and vice versa. As a consequence, the GDBMS is capable of optimizing topological operators by accessing the spatial attribute’s topological representation, and the TDA method is capable of accessing the geometric representation stored in the GDBMS. The example in Section 4.4 details this architecture.

[Figure 1 diagram. GDBMS: (b) geometric descriptions of spatial attributes. TDA Method: (a) spatial attributes; (c) topological representations of spatial attributes; (d) topological data structure; (e) geometric representations of edges.]

Figure 1: The TDA method’s level architecture.

4.2 Topological Data Structure Storage

To adapt $T_{S_O}$ to secondary memory, the first change is to replace the lists of faces, edges, and vertices with adequate access methods. This change is based on the observation that these lists are accessed through insertions and removals of spatial objects and through queries such as “Retrieve the components in the list of faces (vertices or edges) that intersect a given region or that contain a point.” Thus, even in main memory, it is more adequate to use data structures that group spatial objects close to one another. As these lists are stored in secondary memory, we have chosen to use R*-trees. The geometric representation of a vertex is stored together with the vertex descriptor.

The geometric representation of each edge must be stored by means of an access method for vector data capable of optimizing the geometric algorithms used by the topological data structure, and of efficiently storing the geometric representation of the edge in secondary memory. The SV-tree (static V-tree) proposed by Mediano and others [MCD94] is an access method that meets these requirements. It has an occupancy level of almost 100% in its nodes, and is capable of retrieving, in time $O(N)$, the $N$ points that describe the stored polygonal line.

4.3 Topological Representation Storage

The topological representation $t_{o_i}$ is stored by means of B-trees, each B-tree storing one of the subsets $F_{o_i}$, $E_{o_i}$, $V_{o_i}$, $\partial E_{o_i}$, and $\partial V_{o_i}$. The RTR $t_{o_i}$ has three B-trees when $\dim(s_{o_i}) = 2$ or $\dim(s_{o_i}) = 1$, and just one B-tree when $\dim(s_{o_i}) = 0$. The B-trees store references to the components of $T_{S_O}$. To store the CTR $t_{o_i}$, five B-trees would be needed when $\dim(s_{o_i}) = 2$.

We store an R*-tree associated with the set $S_O$ of spatial attributes topologically indexed by means of the TDA method. Each spatial attribute in $S_O$ has an entry in the R*-tree of $S_O$. Each entry in the $S_O$ R*-tree, relative to a spatial attribute $s_{o_i} \in S_O$, is composed of the bounding box of $g(s_{o_i})$ and a reference to the $s_{o_i}$ descriptor. The descriptor of the spatial attribute $s_{o_i}$ contains a number representing $\dim(s_{o_i})$, a reference to $g(s_{o_i})$ in the GDBMS, and a reference to the $t_{o_i}$ descriptor. The descriptor of the RTR $t_{o_i}$ contains the B-tree descriptors.
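The descriptors just described could be laid out roughly as in the Python sketch below. It is only an in-memory illustration under our own assumptions: the class and field names are hypothetical, references are plain Python values rather than persistent fixed-size records, and the B-tree handles are represented by strings.

```python
from dataclasses import dataclass, field


@dataclass
class RepresentationDescriptor:
    # One entry per stored subset, e.g. "F", "dE", "dV" for an RTR of dimension 2.
    # The values stand in for B-tree descriptors (hypothetical handles).
    btrees: dict = field(default_factory=dict)


@dataclass
class AttributeDescriptor:
    dim: int                                   # dim(s_oi): 0, 1, or 2
    geometry_ref: str                          # reference to g(s_oi) in the GDBMS
    representation: RepresentationDescriptor   # reference to the t_oi descriptor


@dataclass
class IndexEntry:
    bbox: tuple                                # bounding box of g(s_oi)
    descriptor: AttributeDescriptor            # reference to the s_oi descriptor


# Example: an RTR for an attribute of dimension 1 uses three B-trees.
rtr = RepresentationDescriptor({"E": "btree-1", "V": "btree-2", "dV": "btree-3"})
entry = IndexEntry((0.0, 0.0, 4.0, 2.0), AttributeDescriptor(1, "gdbms:attr-17", rtr))
```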

4.4 Examples

Figure 2 details the TDA method’s level architecture using an example with a spatial attribute of dimension 1 whose topological representation is composed of two vertices ($\partial V_{o_i}$) and one edge ($E_{o_i}$).

[Figure 2 diagram. GDBMS: levels (b1) and (b2). TDA Method: levels (a1) R-tree, (a2), (c1), (c2) B-trees, (d) vertex, face, and edge components, (e) SV-trees.]

Figure 2: Example of a spatial attribute of dimension 1 in the TDA method’s level architecture.

We have subdivided the levels (a), (b), and (c) of the TDA method and the GDBMS level that manipulates the geometric representation of the spatial attribute, as follows:

• level (a), of the spatial attributes of the TDA method, is composed of levels (a1), containing the R*-tree that references the descriptors of the topologically indexed spatial attributes, and (a2), containing the descriptors of the TDA method’s spatial attributes;

• level (b), of the geometric representations of spatial attributes stored in the GDBMS, is composed of levels (b1), containing the descriptors of spatial attributes in the GDBMS, and (b2), containing the spatial attributes’ geometric representations;

• level (c), of the TDA method’s topological representations, is composed of levels (c1), containing the topological representation descriptors, and (c2), containing the B-trees that store the references to the components of $T_{S_O}$ that compose the topological representations;

• level (d), of the TDA method’s topological data structure, is composed of (see Section 4.2) the three face, vertex, and edge R*-trees; the face, vertex, and edge descriptors; and other auxiliary descriptors, according to the topological data structure used;

• level (e), of the geometric representations of the edges in the topological data structure, is composed of the SV-trees that store these geometric representations.

The following example illustrates two spatial attributes $s_{o_j}$ and $s_{o_k}$ stored in a set $S_O$ topologically indexed by the TDA method. Figure 3 shows $s_{o_j}$ and $s_{o_k}$. Each attribute consists of one region, and $s_{o_j}$ has a hole. Figure 4 shows $T_{S_O}$ (19) after the insertion of $s_{o_j}$ and $s_{o_k}$. Note that faces $f_3$ and $f_5$ belong to both $t_{o_j}$ (20) and $t_{o_k}$ (21), which means that $\dim(s^\circ_{o_j} \cap s^\circ_{o_k}) = 2$ (Section 6 details the topological operators’ algorithms). Faces $f_0$, $f_2$, and $f_4$ do not belong to any topological representation.

Figure 3: Spatial attributes $s_{o_j}$ and $s_{o_k}$ and their relative position.

\[
T_{S_O} = \begin{cases}
F_{T_{S_O}} = \{f_0, f_1, f_2, f_3, f_4, f_5, f_6\} \\
E_{T_{S_O}} = \{e_1, e_2, e_3, e_4, e_5, e_6, e_7, e_8, e_9\} \\
V_{T_{S_O}} = \{v_1, v_2, v_3, v_4, v_5\}
\end{cases}
\tag{19}
\]

\[
t_{o_j} = \begin{cases}
F_{o_j} = \{f_1, f_3, f_5\} \\
\partial E_{o_j} = \{e_1, e_2, e_3, e_4, e_5\} \\
\partial V_{o_j} = \{v_1, v_2, v_3, v_4, v_5\}
\end{cases}
\tag{20}
\]

\[
t_{o_k} = \begin{cases}
F_{o_k} = \{f_3, f_5, f_6\} \\
\partial E_{o_k} = \{e_6, e_7, e_8, e_9\} \\
\partial V_{o_k} = \{v_1, v_2, v_3, v_4\}
\end{cases}
\tag{21}
\]

[Figure 4 diagram: planar subdivision with faces $f_0$ to $f_6$, edges $e_1$ to $e_9$, and vertices $v_1$ to $v_5$.]

Figure 4: The $T_{S_O}$ topological data structure after the insertion of $s_{o_j}$ and $s_{o_k}$.
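The observation about shared faces in this example can be checked with a plain set intersection. The Python sketch below is only an illustration: the face sets are written as sets of strings, with $f_3$ and $f_5$ shared as stated in the text, while the assignment of $f_1$ and $f_6$ to one attribute each is our reading of the example and should be treated as an assumption.

```python
def shared_faces(faces_a, faces_b):
    """Faces referenced by both topological representations."""
    return set(faces_a) & set(faces_b)


# Face sets of the two attributes in the example (f1 / f6 placement assumed).
F_oj = {"f1", "f3", "f5"}
F_ok = {"f3", "f5", "f6"}

common = shared_faces(F_oj, F_ok)

# A shared face means the two interiors intersect in a region of dimension 2.
# With no shared face this test alone decides nothing, hence None.
interior_dim = 2 if common else None
```

No geometric computation is involved: the test only compares references to components of the topological data structure.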

5 The Insertion and Removal Algorithms

Inserting (or removing) a spatial attribute into (or from) a set $S_O$ of spatial attributes indexed according to the TDA method involves three basic tasks: editing the topological structure, creating (or destroying) the topological representation of the inserted (or removed) spatial attribute, and updating the topological representations of the $S_O$ spatial attributes that were affected by the editing of the topological structure. The Insertion and Removal algorithms of the TDA method are presented in Figures 5 and 6, respectively. Each step of the two algorithms is commented on below.

Step 1  Identify $S''_O \subseteq S_O$
Step 2  For each $s_{o_k} \in S''_O$ do $T^{-1}(t_{o_k})$
Step 3  Edit the topological structure $T_{S_O}$, inserting $g(s_{o_i})$
Step 4  $t_{o_i} = T(g(s_{o_i}), T_{S_O})$
Step 5  $C(t_{o_i}, T_{S_O})$
Step 6  For each $s_{o_k} \in S''_O$ do $t_{o_k} = T(g(s_{o_k}), T_{S_O})$

Figure 5: Algorithm for inserting the spatial attribute $s_{o_i}$ in the set of spatial attributes $S_O$ indexed by the TDA method.
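The control flow of the insertion algorithm in Figure 5 can be sketched in Python as follows. This is a skeleton only: the operators are passed in as callables under hypothetical dictionary keys (`identify`, `destroy`, `edit`, `build`, `count`), and nothing here implements the actual editing of the topological structure.

```python
def tda_insert(s_oi, reps, ops):
    """Run the six steps of Figure 5.

    reps maps each indexed attribute to its topological representation;
    ops is a dict of callables standing in for the operators of the method.
    """
    affected = list(ops["identify"](s_oi))      # Step 1: identify S_O''
    for s_ok in affected:                       # Step 2: T^-1(t_ok)
        ops["destroy"](reps.pop(s_ok))
    ops["edit"](s_oi)                           # Step 3: edit T_SO with g(s_oi)
    reps[s_oi] = ops["build"](s_oi)             # Step 4: t_oi = T(g(s_oi), T_SO)
    ops["count"](reps[s_oi])                    # Step 5: C(t_oi, T_SO)
    for s_ok in affected:                       # Step 6: rebuild each t_ok
        reps[s_ok] = ops["build"](s_ok)
    return reps
```

Passing the operators as parameters keeps the sketch independent of any particular topological data structure.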

5.1 Vertex and Edge Reference Counters

For the editing of $T_{S_O}$ to work properly, it is necessary to modify the topological structure, adding reference counters to the descriptors of edges and vertices. The reference counters are used for the proper removal of vertices and edges, as they allow the identification of vertices and edges that are not referenced in any topological representation. The reference counters are initialized with zero during the application of the Euler operators that create vertices and/or edges. The only exception is the Euler operator that creates a vertex by dividing an edge in two. In this case, the reference counters of the two edges that result from the division have the same value as the counter of the original edge. Operator $C$, in Step 5 of the Insertion algorithm (Figure 5), increments the reference counters of the vertices and edges referenced in the topological representation of the inserted spatial attribute. Operator $C^{-1}$, in Step 3 of the Removal algorithm (Figure 6), decrements the reference counters of the vertices and edges referenced in the topological representation of the removed spatial attribute.

Step 1  Identify $S''_O \subseteq S_O$
Step 2  For each $s_{o_k} \in S''_O$ do $T^{-1}(t_{o_k})$
Step 3  $C^{-1}(t_{o_i}, T_{S_O})$
Step 4  Edit the topological structure $T_{S_O}$, removing from $t_{o_i}$ the unnecessary components
Step 5  $T^{-1}(t_{o_i})$
Step 6  For each $s_{o_k} \in S''_O$ do $t_{o_k} = T(g(s_{o_k}), T_{S_O})$

Figure 6: Algorithm for removing the spatial attribute $s_{o_i}$ from the set of spatial attributes $S_O$ indexed by the TDA method.
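The behaviour of the reference counters can be illustrated with the small Python sketch below. The `Component` class and the function names are hypothetical; the sketch covers only the counter rules stated in Section 5.1 (zero at creation, inheritance when an edge is split, increment by $C$, decrement by $C^{-1}$) and leaves out the Euler operators themselves.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """A vertex or edge descriptor carrying a reference counter."""
    name: str
    refs: int = 0  # counters start at zero when the component is created


def split_edge(edge, new_names):
    """Exception to the zero rule: both halves inherit the original counter."""
    return [Component(name, edge.refs) for name in new_names]


def apply_c(components, delta):
    """Operator C (delta=+1) or C^-1 (delta=-1) over the referenced components."""
    for comp in components:
        comp.refs += delta


def unreferenced(components):
    """Components no topological representation refers to any more."""
    return [c.name for c in components if c.refs == 0]
```

For example, an edge referenced by one attribute and then split yields two halves that are both still counted as referenced once.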

5.2 Editing the $T_{S_O}$ Topological Structure

Let $s_{o_i}$ be a new spatial attribute to be inserted in or removed from $S_O$. Editing the $T_{S_O}$ topological structure, in Step 3 of the Insertion algorithm (Figure 5), consists in editing the planar subdivision stored in $T_{S_O}$ from $g(s_{o_i})$ (for further details on the geometric representation of spatial attributes, see Section 2.4) by applying Euler operators.

Given a spatial attribute $s_{o_i}$ to be inserted in $S_O$, we assume that the self-intersections in the polygonal lines of $g(s_{o_i})$, when $\dim(s_{o_i}) = 1$, are removed in a pre-processing stage by means of geometric algorithms (Preparata [PS85]).

The first step in editing the topological structure is to select the edges that touch or intersect $g(s_{o_i})$ and to generate the intersections between the selected edges and the polygonal lines of $g(s_{o_i})$. Notice that the polygonal lines of $g(s_{o_i})$ are also split. To do so, one must select the edges in $E_{T_{S_O}}$ whose bounding box touches or intersects the bounding box of $g(s_{o_i})$. Then one must split the selected edges (using the respective Euler operator) and the polygonal lines in $g(s_{o_i})$ at the intersection points found, by means of algorithms for intersecting polygonal lines. If $\dim(s_{o_i}) = 0$, the points in $g(s_{o_i})$ that coincide with vertices in $V_{T_{S_O}}$ are removed; if $\dim(s_{o_i}) = 1$ or $\dim(s_{o_i}) = 2$, the polygonal line segments in $g(s_{o_i})$ that coincide with the selected edges are removed. What remains of $g(s_{o_i})$ after the intersections with vertices and edges of $T_{S_O}$, and after the removal of the parts of $g(s_{o_i})$ that coincide with vertices or edges of $T_{S_O}$, is called $g'(s_{o_i})$.

The second step is to create new vertices. The candidates to be new vertices are: when $\dim(s_{o_i}) = 0$, the points of $g'(s_{o_i})$; when $\dim(s_{o_i}) = 1$ or $\dim(s_{o_i}) = 2$, the first and the last points of each open polygonal line of $g'(s_{o_i})$ and any one point of each closed polygonal line of $g'(s_{o_i})$. Before applying the Euler operators that create vertices, one must verify whether a vertex already exists in $V_{T_{S_O}}$ at the same coordinates as the candidate point. The operator is not applied when there is a vertex in the same position (or close enough to it, according to certain tolerance parameters).

The third step is to create edges from the polygonal lines in $g'(s_{o_i})$. For each polygonal line $l_j \in g'(s_{o_i})$, a new edge $e_j$ is created so that $l_j$ is the geometric representation of $e_j$.

During removal, since the effects of the insertion of $g(s_{o_i})$ in $T_{S_O}$ must be undone in order to reduce the size of $T_{S_O}$, the editing of $T_{S_O}$ consists in applying destructive Euler operators. Step 4 of the Removal algorithm (Figure 6) is responsible for editing the topological structure during the removal of $s_{o_i}$ from $S_O$. Given the topological representation $t_{o_i}$ of a spatial attribute $s_{o_i}$ removed from $S_O$, the components that can be removed from $T_{S_O}$ are certainly among the components belonging to $t_{o_i}$, because only these components’ reference counters were decremented. Thus, the first step is to traverse $t_{o_i}$, looking in $\partial E_{o_i}$ and $E_{o_i}$ for the edges that can be removed, and to remove them. The edges that can be removed are those whose reference counters have value 0. The second step is to traverse $t_{o_i}$, looking in $\partial V_{o_i}$ and $V_{o_i}$ for the vertices that can be removed, and to remove them by means of Euler operators. The vertices that can be removed are those whose reference counters have value 0 and those adjacent to exactly 2 different edges. It should be noted that when a vertex adjacent to 2 different edges is removed, these 2 edges are collapsed into one. Note that no geometric computation is made during the removal.

5.3 Creation and Destruction of Topological Representations

Let $s_{o_i}$ be a spatial attribute inserted in or removed from a set indexed by the TDA method. Operator $T$ is responsible for selecting the $T_{S_O}$ components whose interiors are contained in $s_{o_i}$, and for generating the topological representation $t_{o_i}$. Operator $T$ in Step 4 of the Insertion algorithm (Figure 5) creates the topological representation of the inserted attribute $s_{o_i}$. Operator $T$ in Step 6 of the Insertion algorithm (Figure 5) and in Step 6 of the Removal algorithm (Figure 6) recreates the topological representations destroyed in Step 2 of the Insertion algorithm and in Step 2 of the Removal algorithm, respectively. Steps 2 and 6 of both algorithms are part of the topological representation updating strategy and are commented on in Section 5.4. There is a total of 9 possibilities for selecting subsets of $T_{S_O}$ components to create CTRs of spatial attributes:

• $F_{o_i}$, $E_{o_i}$, $V_{o_i}$, $\partial E_{o_i}$, and $\partial V_{o_i}$ when $\dim(s_{o_i}) = 2$;
• $\partial V_{o_i}$, $V_{o_i}$, and $E_{o_i}$ when $\dim(s_{o_i}) = 1$; and
• $V_{o_i}$ when $\dim(s_{o_i}) = 0$.

Remember that when creating RTRs it is not necessary to generate $E_{o_i}$ and $V_{o_i}$ when $\dim(s_{o_i}) = 2$.

The selection of $V_{o_i}$ when $\dim(s_{o_i}) = 0$ is done by looking for the vertices in $V_{T_{S_O}}$ whose geometric representation intersects the bounding box of $g(s_{o_i})$. For each vertex selected, a geometric test is performed to find out whether the vertex belongs to $s_{o_i}$ or not. If it does, the vertex is inserted into $V_{o_i}$; if not, the vertex is discarded, because it belongs to $s^-_{o_i}$.

The selection of $E_{o_i}$ when $\dim(s_{o_i}) = 1$ is done by looking for the edges in $E_{T_{S_O}}$ whose bounding box intersects the bounding box of $g(s_{o_i})$. For each edge selected, a geometric test is performed to find out whether the edge belongs to $s_{o_i}$ or not. If it does, the edge is inserted into $E_{o_i}$; if not, the edge is discarded, because it belongs to $s^-_{o_i}$.

The selection of $V_{o_i}$ and $\partial V_{o_i}$ when $\dim(s_{o_i}) = 1$ is done, from $E_{o_i}$, by selecting the vertices adjacent to the edges of $E_{o_i}$. Vertices adjacent to more than one edge of $E_{o_i}$ belong to $V_{o_i}$; the others belong to $\partial V_{o_i}$.

The selection of $\partial E_{o_i}$ when $\dim(s_{o_i}) = 2$ is made using the same procedure described for selecting $E_{o_i}$ when $\dim(s_{o_i}) = 1$.
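The vertex classification described above for attributes of dimension 1 can be sketched in Python as follows. The representation of an edge as a pair of vertex names and the function name are assumptions made for this illustration; a vertex counts as interior when more than one distinct edge of the selected set is adjacent to it.

```python
from collections import defaultdict


def classify_vertices(edges):
    """Split the vertices adjacent to the given edges into interior and boundary.

    edges: sequence of (start_vertex, end_vertex) pairs, one per edge of E_oi.
    Returns (interior_vertices, boundary_vertices) as two sets.
    """
    adjacent_edges = defaultdict(set)
    for index, (start, end) in enumerate(edges):
        adjacent_edges[start].add(index)
        adjacent_edges[end].add(index)
    # Adjacent to more than one edge of E_oi: interior; otherwise: boundary.
    interior = {v for v, es in adjacent_edges.items() if len(es) > 1}
    boundary = set(adjacent_edges) - interior
    return interior, boundary
```

For a line made of the edges `("v1", "v2")` and `("v2", "v3")`, the middle vertex is classified as interior and the two end vertices as boundary.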

The selection of $\partial V_{o_i}$ when $\dim(s_{o_i}) = 2$ is done, from $\partial E_{o_i}$, by selecting the vertices adjacent to the edges of $\partial E_{o_i}$.

The selection of $F_{o_i}$ when $\dim(s_{o_i}) = 2$ is done from $\partial E_{o_i}$. Let us assume that $T_{S_O}$ has the following property: any edge $e \in E_{T_{S_O}}$ between two faces $f_1$ and $f_2$ stores in its descriptor the information necessary to distinguish which of the two faces is to the right of the orientation of its geometric representation $g(e)$. First, one of the two faces laterally adjacent to each edge in $\partial E_{o_i}$ must be selected. To find out which of the two faces, the one to the right or the one to the left of $g(e_1)$, belongs to $F_{o_i}$, the orientation of $g(e_1)$ must be checked against the orientation of the loop of $g(s_{o_i})$ that matches $g(e_1)$. If the orientations match, the face to the left of $g(e_1)$ belongs to $s_{o_i}$; if not, the face to the right of $g(e_1)$ belongs to $s_{o_i}$. Then the faces adjacent to the faces already selected in $F_{o_i}$ must be recursively selected, but only when the edges that separate the faces do not belong to $\partial E_{o_i}$. The recursion is executed only once for each selected face.

The selection of $E_{o_i}$ (and $V_{o_i}$) when $\dim(s_{o_i}) = 2$ is done by selecting all the edges (and vertices) adjacent to faces in $F_{o_i}$ and removing the edges (and vertices) that belong to $\partial E_{o_i}$ (and $\partial V_{o_i}$).

Operator $T^{-1}$ destroys the topological representation $t_{o_i}$ generated by $T$, that is, $T^{-1}$ frees the persistent objects allocated to store $t_{o_i}$. Operator $T^{-1}$ does not need $T_{S_O}$ as an argument, because it is not necessary to access $T_{S_O}$ to destroy $t_{o_i}$. Operator $T^{-1}$ in Step 5 of the Removal algorithm (Figure 6) destroys the topological representation of the removed attribute $s_{o_i}$. Operator $T^{-1}$ in Step 2 of the Insertion algorithm (Figure 5) and in Step 2 of the Removal algorithm (Figure 6) destroys the topological representations that are recreated in Step 6 of the Insertion algorithm and in Step 6 of the Removal algorithm, respectively.
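The recursive face selection described above amounts to a flood fill over the face adjacency graph that never crosses a boundary edge. The Python sketch below shows the idea with an explicit stack instead of recursion; the `neighbors` mapping (face to pairs of separating edge and adjacent face) and all names are assumptions of this illustration, and the seed faces are taken as given rather than derived from edge orientations.

```python
def select_faces(seed_faces, neighbors, boundary_edges):
    """Grow the face set from the seeds without crossing boundary edges.

    seed_faces: faces already chosen next to the boundary edges.
    neighbors: dict mapping a face to (edge, adjacent_face) pairs.
    boundary_edges: the edges of the boundary set, which must not be crossed.
    """
    selected = set(seed_faces)
    stack = list(selected)
    while stack:
        face = stack.pop()
        for edge, other in neighbors.get(face, ()):
            # Each face is pushed at most once, so it is expanded only once.
            if edge not in boundary_edges and other not in selected:
                selected.add(other)
                stack.append(other)
    return selected
```

The `other not in selected` check plays the role of executing the recursion only once for each selected face.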

5.4 The Topological Representations Update Strategy

Updating the topological representations of spatial attributes in a set $S_O$ of spatial attributes indexed according to the TDA method involves identifying the spatial attributes $S'_O$ whose topological representations have become inconsistent after editing $T_{S_O}$, and updating the topological representations of the attributes in $S'_O$.

During the insertion or removal of a spatial attribute $s_{o_i}$ in or from $S_O$, the TDA method must identify the subset $S'_O$ of affected spatial attributes of $S_O$ whose topological representations have become inconsistent. Initially, a set $S''_O$ such that $S'_O \subseteq S''_O \subseteq S_O$ is selected, formed by the spatial attributes whose geometric representations have bounding boxes that intersect the bounding box of $g(s_{o_i})$. The attributes selected in $S''_O$ are likely to be affected spatial attributes. The option adopted here consists in treating all spatial attributes in $S''_O$ as if they were affected, even though not all of them have necessarily been affected by the editing of $T_{S_O}$. The identification of $S''_O$ is the first step (Step 1) of the Insertion algorithm (Figure 5) and of the Removal algorithm (Figure 6).

The updating strategy adopted in this paper consists in destroying and reconstructing the topological representations of the spatial attributes in $S''_O$ by means of operators $T^{-1}$ and $T$. This strategy is used in Steps 2 and 6 of the Insertion algorithm and in Steps 2 and 6 of the Removal algorithm.
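The bounding-box filter used to identify the candidate set can be sketched in Python as follows. This is a simplification under stated assumptions: boxes are `(xmin, ymin, xmax, ymax)` tuples, touching boxes are treated as intersecting, and a linear scan over a dictionary stands in for the search that the method performs on the R*-tree of the indexed attributes.

```python
def boxes_intersect(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_attributes(new_box, boxes):
    """Attributes whose bounding box intersects the box of the new geometry.

    boxes: dict mapping an attribute name to its bounding box.
    The result is a superset of the attributes that are really affected.
    """
    return [name for name, box in boxes.items() if boxes_intersect(new_box, box)]
```

Because the filter looks only at bounding boxes, it may return attributes whose representations did not actually change; the strategy above accepts that cost and rebuilds all of them.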

6 The Algorithms for the $M_{DE+9IM}$ Matrix Operators

Let us consider:

• $s_{o_i}$ and $s_{o_j}$ as two complex spatial attributes of $S_O$ such that $\dim(s_{o_i}) = 0$, 1, or 2, and $\dim(s_{o_j}) = 0$, 1, or 2;
• $g(s_{o_i})$ and $g(s_{o_j})$ as the geometric representations of $s_{o_i}$ and $s_{o_j}$; and
• $t_{o_i}$ and $t_{o_j}$ as the topological representations of $s_{o_i}$ and $s_{o_j}$.

The computation of the $M_{DE+9IM}(s_{o_i}, s_{o_j})$ matrix operators is traditionally done from $g(s_{o_i})$ and $g(s_{o_j})$. The TDA method proposes to compute the $M_{DE+9IM}(s_{o_i}, s_{o_j})$ matrix operators by means of $t_{o_i}$, $t_{o_j}$, and $T_{S_O}$. We will show that the topological representations $t_{o_i}$ and $t_{o_j}$, together with the topological data structure $T_{S_O}$, are enough to compute the 9 operators of the $M_{DE+9IM}$ matrix. The geometric representations of spatial attributes, edges, and vertices are not used in this computation.

6.1 Possible Results from the Operators Given two spatial attributes soi and soj , Egenhofer and Herring [EH90] proposes 6 groups of operators, one for each of the six 9-intersection matrices, to represent the 9IM method among spatial attributes with dimensions 0, 1, and 2. Clementini and Di Felice have extended the 9IM method by dimension, defining the DE+9IM method. Initially, each one of the 54 operators can return ;1 (when there is no intersection), 0, 1, or 2. However, we can list a series of conditions to restrict the real result possibilities for each of the 9 operators of the six MDE +9IM matrices. Some of these conditions are derived from the conditions presented to reduce the result possibilities for the 9IM method in [EH90].

        

Condition 1 We can notice that dim(soi \ soj ) is smaller than or equal to the smallest value between dim(soi ) and dim(soj ). For instance, if dim(soi ) = 1, we can conclude that any intersection operator involving @soi will have as a result, at most, 0, because dim(@soi ) = 0. Condition 2 Since the existence of infinite spatial attributes is not admitted, we can establish that dim(s;oi \ s;oj ) = 2. Due to this property, it is not necessary to compute the operator dim(s;oi \ s;oj ) for any of the six matrices. Condition 3 If dim(soi ) > dim(soj ), then dim(soi Condition 4 If dim(soi ) ;1 as a result.

= 0,

then @soi

=

\ s;oj ) = dim(soi ).

;. Thus, any intersection operator with @soi will have

Condition 5 In the three matrices whose spatial attributes have the same dimension, the operators to the left of the main diagonal are symmetrical to those to the right. Condition 6 If dim(soi ) = 1 and dim(soj ) = 2 , then dim(soi \ soj ) 6= 0. If dim(soi \ s; oj ) were  equal to 0, then there would be a point belonging to soi isolated in soj , which is impossible.

Condition 7 If $\dim(s_{o_i}) = 1$ and $\dim(s_{o_j}) \geq 1$, then $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j}) \neq 0$. If $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$ were equal to 0, there would be a point belonging to $s^{\circ}_{o_i}$ isolated in $s^{-}_{o_j}$, which is impossible.

Condition 8 If $\dim(s_{o_i}) = 2$ and $\dim(s_{o_j}) = 2$, then $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j}) \neq 0 \wedge \dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j}) \neq 1$. If $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$ were equal to 0 or 1, there would be a point or line belonging to $s^{\circ}_{o_i}$ isolated in $s^{\circ}_{o_j}$, which is impossible.

Condition 9 If $\dim(s_{o_i}) = 2$ and $\dim(s_{o_j}) = 2$, then $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j}) \neq 0 \wedge \dim(s^{\circ}_{o_i} \cap s^{-}_{o_j}) \neq 1$. If $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$ were equal to 0 or 1, there would be a point or line belonging to $s^{\circ}_{o_i}$ isolated in $s^{-}_{o_j}$, which is impossible.

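The six following matrices $R_{\dim(s_{o_i})\dim(s_{o_j})}$ illustrate the possible results of each one of the 9 operators of the $M_{DE+9IM}$ matrix according to the dimensions of $s_{o_i}$ and $s_{o_j}$. For the three symmetric cases ($R_{22}$, $R_{11}$, and $R_{00}$, see Condition 5), only the lower triangle is shown:

$$
R_{22} = \begin{pmatrix}
\{1, 0, -1\} & & \\
\{1, -1\} & \{2, -1\} & \\
\{1, -1\} & \{2, -1\} & \{2\}
\end{pmatrix}
\qquad
R_{21} = \begin{pmatrix}
\{0, -1\} & \{0, -1\} & \{0, -1\} \\
\{1, 0, -1\} & \{1, -1\} & \{1, -1\} \\
\{1, -1\} & \{2\} & \{2\}
\end{pmatrix}
$$

$$
R_{20} = \begin{pmatrix}
\{-1\} & \{-1\} & \{-1\} \\
\{0, -1\} & \{0, -1\} & \{0, -1\} \\
\{1\} & \{2\} & \{2\}
\end{pmatrix}
\qquad
R_{11} = \begin{pmatrix}
\{0, -1\} & & \\
\{0, -1\} & \{1, 0, -1\} & \\
\{0, -1\} & \{1, -1\} & \{2\}
\end{pmatrix}
$$

$$
R_{10} = \begin{pmatrix}
\{-1\} & \{-1\} & \{-1\} \\
\{0, -1\} & \{0, -1\} & \{0, -1\} \\
\{0, -1\} & \{1\} & \{2\}
\end{pmatrix}
\qquad
R_{00} = \begin{pmatrix}
\{-1\} & & \\
\{-1\} & \{0, -1\} & \\
\{-1\} & \{0, -1\} & \{2\}
\end{pmatrix}
$$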
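The first four conditions are mechanical enough to be checked in code. The following Python sketch is illustrative only: the part labels and the functions `part_dim` and `candidates` are names introduced here, not part of the TDA method. It derives the results allowed by Conditions 1 to 4 alone and confirms that the entries of $R_{22}$ fall inside them; Conditions 5 to 9 then prune these candidate sets further.

```python
# Illustrative labels for the three parts of a spatial attribute.
BOUNDARY, INTERIOR, EXTERIOR = "boundary", "interior", "exterior"


def part_dim(part, dim):
    """Largest dimension of one part of a finite attribute of dimension `dim`."""
    if part == BOUNDARY:
        return dim - 1  # a point has an empty boundary, encoded as -1 (Condition 4)
    if part == INTERIOR:
        return dim
    return 2  # the exterior of a finite attribute in the plane is 2-dimensional


def candidates(part_i, dim_i, part_j, dim_j):
    """Results allowed by Conditions 1-4 only (a superset of the R entries)."""
    if part_i == EXTERIOR and part_j == EXTERIOR:
        return {2}  # Condition 2
    if part_i == INTERIOR and part_j == EXTERIOR and dim_i > dim_j:
        return {dim_i}  # Condition 3
    # Condition 1: the result never exceeds the smaller operand dimension.
    upper = min(part_dim(part_i, dim_i), part_dim(part_j, dim_j))
    return set(range(-1, upper + 1))


# Entries of R22, keyed by (part of s_oi, part of s_oj).
R22 = {
    (BOUNDARY, BOUNDARY): {1, 0, -1},
    (BOUNDARY, INTERIOR): {1, -1},
    (BOUNDARY, EXTERIOR): {1, -1},
    (INTERIOR, INTERIOR): {2, -1},
    (INTERIOR, EXTERIOR): {2, -1},
    (EXTERIOR, EXTERIOR): {2},
}

# Every published entry must be a subset of what Conditions 1-4 permit.
for (part_i, part_j), allowed in R22.items():
    assert allowed <= candidates(part_i, 2, part_j, 2)
```

Because $R_{22}$ is symmetric, the orientation of the keys does not matter in this check.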

6.2 The Operators’ Algorithms

To validate the TDA method, this section presents algorithms for each of the operators of the DE+9IM method for both topological representations, RTR and CTR. Figures 7, 8, 9, 10, 11, and 12 show the operators’ algorithms for the matrices $M_{22}$, $M_{21}$, $M_{20}$, $M_{11}$, $M_{10}$, and $M_{00}$ using RTR. Some of the operators’ algorithms for the matrices $M_{22}$, $M_{21}$, and $M_{20}$ access the topological data structure to obtain the adjacency relations between either edges and faces, or vertices and faces. When CTR is used, it is possible to write more efficient algorithms that avoid access to the topological data structure. Figures 13, 14, and 15 show the operators’ algorithms for the matrices $M_{22}$, $M_{21}$, and $M_{20}$ rewritten for CTR.

Algorithm $\dim(\partial s_{o_i} \cap \partial s_{o_j})$
  Step 1: If any component of $\partial E_{o_i}$ belongs to $\partial E_{o_j}$, return 1.
  Step 2: If any component of $\partial V_{o_i}$ belongs to $\partial V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial E_{o_i}$ does not belong to $\partial E_{o_j}$ and is laterally adjacent to any component of $F_{o_j}$, return 1; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $\partial E_{o_i}$ is not laterally adjacent to any component of $F_{o_j}$, return 1; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $F_{o_i}$ belongs to $F_{o_j}$, return 2; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $F_{o_i}$ does not belong to $F_{o_j}$, return 2; otherwise return $-1$.

Figure 7: Algorithms of the operators of the $M_{22}$ matrix for RTR.

Algorithm $\dim(\partial s_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $\partial V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Step 1: If any component of $\partial E_{o_i}$ belongs to $E_{o_j}$, return 1.
  Step 2: If any component of $\partial V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $E_{o_j}$ neither belongs to $\partial E_{o_i}$ nor is laterally adjacent to any component of $F_{o_i}$, return 1; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_j}$ does not belong to $\partial V_{o_i}$ and is laterally adjacent to any component of $F_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $E_{o_j}$ does not belong to $\partial E_{o_i}$ and is laterally adjacent to any component of $F_{o_i}$, return 1; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_j}$ neither belongs to $\partial V_{o_i}$ nor is adjacent to $F_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If no component of $E_{o_j}$ belongs to $\partial E_{o_i}$ nor is laterally adjacent to $F_{o_i}$, return 1; otherwise return $-1$.

Figure 8: Algorithms of the operators of the $M_{21}$ matrix for RTR.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ does not belong to $\partial V_{o_i}$ and is adjacent to any component of $F_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ neither belongs to $\partial V_{o_i}$ nor is adjacent to any component of $F_{o_i}$, return 0; otherwise return $-1$.

Figure 9: Algorithms of the operators of the $M_{20}$ matrix for RTR.

Algorithm $\dim(\partial s_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $\partial V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs neither to $\partial V_{o_j}$ nor to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Step 1: If any component of $E_{o_i}$ belongs to $E_{o_j}$, return 1.
  Step 2: If any component of $V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $E_{o_i}$ does not belong to $E_{o_j}$, return 1; otherwise return $-1$.

Figure 10: Algorithms of the operators of the $M_{11}$ matrix for RTR.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ belongs to $V_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ belongs neither to $\partial V_{o_i}$ nor to $V_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ does not belong to $V_{o_j}$, return 0; otherwise return $-1$.

Figure 11: Algorithms of the operators of the $M_{10}$ matrix for RTR.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial V_{o_i}$ belongs to $V_{o_j}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $V_{o_j}$ belongs to $V_{o_i}$, return 0; otherwise return $-1$.

Figure 12: Algorithms of the operators of the $M_{00}$ matrix for RTR.

Algorithm $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $\partial E_{o_i}$ belongs to $E_{o_j}$, return 1; otherwise return $-1$.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Single step: If any component of $\partial E_{o_i}$ belongs neither to $E_{o_j}$ nor to $\partial E_{o_j}$, return 1; otherwise return $-1$.

Figure 13: Algorithms of the operators of the $M_{22}$ matrix for CTR.

Algorithm $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$
  Step 1: If any component of $E_{o_j}$ belongs to $\partial E_{o_i}$, return $-1$.
  Step 2: If any component of $E_{o_j}$ belongs to $E_{o_i}$, return $-1$; otherwise return 1.

Algorithm $\dim(s^{\circ}_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_j}$ belongs to $V_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $E_{o_j}$ belongs to $E_{o_i}$, return 1; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap \partial s_{o_j})$
  Single step: If any component of $\partial V_{o_j}$ belongs neither to $\partial V_{o_i}$ nor to $V_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If no component of $E_{o_j}$ belongs to $\partial E_{o_i}$ nor to $E_{o_i}$, return 1; otherwise return $-1$.

Figure 14: Algorithms of the operators of the $M_{21}$ matrix for CTR.

Algorithm $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ belongs to $V_{o_i}$, return 0; otherwise return $-1$.

Algorithm $\dim(s^{-}_{o_i} \cap s^{\circ}_{o_j})$
  Single step: If any component of $V_{o_j}$ belongs neither to $\partial V_{o_i}$ nor to $V_{o_i}$, return 0; otherwise return $-1$.

Figure 15: Algorithms of the operators of the $M_{20}$ matrix for CTR.

The operators of the $M_{DE+9IM}$ matrix can be divided into three groups: group R, with the operators for RTR (and CTR) that do not access the topological data structure; group RT, with the operators for RTR that access the topological data structure; and group C, with the operators from group RT rewritten for CTR. When CTR is not used (in this option, $E_{o_i}$ and $V_{o_i}$ are not stored when $\dim(s_{o_i}) = 2$), the lack of $E_{o_i}$ and $V_{o_i}$, which are used in the algorithms of group C, is compensated by navigating through the topological data structure in the algorithms of group RT. The algorithms of groups R and RT cover all operators of the $M_{DE+9IM}$ matrix for RTR, and the algorithms of groups R and C cover all operators of the $M_{DE+9IM}$ matrix for CTR. Since none of the presented algorithms uses the geometric representation of spatial attributes, or of the edges and vertices of the topological data structure, it follows that neither CTR nor RTR needs geometric representations to compute topological operators.
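To make the difference between the groups concrete, the following Python sketch transcribes the $M_{22}$ operators of Figure 7 (RTR) and Figure 13 (CTR) over sets of identifiers. It is a minimal illustration under stated assumptions: the class `Topo2D`, its field names, the function names, and the dictionary `faces_of_edge` (a stand-in for navigating the topological data structure to find the faces laterally adjacent to an edge) are all hypothetical and introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topo2D:
    """Identifier sets of a 2-D attribute (field names are illustrative)."""
    faces: frozenset                          # F_o
    boundary_edges: frozenset                 # boundary E_o
    boundary_vertices: frozenset              # boundary V_o
    interior_edges: frozenset = frozenset()   # E_o, assumed stored only for CTR


def dim_bb(a, b):
    """Boundary/boundary operator, Figure 7 (two steps, group R)."""
    if a.boundary_edges & b.boundary_edges:
        return 1
    return 0 if a.boundary_vertices & b.boundary_vertices else -1


def dim_bi_rtr(a, b, faces_of_edge):
    """Boundary/interior operator for RTR, Figure 7 (group RT)."""
    for e in a.boundary_edges:
        if e not in b.boundary_edges and faces_of_edge[e] & b.faces:
            return 1
    return -1


def dim_be_rtr(a, b, faces_of_edge):
    """Boundary/exterior operator for RTR, Figure 7 (group RT)."""
    return 1 if any(not (faces_of_edge[e] & b.faces) for e in a.boundary_edges) else -1


def dim_bi_ctr(a, b):
    """Boundary/interior operator for CTR, Figure 13 (group C)."""
    return 1 if a.boundary_edges & b.interior_edges else -1


def dim_be_ctr(a, b):
    """Boundary/exterior operator for CTR, Figure 13 (group C)."""
    return 1 if a.boundary_edges - b.interior_edges - b.boundary_edges else -1


def dim_ii(a, b):
    """Interior/interior operator, Figure 7 (group R)."""
    return 2 if a.faces & b.faces else -1


def dim_ie(a, b):
    """Interior/exterior operator, Figure 7 (group R)."""
    return 2 if a.faces - b.faces else -1


# Two squares that share the single edge e4; f0 is the outer face.
faces_of_edge = {
    "e1": {"f0", "f1"}, "e2": {"f0", "f1"}, "e3": {"f0", "f1"},
    "e4": {"f1", "f2"},
    "e5": {"f0", "f2"}, "e6": {"f0", "f2"}, "e7": {"f0", "f2"},
}
a = Topo2D(frozenset({"f1"}), frozenset({"e1", "e2", "e3", "e4"}),
           frozenset({"v1", "v2", "v3", "v4"}))
b = Topo2D(frozenset({"f2"}), frozenset({"e4", "e5", "e6", "e7"}),
           frozenset({"v3", "v4", "v5", "v6"}))

rtr = (dim_bb(a, b), dim_bi_rtr(a, b, faces_of_edge),
       dim_be_rtr(a, b, faces_of_edge), dim_ii(a, b), dim_ie(a, b))
ctr = (dim_bb(a, b), dim_bi_ctr(a, b), dim_be_ctr(a, b), dim_ii(a, b), dim_ie(a, b))
```

For these two adjacent squares both variants yield the same five results, but only the RTR functions consult `faces_of_edge`; no coordinates are read in either case.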

6.3 Complexity Analysis

The algorithms of groups R and C consist of one or two occurrences of the generic step presented in Figure 16. Suppose that X and Y have the same number of components, N. Given that X and Y are ordered, the complexity of the algorithms of groups R and C and, consequently, of all algorithms for CTR, is $O(N)$.

  Single step: If {any | no} component of X [does not] belong(s) to Y, return {2 | 1 | 0}, [otherwise return $-1$]

Figure 16: Generic step used in the algorithms of operators in groups R and C.

The algorithms in group RT consist of an occurrence of one of the two steps presented in Figure 17. The adjacencies used in the algorithms of group RT are between either edges and faces, or vertices and faces. In the first step (Step A), one checks, for every component c of X, whether the components adjacent to c belong to Z. The second step (Step B) differs from the first in that it must check whether the component belongs (or not) to set Y before checking whether its adjacencies belong to Z.

  Step A: If {any | no} component of X is [not] adjacent to {any | no} component of Z, return {2 | 1 | 0}, [otherwise return $-1$]
  Step B: If {any | no} component of X [does not] belong(s) to Y and is [not] adjacent to {any | no} component of Z, return {2 | 1 | 0}, [otherwise return $-1$]

Figure 17: Two generic steps used in the algorithms of operators in group RT.
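The linear bound for the generic step of Figure 16 follows from scanning two ordered sequences in a single pass. The Python sketch below is one possible way to do this; the function names and the keyword parameters `quantifier` and `negated` are introduced here for illustration and do not come from the paper.

```python
def any_common(xs, ys):
    """True if the sorted sequences xs and ys share an element (one merge pass)."""
    i = j = 0
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            return True
        if xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return False


def any_missing(xs, ys):
    """True if some element of sorted xs is absent from sorted ys (one merge pass)."""
    j = 0
    for x in xs:
        while j < len(ys) and ys[j] < x:
            j += 1
        if j == len(ys) or ys[j] != x:
            return True
    return False


def generic_step(xs, ys, value, quantifier="any", negated=False):
    """'If {any|no} component of X [does not] belong(s) to Y, return value, otherwise -1.'"""
    hit = any_missing(xs, ys) if negated else any_common(xs, ys)
    if quantifier == "no":
        hit = not hit
    return value if hit else -1
```

Each index only moves forward, so a call visits at most len(xs) + len(ys) elements, which is the $O(N)$ behaviour stated above when both sequences hold N components.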

Once again, let us suppose that X, Y, and Z have the same number of components, N, and that the number of faces adjacent to a vertex is never greater than a constant K. Note that the overlap of two natural maps rarely contains more than two lines crossing at the same point, that is, more than 4 faces adjacent to a vertex. Given that Y and Z are ordered, the search for a component c of X in Y has complexity $O(\log N)$, and the search in Z for the components adjacent to c also has complexity $O(\log N)$. Since these searches are repeated for each of the N components of X, the complexity of the algorithms in group RT is $O(N \log N)$. The complexity of the algorithms for RTR is the sum of the complexities of the algorithms in groups R and RT: $O(N) + O(N \log N) = O(N \log N)$.
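The $O(N \log N)$ bound for group RT can be seen in a small sketch of Step B from Figure 17, in its "any component of X does not belong to Y and is adjacent to any component of Z" form. This is an illustration under assumptions: `contains`, `step_b`, and the `adjacent` callback (standing in for navigation through the topological data structure, and assumed to return at most K identifiers) are hypothetical names, and Python's standard `bisect` module supplies the binary search.

```python
from bisect import bisect_left


def contains(sorted_ids, x):
    """Binary search for x in a sorted list: O(log N)."""
    k = bisect_left(sorted_ids, x)
    return k < len(sorted_ids) and sorted_ids[k] == x


def step_b(xs, ys, zs, adjacent, value):
    """Return value if some c in xs is not in ys and has a neighbour in zs, else -1."""
    for c in xs:                      # N iterations
        if contains(ys, c):           # O(log N) membership test in Y
            continue
        # At most K neighbours, each looked up in Z in O(log N).
        if any(contains(zs, f) for f in adjacent(c)):
            return value
    return -1


# Edges 1-7 and faces 100 (outer), 101, 102; edge 4 separates faces 101 and 102.
adjacency = {1: [100, 101], 2: [100, 101], 3: [100, 101], 4: [101, 102],
             5: [100, 102], 6: [100, 102], 7: [100, 102]}

# Boundary of attribute i against an attribute j that only touches it along edge 4.
touching = step_b([1, 2, 3, 4], [4, 5, 6, 7], [102], adjacency.__getitem__, 1)

# Attribute j now covers both faces, so edge 4 lies inside j.
covered = step_b([1, 2, 3, 4], [1, 2, 3, 5, 6, 7], [101, 102], adjacency.__getitem__, 1)
```

With K bounded by a constant, each of the N iterations costs $O(\log N)$, which gives the stated total.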


7 Tests with Geographic Data

The data used in the tests were generated from 26 vegetation maps of the Legal Amazon, courtesy of the National Institute for Space Research (INPE). From the polygonal lines of the vegetation maps, 26 sets of geometric representations of dimension 2 were generated. For the purposes of the tests, we consider each geometric representation obtained as the geometric representation of a spatial attribute. Two groups of tests were run. In the first group, the 26 maps were inserted side by side, just as they occur in geographic space, into a single set indexed by the TDA method. In the second group, 6 maps of a rectangular area were overlapped. The topological data structure used is the Half-Edge [Män88].

The graph in Figure 18 shows, on the Y axis, the average number of identifiers in the topological representations for the first test group (N1) and for the second test group (N2) and, on the X axis, the number of attributes in $S_O$ after the insertion of each set of spatial attributes. We noticed that another consequence of the increased overlap between spatial attributes is a rise in the average number of identifiers in the topological representations. In the first test group, in which there was no overlap between the spatial attributes, N1 remained around 17 identifiers, while in the second test group, after the insertion (overlap) of the sixth set of spatial attributes, N2 reached 180 identifiers.


Figure 18: Average number of identifiers in the topological representations, N1 and N2 (vertical axis), against the number of attributes in $S_O$ (horizontal axis), after inserting each set of spatial attributes from the first and the second test group, respectively.

To perform the tests with topological operators, we modified the operators’ algorithms. Before the computation of each operator, all of the topological representations to be used by the operators were copied to a vector, so that all searches made in B-trees were replaced by binary searches. Even though this modification has no impact on the complexity analysis of the operators’ algorithms, the operators’ computation became 3 to 4 times faster than the computation using B-trees. After the insertion of each set of spatial attributes, up to 10000 operations were performed between pairs of spatial attributes whose bounding boxes intersected. These operations were the following:

 

• Id1 - average time taken to read the geometric representation of both spatial attributes;

• Id2 - average time taken to read the geometric representation of both spatial attributes and to sort their points along the X axis;

• $M_{DE+9IM}$ - average time taken to calculate the 8 topological operators of the $M_{DE+9IM}$ matrix (keeping in mind that $\dim(s^{-}_{o_i} \cap s^{-}_{o_j}) = 2$); and

• the average time taken to compute each of the operators $\dim(\partial s_{o_i} \cap \partial s_{o_j})$, $\dim(\partial s_{o_i} \cap s^{\circ}_{o_j})$, $\dim(\partial s_{o_i} \cap s^{-}_{o_j})$, $\dim(s^{\circ}_{o_i} \cap s^{\circ}_{o_j})$, and $\dim(s^{\circ}_{o_i} \cap s^{-}_{o_j})$.

The greater the number of points per edge, the smaller the topological representations are relative to the geometric representations, and hence the faster the topological operators are computed using topological representations. It should be noted that the greater the overlap between spatial attributes, regardless of the number of points in the geometric representations, the larger the topological representation and the worse the performance of the TDA method.

The graph in Figure 19 shows the average time spent executing the above operations (vertical axis) against the number of spatial attributes (horizontal axis) for the first group of tests. The most important result in this graph is that all of the operators’ algorithms using RTR are faster than simply reading the spatial attributes’ geometric representations.


Figure 19: Average time taken to compute topological operators (vertical axis, logarithmic scale) against the number of spatial attributes inserted (horizontal axis), after the insertion of each set of spatial attributes in the first group of tests.

The graph in Figure 20 shows the average time spent executing the same operations for the second group of tests. The main difference between the two graphs is that, as the size of the RTR increases due to the overlap of the spatial attributes, the performance of the topological operators decreases.

Figure 20: Average time taken to compute topological operators (vertical axis, logarithmic scale) against the number of spatial attributes inserted (horizontal axis), after the insertion of each set of spatial attributes in the second group of tests.

8 Conclusions and Future Work

In this work we presented a method capable of topologically indexing complex spatial attributes. Although the TDA method is extensive, we emphasized throughout the text that it is based on formalisms such as point-set topology and topological data structures, which are concepts used in geometric modeling.

We presented two kinds of topological representations of spatial attributes: RTR and CTR. We concluded, from the definition of both representations, that RTR is more compact than CTR. We showed that the complexity of the topological operators of the $M_{DE+9IM}$ matrix for CTR, $O(N)$, is lower than the complexity of the same operators for RTR, $O(N \log N)$. Tests for RTR show that the larger the overlap among the indexed spatial attributes, the worse the performance of the TDA method’s algorithms, because the size of the topological representations is directly proportional to the overlap among the spatial attributes. That is, the smaller the topological representations are relative to the geometric representations, the better the TDA method performs.

Concerning future work, we suggest investigating, among others, grouping and cache techniques for the TDA method, and special concurrency control techniques to allow good performance and concurrency during simultaneous accesses to the index.

References

[BHKS93] T. Brinkhoff, H. Horn, H. Kriegel, and R. Schneider. A Storage and Access Architecture for Efficient Query Processing in Spatial Database Systems. In 3rd Int. Symp. on Large Spatial Databases, pages 357–376, 1993.

[BK94] T. Brinkhoff and H. Kriegel. The Impact of Global Clustering on Spatial Database Systems. In Proceedings of the 20th VLDB Conference, September 1994.

[BKSS90] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger. The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 322–332, May 1990.

[CCH+96] G. Câmara, M. Casanova, A. Hemerly, G. Magalhães, and C. Medeiros. Anatomia de Sistemas de Informação Geográfica. 10a Escola de Computação, 1996.

[CF95] E. Clementini and P. Di Felice. A Comparison of Methods for Representing Topological Relationships. Information Sciences, 3(3):149–178, 1995.

[CF96] E. Clementini and P. Di Felice. A Model for Representing Topological Relationships Between Complex Geometric Features in Spatial Databases. Information Sciences, 90(1-4):121–136, 1996.

[CFvO93] E. Clementini, P. Di Felice, and P. van Oosterom. A Small Set of Formal Topological Relationships Suitable for End-User Interaction. In Proc. SSD, pages 277–295, 1993.

[EF91] M. Egenhofer and R. Franzosa. Point-set topological spatial relations. International Journal of Geographical Information Systems, 5(2):161–174, 1991.

[Ege93] M. Egenhofer. Definitions of Line-Line Relations for Geographic Databases. Data Engineering, 16(6):40–45, 1993.

[EH90] M. Egenhofer and J. Herring. Categorizing Binary Topological Relations Between Regions, Lines, and Points in Geographic Databases. Technical report, University of Maine - NCGIA, 1990.

[GG98] V. Gaede and O. Günther. Multidimensional Access Methods. ACM Computing Surveys, 30(2):170–231, June 1998.

[Gut84] A. Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 47–56, June 1984.

[Män88] M. Mäntylä. Solid Modelling. Computer Science Press, 1988.

[Mar89] L. Martha. Topological and geometrical modelling approach to numerical discretization and arbitrary fracture simulation in three-dimensions. PhD thesis, Cornell University, August 1989.

[MCD94] M. Mediano, M. Casanova, and M. Dreux. V-tree - A Storage Method for Long Vector Data. In Proceedings of the 20th VLDB Conference, September 1994.

[NW98] J. Nievergelt and P. Widmayer. Spatial Data Structures: Concepts and Design Choices, pages 153–197. Springer-Verlag, 1998.

[PS85] F. Preparata and M. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1985.

[Saa98] A. Saalfeld. Sorting Spatial Data for Sampling and Other Geographic Applications. GeoInformatica, 2(2):37–57, 1998.