Approaching the Issue of Information Loss in Geographic Data Transfers¹

Werner Kuhn
Department of Geoinformation
Technical University Vienna
Gusshausstrasse 27-29/127
A-1040 Vienna (Austria)
Email: [email protected]

Abstract

Geographic data are now being transferred on a daily basis within and between organizations worldwide. The low-level technical problems of data transmissions are largely solved and significant progress on transfer formats is being made by standardization bodies. However, an issue that had long been considered minor is turning into a can of worms: the problem of transferring geographic information, i.e., the meaning of the data and not just the data themselves. This paper proposes to approach this problem algebraically, in three steps: (1) to formalize data models as algebras, describing data and operations together; (2) to describe mappings between models as morphisms, formalizing the transfer functions; and (3) to model invariant properties as homomorphisms, identifying everything else as information loss. An outline of this approach is presented, applied to a detailed example, and discussed with respect to its practicality.

Keywords: Geographic data transfers, semantics, information loss, formal methods.

1. INTRODUCTION

Collections of geographic data are, by nature, potentially useful to several parties. Since a given part of space is generally occupied in various forms by a variety of people and organizations, these parties share interests in the same space. For example, a land parcel has properties which are of interest not only to its owner, but also to the tax authorities, the water, gas, telephone, and electricity companies, the postal service, and so on. However, those sharing an interest in some piece of land cannot yet effectively share information about it. While geographic data transfers have become a reality, they are still severely restricted by technical and organizational problems (Frank 1992). Improving the support for geographic data transfers and data sharing has therefore become a primary concern in today's theory and practice of geographic information systems (GIS).

In addition to the shared nature of space usage, GIS applications increasingly transcend boundaries of organizations, local authorities, and even nations. In particular, two of the fastest growing GIS application sectors, namely, transportation and environment, typically involve several administrative and linguistic communities. The resulting communication needs across language boundaries pose major challenges. These lie not only at the superficial terminological level, but involve the deeper cognitive issues of how people from different cultural backgrounds think about space and express their thinking in natural and technical languages (Mark and Frank 1991).

Thirdly, geographic data are characterized by particularly high acquisition and maintenance costs which impose an imperative need to use the data in multiple ways and to cooperate in their collection and maintenance. Together with the rapidly growing needs of society to effectively manage our environment, this difficult, costly, and time-consuming nature of acquiring and handling geographic data has created a strong economic and social push to share geographic information (Onsrud and Rushton 1992).

In a larger context, this social evolution raises important economic and legal issues. These are generally more crucial to successful data sharing than the technology involved. Yet many challenging cognitive and technical issues beyond hardware connections, software compatibility, and data formats remain unresolved. They mostly concern the semantics of geographic data and how to preserve these in a transfer.

This paper addresses the problem of describing semantics in order to formally assess the potential loss of information in geographic data transfers. After a review of the current state of data transfer standards (section two) and a description of the problem of information loss (section three), an algebraic approach to this problem is proposed (section four) and applied to an example of a data transfer (section five).

¹ Support from Intergraph Corporation and from the Technical University Vienna is gratefully acknowledged. Andrew Frank contributed to the ideas expressed and provided helpful comments on paper drafts.

2. GEOGRAPHIC DATA TRANSFER STANDARDS

Considerable progress over the past decade toward improved data transfer mechanisms has helped to establish a substantial geographic data market. Spatially referenced data have become an important commodity and a rapidly growing number of companies are making a living entirely or largely on adding value to geographic data and reselling them in custom-made form. This development is made possible by an expanding digital data infrastructure at national and international levels, the most popular media being the CD-ROM for mass data distribution and the Internet for rapid worldwide communication.

The special nature of spatial data, however, calls for data transfer support beyond storage and packaging. This has led to a flurry of national, regional, and international activities aiming at the standardization of spatial data formats (Lee and Coleman 1990). Almost every developed country has its national spatial data standardization committee, and there are substantial efforts going on at the European (CEN Technical Committee 287, "Geographic Information", and Technical Committee 278, "Road Maps"), North American (SDTS, SAIF, OGIS), and international (NATO, ISO, IHO, ICA) levels. A survey of current organizations and standards as well as references to literature on them can be found in (Cassettari 1993).

The more recent undertakings clearly indicate a trend away from standardizing simple data formats toward common spatial data definition languages and reference models (Kuhn 1991). This has proven to be a very challenging and time-consuming task, delaying the production of practically usable transfer standards. At the same time, the rapidly growing supply and demand of spatial data requires applicable standards now. This situation has made it necessary to rely on formats like DXF or PostScript. These de facto standards are graphics oriented rather than supporting the complex semantics of geographic applications. They can satisfactorily support the transfer of simple data layers, covering one theme at a time. However, they lead to an undesirable loss of information in more complex transfers where the interdependence of various data layers is essential or where complex topological data structures are involved.

Applying such low-level graphics standards in geographic applications, for lack of any better solutions, has revealed the need for means to deal with semantics. At the same time, the ongoing work on designing and implementing geographic standards has raised a plethora of semantic issues. The combined effect of these developments is the realization that transferring spatial data is not (yet) the same as sharing geographic information. In order to bring these two closer together, more research is needed, particularly in the area of semantics. However, standardization bodies are not equipped to do research and need input from outside. This paper is an attempt to bridge this gap between data transfers and information sharing. It takes a data modeling perspective and extends formal methods from that area to the description of semantics and information loss in data transfers.

3. INFORMATION LOSS IN DATA TRANSFERS

Viewed from a database perspective, information sharing is an issue of data models, extending the process of data modeling beyond the level of individual systems. Avoiding information loss in a transfer would therefore amount to the design of lossless mappings between two data models (Ullman 1988). To reduce the necessary number of conversions, one would standardize a transfer model to which all others can be mapped.

However, the notion of a single objective real world out there, for which we just have to find a universal spatial data model so that everybody can exchange data by converting them to that model, has proven illusory. People see different things in the world based on different organizational or cultural "world views" (as the name implies). For example, land use is often classified quite differently by statistical and cadastral agencies of the same administrative units. Even the same people working on different tasks need different models of what might appear to be the same "reality". For example, a road has very different aspects for one and the same person who plans a trip at home, walks along and across some roads to her car, and then drives on the road. Or, consider the problem of describing a location for the following purposes: explaining where it is, going there, driving to it, flying there, attacking or protecting it (Salgé 1993).

This dependence of spatial data models on cultures and tasks creates a major problem for standardization and for geographic data transfers. It leads to information loss when data are being transferred without the possibility of capturing their interpretation and the tasks for which they are suitable.

When data are separated from their processing environment in order to transfer them, the information modeled by operations is generally lost. This, however, is today's normal data transfer scenario. A typical example is the separation of digital terrain models (DTMs) from their associated interpolation methods. The decision as to which parts of information are modeled as data and which as operations is largely arbitrary from the perspective of system use and information transfers. It depends on system internals such as the chosen programming languages and data structures. For example, most GIS vary in their internal handling of topological relationships, with some of them storing topology explicitly and others deriving it from metric data. Customers are generally not concerned with these internals and consequently know little about the information contained in the data alone. Thus, not only is information lost (which is what we expect), but it is difficult to assess what information is lost in a transfer.

The common emphasis on data transfers is actually quite misleading. A customer is, in general, interested in obtaining information and should not have to be concerned with the form (the data) in which that information is being transferred. The crucial question for the receiver of a data set is what questions the data are capable of answering. Consequently, one faces a "truth in labeling" issue for data (Kottman 1992). If data are shipped without explanations of what they can be used for, all sorts of technical and legal problems can ensue.

A data modeling perspective alone is therefore not sufficient. The approach presented here is based on an understanding of data transfers as communication. Any successful communication requires a shared conceptual basis among the partners, or at least means to establish such a basis from a minimal set of common concepts ("bootstrap"). It is quite obvious that different world views inhibit communication in natural language conversations. The fact that data transfers are affected by the same difficulty becomes apparent when geographic information transfers fail due to different understandings of concepts and tasks in different national or organizational cultures.

4. AN ALGEBRAIC APPROACH

4.1. Abstract Data Types

Our goal is to find a way to specify what information is preserved and what is lost in a specific data transfer. To get information from data, i.e., to answer a question from physical signs, an intermediary step of interpretation is needed. This interpretation can be performed by a human being (e.g., when reading a map), or by a machine (e.g., when manipulating a data collection by applying suitable operations). In both cases, the interpretation consists in the use of data within a certain context. The use is characterized by the operations (mental or computational) performed on the data and the context reflects the application domain and culture (organizational, national) where the use takes place.

The need to model information content independently of its representation as data was recognized long ago, independently of data transfers. It led to the concept of data abstraction (Liskov and Guttag 1986; Parnas 1972) which, in turn, has become the central ingredient of object orientation (Khoshafian and Abnous 1990). The essence of data abstraction (as opposed to procedural abstraction) is to describe data by the operations which can be performed on them, as so-called abstract data types. Following the above reasoning on information loss and the need to consider data together with operations, it seems a natural strategy to apply the concept of data abstraction to data transfers.

4.2. Algebraic Specifications

The situation in data transfers is different from that in databases or programming languages, where data and operations can effectively be packed together in a single processing environment. Applying this strategy to data transfers would mean that with every transfer, all operations possibly working on the transferred data should be transferred. This is clearly impractical, as it would mean that data held in a system can only be transferred together with program code of that system.

An alternative data abstraction approach to geographic data transfers is proposed in the remainder of this paper. It is based on a formal software engineering method which suits data abstraction and has successfully been applied to the design of large software systems: the method of algebraic specifications for abstract data types (Ehrich and others 1989; Guttag and others 1978).

Applying algebra to data modeling provides a mathematically sound basis for the idea of describing the semantics of data in terms of their operations. An algebra is a collection of objects and associated operations. For example, the algebra of natural numbers could be described as the collection of objects labeled '1', '2', etc., and the operations of addition and multiplication. An algebra of land parcels would consist of the parcels as objects and operations for splitting and joining them as well as for computing areas and perimeter lengths.

An algebraic approach to the definition of geographic data semantics has been proposed and discussed in (Kuhn 1994). It addresses the semantic problems resulting from the relationship between a single data model and a user's view of reality. Examples of algebraic specifications for geographic data types can be found there. The example presented below will show how the same technique is used in an algebraic model of information loss in data transfers.

Functional languages (Hudak 1989) can be used to write algebraic specifications for abstract data types. Such specifications can then be syntax checked and executed. The functional language serves as an interactive tool to specify the semantics of data models independently of how this behavior is implemented in a system. The example in section five uses the functional programming language Gofer (Jones 1994; Thiemann 1994) for this purpose.

4.3. Mappings between Algebras

Once a data model is understood and described as an algebra, it becomes possible to formalize relationships or mappings between data models (Herring and others 1990). This is where the algebraic approach becomes useful for data transfers. Since transferred data go from one data model to another (generally passing through a third model, that of the transfer standard), a transfer can be described as a mapping between algebras. In mathematics, such mappings are called morphisms and are studied in category theory (Barr and Wells 1990). When transfers are seen as morphisms, their characteristics can be formally described. In particular, it can be succinctly stated what information is preserved or lost.

The special kind of morphism useful for assessing information loss is the homomorphism. A homomorphism is characterized by producing the same results whether an operation of the first algebra is performed and its results are mapped to the second or the arguments are mapped and then the corresponding operation of the second algebra is applied. For example, the utility of logarithms is based on the homomorphism that they establish between multiplication and addition on real numbers: log (a * b) = log a + log b. Here, the first algebra is that of positive real numbers under multiplication (function f in figure 1) and the second is that of real numbers under addition (function g).

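[Figure 1: diagram with four domains and four mappings. h1: A1 -> B1 leads from Domain A1 to Domain B1, f: A1 -> A2 from Domain A1 to Domain A2, g: B1 -> B2 from Domain B1 to Domain B2, and h2: A2 -> B2 from Domain A2 to Domain B2.]

Homomorphism: g (h1 (a1)) = h2 (f (a1))

Fig. 1. A homomorphism between two algebras A and B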
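The logarithm example can be checked mechanically. The following Haskell sketch (Haskell being a close relative of the Gofer used in section five) encodes f, g, h1, and h2 as in figure 1 and tests the homomorphism condition on a few sample pairs. The names commutes, samples, and allCommute, the sample values, and the tolerance are illustrative assumptions introduced here.

```haskell
-- f multiplies a pair of positive reals (first algebra),
-- g adds a pair of reals (second algebra).
f :: (Double, Double) -> Double
f (a, b) = a * b

g :: (Double, Double) -> Double
g (u, v) = u + v

-- h1 maps argument pairs, h2 maps single results; both apply the logarithm.
h1 :: (Double, Double) -> (Double, Double)
h1 (a, b) = (log a, log b)

h2 :: Double -> Double
h2 = log

-- Floating point arithmetic is inexact, so the condition
-- g (h1 p) = h2 (f p) is checked up to a small tolerance
-- (the value 1e-9 is an arbitrary choice).
commutes :: (Double, Double) -> Bool
commutes p = abs (g (h1 p) - h2 (f p)) < 1e-9

-- Hypothetical sample pairs of positive reals.
samples :: [(Double, Double)]
samples = [(1, 1), (2, 3), (0.5, 8), (10, 100)]

allCommute :: Bool
allCommute = all commutes samples
```

Such a test cannot prove the homomorphism for all reals; it only confirms that the condition holds on the chosen samples.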

The logarithm example shows that one needs to specify the operation(s) a homomorphism applies to. This is precisely what we want to do for data transfers: describe which operations are preserved by the mapping and which are not. Thus, a homomorphism is always sought for certain data and operations only, not for the entire database. In order to make the diagram in figure 1 consistent, the two mappings h1 and h2 constituting the homomorphism should be interpreted as the logarithm function applied to pairs of real numbers and to single real numbers, respectively. The functions f and g represent multiplication and addition of pairs of real numbers. Ignoring, for the sake of simplicity, the intermediate data model of the transfer standard, we obtain the algebraic view of a data transfer presented in figure 2.

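[Figure 2: the diagram of figure 1 applied to a transfer from Data Model A to Data Model B. h1: A1 -> B1 leads from Domain A1 to Domain B1, f: A1 -> A2 from Domain A1 to Domain A2, g: B1 -> B2 from Domain B1 to Domain B2, and h2: A2 -> B2 from Domain A2 to Domain B2.]

Homomorphism: g (b1) = h2 (a2)
Information Loss: g (b1) ≠ h2 (a2)

Fig. 2. Homomorphism and information loss between two data models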

In this view, the transfer function h from data model A to data model B is the crucial element for a formalization of the data transfer. This function maps data and operations from one model to the other. Mathematically, operations are relations, i.e., predicates of the form f (a1) and g (b1). To each operation f in data model A corresponds an operation g in B. If this correspondence satisfies the homomorphism condition, operation f is preserved in the transfer and no information is lost. In practice, one is interested only in the subset of domain A and its operations that undergoes a transfer:

h: (Ai, fj) -> (Bi, gj)

In other words, the homomorphism will generally be a mapping of parts of A into B, establishing a correspondence for the actually transferred data and the operations of interest applied to them.
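Restricting the homomorphism to the transferred part can be sketched as a generic Haskell function that takes the two parts of the transfer function, the two corresponding operations, and the list of transferred values, and returns the values for which the condition fails. The names violations, preserved, and example are hypothetical, and the toy operations in example (testing evenness before and after halving integers) are chosen only to show a mapping that does not preserve an operation.

```haskell
-- Values for which the homomorphism condition g (h1 a) == h2 (f a) fails,
-- restricted to the values that are actually transferred.
violations :: Eq b2
           => (a1 -> b1)   -- h1: transfer of arguments
           -> (a2 -> b2)   -- h2: transfer of results
           -> (a1 -> a2)   -- f: operation in data model A
           -> (b1 -> b2)   -- g: corresponding operation in data model B
           -> [a1]         -- transferred values of domain A1
           -> [a1]
violations h1 h2 f g xs = [a | a <- xs, g (h1 a) /= h2 (f a)]

-- The operation f is preserved if no transferred value violates the condition.
preserved :: Eq b2
          => (a1 -> b1) -> (a2 -> b2) -> (a1 -> a2) -> (b1 -> b2) -> [a1] -> Bool
preserved h1 h2 f g = null . violations h1 h2 f g

-- Toy case: integers are halved in the transfer, Boolean results are
-- passed through unchanged, and both models test for evenness.
example :: [Int]
example = violations (`div` 2) id even even [0 .. 5]
-- example evaluates to [1,2,5]
```

An empty result list corresponds to invariance for the transferred subset; every listed value is a potential case of information loss.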

5. A DETAILED EXAMPLE

5.1. Different Semantics of Point Equality

Consider the simple case of comparing two points in a geographic database for equality. Different data models have different methods for modeling points and deciding on their equality. For example, one model might use integer coordinates and compare these, another model might use floating point coordinates and compare them using a tolerance, and a third model might base its decision on point names. Given that such equality tests are often deeply hidden in a system's software and occur very frequently, the example represents a practically important and common case of potential information loss.

With the algebraic approach, it becomes possible to detach the semantics of point equality from the stored data and operations in each model and to describe its invariance (or its loss) in a transfer abstractly. The following diagram represents the transfer situation, with the functions f and g standing for the respective point equality tests in two data models A and B, and with the two parts of the transfer function, h1 and h2, representing the transfer of point pairs and of the boolean results of equality tests. Again, the intermediate data model of a potential transfer format is omitted for simplicity, without loss of generality.

[Figure 3: transfer from Data Model A to Data Model B. Transfer Function h1 maps point pairs of A to point pairs of B; f: PxP -> bool is the equality test in A and g: PxP -> bool the equality test in B; Transfer Function h2 maps Boolean results of A to Boolean results of B.]

Homomorphism: g (h1 (p,q)) = h2 (f (p,q))
Information Loss: g (h1 (p,q)) ≠ h2 (f (p,q))

Fig. 3. Homomorphism and information loss for point equality in two data models

5.2. Three Data Models

Let us assume that we have three systems, each containing a slightly different data model for the storage of point data. System A stores simple lists of coordinates and decides point equality based on coordinate values. System B represents points by coordinates and names and compares point names to decide equality. System C uses the same data type as system B, but decides equality like system A, comparing coordinate values.

Below are algebraic specifications for the three data models, written in the functional programming language Gofer. Each specification consists of three parts:

• the name of the data type and of its constructor operation from base types;

• signatures of operations, giving the names of the operations and of their input and output types;

• equational axioms specifying the behavior of the operations.

The specifications focus on equality operations, leaving additional operations on points (such as a distance) unspecified. The Gofer keyword "data" introduces the definition of an (abstract) data type, while the keyword "type" defines just a type synonym. The crucial difference among the three specifications lies in their last lines, defining the different semantics of equality. While Gofer would offer more compact ways to write these specifications, the code as written here has the advantage of being self-explanatory.

Data Model of System A

    type Coord = Int
    data Point = New (Coord, Coord)

    x     :: Point -> Coord
    y     :: Point -> Coord
    equal :: (Point, Point) -> Bool

    x (New (cx,cy)) = cx
    y (New (cx,cy)) = cy
    equal (p,q) = (x(p) == x(q)) && (y(p) == y(q))

Data Model of System B

    type Coord = Int
    type Name = String
    data Point = New (Coord, Coord, Name)

    x     :: Point -> Coord
    y     :: Point -> Coord
    name  :: Point -> Name
    equal :: (Point, Point) -> Bool

    x (New (cx,cy,n)) = cx
    y (New (cx,cy,n)) = cy
    name (New (cx,cy,n)) = n
    equal (p,q) = name (p) == name (q)

Data Model of System C

    type Coord = Int
    type Name = String
    data Point = New (Coord, Coord, Name)

    x     :: Point -> Coord
    y     :: Point -> Coord
    name  :: Point -> Name
    equal :: (Point, Point) -> Bool

    x (New (cx,cy,n)) = cx
    y (New (cx,cy,n)) = cy
    name (New (cx,cy,n)) = n
    equal (p,q) = (x(p) == x(q)) && (y(p) == y(q))

5.3. Modeling the Transfers

5.3.1. Six Transfer Situations

Based on the three data models, six different data transfer situations are possible (see figure 4). Each of them exhibits specific cases of information loss, lack of information, or invariance. Leaving aside the situations where the data model of the sender system is weaker than that of the receiver system (lack of information) and where information is lost but not needed (irrelevant information loss), we discuss the three relevant situations of information loss. The first is presented in detail, while the other two turn out to be only minor variations of it.

[Figure 4: diagram of the six transfers among System A (without point names, compares coordinates), System B (with point names, compares names), and System C (with point names, compares coordinates). Arrow labels: information loss (three transfers), lack of information, irrelevant lack of information, irrelevant information loss.]

Fig. 4. Cases of information loss and lack of information for point equality in three data models

5.3.2. From System B to System A

First, consider the situation where point data are transferred from system B to system A. Since system B contains point names and A does not, information is lost in the sense that data from B cannot be stored at A. Such an information loss is relatively harmless, manifesting itself in the transfer process. More dangerous is the information loss that occurs implicitly and affects answers to queries posed to system A. A query whether two points are equal could be answered differently for the same points, depending on whether it is asked in system A (which answers based on coordinate values) or in system B (comparing point names).

How does the algebraic approach help with this problem? It requires us to specify a transfer function, mapping data and operations from data model B to data model A. Since a function is nothing other than a correspondence between elements of its domain and range, the transfer function can be defined extensionally by listing the corresponding pairs of data and operations in the two data models. (The same procedure would be applied to specify a mapping into a transfer format and, again, from the transfer format into the second data model.) In order to be able to write down the transfer function for the example, we define the following simplified point database for system B:

    Point Name    X Coordinate    Y Coordinate
    P1            111             555
    P2            222             666
    P3            333             777
    P4            444             888
    P5            111             555

Assuming that points P1, P2, and P5 should be transferred from system B to system A, we get the following transfer function h:

    (P1, 111, 555) -> (111, 555)
    (P2, 222, 666) -> (222, 666)
    (P5, 111, 555) -> (111, 555)

    not equalB (P1, P2) -> not equalA ((111, 555), (222, 666))
    not equalB (P2, P5) -> not equalA ((222, 666), (111, 555))
    not equalB (P1, P5) -> equalA ((111, 555), (111, 555))

Note that the transfer function is simply a generalized list of transferred data, adding the mappings for the transferred operations (the equality test for points). Clearly, the specified transfer function does not achieve a homomorphism:

    equalA (h (p,q)) ≠ h (equalB (p,q))

Points P1 and P5 are found to be different in system B, but equal in system A, after being stripped of their names in the transfer process. For these two points, the homomorphism condition evaluates to true on the left and false on the right hand side. Various strategies are conceivable to deal with this potential loss of information. Depending on the application context and the system's sophistication, coordinate values or the working of the equal operation in system A could be modified. Of course, the loss can also be accepted if it is deemed to have no consequences. The important observation is that the specification of the transfer function has revealed the danger of information loss.
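The check described here can be sketched in Haskell for the three transferred points. Since both data models must live in one module, the types and constructors are renamed to PointA/NewA and PointB/NewB, and accessor functions are replaced by pattern matching; these names, as well as h1, h2, transferred, and lossCases, are assumptions made for this illustration and not part of the specifications above.

```haskell
type Coord = Int
type Name  = String

-- Points of system A (coordinates only) and system B (coordinates and name).
data PointA = NewA (Coord, Coord) deriving (Eq, Show)
data PointB = NewB (Coord, Coord, Name) deriving (Eq, Show)

-- System A compares coordinates, system B compares names.
equalA :: (PointA, PointA) -> Bool
equalA (NewA (x1, y1), NewA (x2, y2)) = x1 == x2 && y1 == y2

equalB :: (PointB, PointB) -> Bool
equalB (NewB (_, _, n), NewB (_, _, m)) = n == m

-- h1 strips the names of a point pair; h2 transfers Boolean results unchanged.
h1 :: (PointB, PointB) -> (PointA, PointA)
h1 (p, q) = (strip p, strip q)
  where strip (NewB (cx, cy, _)) = NewA (cx, cy)

h2 :: Bool -> Bool
h2 = id

-- The transferred points P1, P2, and P5 of the example database.
transferred :: [PointB]
transferred =
  [NewB (111, 555, "P1"), NewB (222, 666, "P2"), NewB (111, 555, "P5")]

-- Name pairs for which equalA (h1 (p,q)) differs from h2 (equalB (p,q)).
lossCases :: [(Name, Name)]
lossCases =
  [ (n, m)
  | (i, p@(NewB (_, _, n))) <- indexed
  , (j, q@(NewB (_, _, m))) <- indexed
  , i < j
  , equalA (h1 (p, q)) /= h2 (equalB (p, q))
  ]
  where indexed = zip [0 :: Int ..] transferred
-- lossCases evaluates to [("P1","P5")]
```

The single reported pair is the one identified in the text: P1 and P5 differ by name in system B but coincide by coordinates in system A.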

5.3.3. Other Transfer Situations

Consider now a data transfer from system B to system C (see figure 4). Both systems store point names, but system C decides equality based on coordinates. The only difference from the transfer discussed before is that points retain their names. Information loss can still occur, however, when two points have the same coordinates, but different names, or the same names, but different coordinates. The third transfer situation, from system C to system B, is symmetric to the second, producing the same cases of potential information loss. Thus, the specification of the transfer functions for these two situations proceeds in exactly the same way as in the first situation, with almost identical results.

The example shows that data acquire meaning through the operations performed on them. What two systems do with the same data can be different despite their use of the same operation names. Therefore, meaning has to be defined algebraically, going beyond operation names. The meaning of "equal" for any of the systems is defined by its algebraic properties, and the change of meaning from one system to another is captured in the transfer function.

5.4. Implementation Aspects

The above example can easily be implemented in any functional or relational language, such as Gofer, Haskell, Prolog or SQL. The only mechanism required is a query language producing either function values or tuples as results. The process of writing the transfer function and alerting to information loss can therefore be automated in any relational database environment. The transfer function in its extensional form is the result of queries in both systems for the transferred data and for the results of the transferred operations applied to these data. Users only need to pose a query for the contents of the transfer (which they would do anyway) and name the operations that should be preserved. A simple transfer program can then apply the homomorphism condition to the transfer function and list potential cases of information loss.

However, it may not always be practical to specify the transfer function extensionally. A complete enumeration of all value pairs for operations can lead to a combinatorial explosion. Algebraic specifications solve this problem by providing intensional descriptions for the semantics of operations in their axioms. They allow us to write queries for information loss cases, based on the homomorphism condition. This eliminates the need to write an extensional transfer function and offers the potential for query optimization. In the above example, such a query has to search system B for point pairs that have the same names, but different coordinates, or the same coordinates, but different names. The two logical conditions for this query follow directly from the axioms of the equal operations in both systems and from the (negated) homomorphism condition:

    equalA (p, q) = (x(p) == x(q)) && (y(p) == y(q))
    equalB (p, q) = name (p) == name (q)

    equalA (p, q) ≠ equalB (p, q) =>
        (name (p) == name (q)) and (x(p) ≠ x(q) or y(p) ≠ y(q))
        or (x(p) == x(q)) and (y(p) == y(q)) and (name (p) ≠ name (q))
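As a minimal sketch, the intensional query can be written in Haskell directly from the two logical conditions and run against the simplified database of system B from section 5.3.2. The function names sameNameOtherCoords, sameCoordsOtherName, and lossQuery, and the list systemB, are hypothetical; the data type follows the specification of system B.

```haskell
type Coord = Int
type Name  = String
data Point = New (Coord, Coord, Name) deriving (Eq, Show)

x, y :: Point -> Coord
x (New (cx, _, _)) = cx
y (New (_, cy, _)) = cy

name :: Point -> Name
name (New (_, _, n)) = n

-- The two disjuncts derived from the axioms of equalA and equalB.
sameNameOtherCoords, sameCoordsOtherName :: (Point, Point) -> Bool
sameNameOtherCoords (p, q) = name p == name q && (x p /= x q || y p /= y q)
sameCoordsOtherName (p, q) = x p == x q && y p == y q && name p /= name q

-- Query over a point list: all unordered pairs that satisfy the
-- negated homomorphism condition.
lossQuery :: [Point] -> [(Point, Point)]
lossQuery ps =
  [ (p, q)
  | (i, p) <- indexed, (j, q) <- indexed, i < j
  , sameNameOtherCoords (p, q) || sameCoordsOtherName (p, q)
  ]
  where indexed = zip [0 :: Int ..] ps

-- The simplified point database of system B.
systemB :: [Point]
systemB =
  [ New (111, 555, "P1"), New (222, 666, "P2"), New (333, 777, "P3")
  , New (444, 888, "P4"), New (111, 555, "P5") ]
-- lossQuery systemB yields the single pair (P1, P5)
```

This sketch still enumerates all point pairs; in a database setting, the same conditions could be handed to a query optimizer, which is the potential benefit mentioned above.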

In some situations, the transfer function may be difficult or impossible to specify intensionally. Consider coordinate transfers among systems with different representations for floating point numbers. If one system uses 80 bits and the other uses 64 bits to represent floating point numbers, it becomes virtually impossible to describe what happens to coordinates in a transfer. In such a case, the only method to reveal information loss would be to produce and compare the extensional lists of operation results in both systems.
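The extensional comparison for such a case can be sketched in Haskell. Standard Haskell offers no 80 bit floating point type, so the sketch assumes a transfer from 64 bit (Double) to 32 bit (Float) coordinates as a stand-in for the situation described; the names equalSender, equalReceiver, narrow, lossCases, and sample, and the sample coordinates, are illustrative assumptions.

```haskell
-- Exact coordinate equality in the sending system (64 bit Double).
equalSender :: ((Double, Double), (Double, Double)) -> Bool
equalSender (p, q) = p == q

-- Exact coordinate equality in the receiving system (32 bit Float).
equalReceiver :: ((Float, Float), (Float, Float)) -> Bool
equalReceiver (p, q) = p == q

-- Transfer of a point: each coordinate is converted to a Float.
narrow :: (Double, Double) -> (Float, Float)
narrow (cx, cy) = (realToFrac cx, realToFrac cy)

-- Extensional comparison: point pairs whose equality test gives
-- different results before and after the transfer.
lossCases :: [(Double, Double)] -> [((Double, Double), (Double, Double))]
lossCases ps =
  [ (p, q)
  | (i, p) <- indexed, (j, q) <- indexed, i < j
  , equalSender (p, q) /= equalReceiver (narrow p, narrow q)
  ]
  where indexed = zip [0 :: Int ..] ps

-- Hypothetical coordinates: the first two points differ by 1.0e-10 in x,
-- which is below the resolution of a Float near 0.1.
sample :: [(Double, Double)]
sample = [(0.1, 0.2), (0.1 + 1.0e-10, 0.2), (0.3, 0.2)]
```

For the sample list, the first two points are distinct in the sending system but collapse to the same point in the receiving system, so lossCases reports exactly that pair.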

6. CONCLUSIONS

This paper has proposed an algebraic approach to the problem of information loss in data transfers, allowing transfer partners to agree formally on what information has to be preserved and what information can be lost. The key idea behind the approach is to model the use of data in order to capture their meaning. The chosen algebraic style allows us to describe that use in a concise way, grouping data and operations. Thereby, the approach fits the object-oriented paradigm in software engineering and database modeling. Furthermore, it offers the powerful instrument of a homomorphism to describe invariant properties in data transfers.

The work presented here requires extensions and refinements in order to become applicable to data transfers in practice. More extensive case studies will have to be conducted for specific information loss problems in order to assess the viability of the method. Interesting test cases include, for example, transfers using CAD data exchange formats like DXF, or the transfer of terrain models between irregular triangular and regular square grids. The latter would allow for the specification of an interpolation method as part of the transfer function.

In order to make the method operational, tools have to be developed allowing users to define data semantics by algebraic specifications and information invariance by homomorphic transfer functions. Such tools have previously been produced for software engineers (Guttag and others 1985). Functional programming languages like Gofer or other ML descendants like Haskell (Hudak 1989) are an important step ahead but not yet practically applicable for data transfer specifications. As an ingredient of future data transfer standards, tools implemented in such environments would allow transfer partners to define their supplies and demands precisely, without concerns for implementation details of the systems.

References

Barr, M. and C. Wells. Category Theory for Computing Science. Prentice Hall, 1990.

Cassettari, Seppe. "Standards for Spatial Information." In Introduction to Integrated Geo-Information Management, ed. Seppe Cassettari. 121-137. Chapman & Hall, 1993.

Ehrich, H.-D., M. Gogolla, and U.W. Lipeck. Algebraische Spezifikation Abstrakter Datentypen. Leitfäden und Monographien der Informatik, ed. H.-J. Appelrath, V. Claus, G. Hotz, and K. Waldschmitt. Teubner, 1989.

Frank, Andrew U. "Acquiring a digital base map - A theoretical investigation into a form of sharing data." URISA Journal 4 (1 1992): 10-23.

Guttag, J.V., J.J. Horning, and J.M. Wing. Larch in Five Easy Pieces. Digital Equipment Corporation, Systems Research Center, 1985.

Guttag, John V., Ellis Horowitz, and David R. Musser. "Abstract Data Types and Software Validation." Communications of the ACM 21 (12 1978): 1048-1064.

Herring, J., M.J. Egenhofer, and A.U. Frank. "Using Category Theory to Model GIS Applications." In 4th International Symposium on Spatial Data Handling in Zurich, Switzerland, edited by Kurt Brassel, IGU, 820-829, 1990.

Hudak, P. "Conception, Evolution, and Application of Functional Programming Languages." ACM Computing Surveys 21 (3 1989): 359-411.

Jones, M.P. Qualified Types: Theory and Practice. PhD Thesis, Programming Research Group, Oxford University, Cambridge University Press, 1994.

Khoshafian, Setrag and Razmik Abnous. Object Orientation - Concepts, Languages, Databases, User Interfaces. New York, NY: John Wiley & Sons, 1990.

Kottman, Clifford A. Some Questions and Answers About Digital Geographic Information Exchange Standards. Intergraph Corporation, 1992.

Kuhn, Werner. "Requirements for Land Information Standards." In International FIG Symposium on Environment and Land Information in Innsbruck, Austria; Sep 30 - Oct 1, 1991, edited by Ernst Hoeflinger, Verlag Konrad Wittwer, 35-40, 1991.

Kuhn, Werner. "Defining Semantics for Spatial Data Transfers." In 6th International Symposium on Spatial Data Handling in Edinburgh, UK, IGU, 1994.

Lee, Y.C. and D.J. Coleman. "A Framework for Evaluating Interchange Standards." CISM Journal 44 (4 1990): 391-402.

Liskov, B. and J. Guttag. Abstraction and Specification in Program Development. The MIT Electrical Engineering and Computer Science Series, Cambridge, MA: The MIT Press (McGraw-Hill), 1986.

Mark, D.M. and A.U. Frank, ed. Cognitive and Linguistic Aspects of Geographic Space. NATO ASI Series D: Behavioural and Social Sciences, vol. 63. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1991.

Onsrud, Harlan and Gerard Rushton. Institutions Sharing Geographic Information: Report of the Initiative 9 Specialist Meeting. National Center for Geographic Information and Analysis, 1992. Technical Paper 92-5.

Parnas, D.L. "On the Criteria to be used in Decomposing Systems into Modules." Communications of the ACM 15 (12 1972): 1053-1058.

Salgé, François. Workshop on strategy for national, regional, and international de jure standards in the field of digital geographic information. IGN Paris, 1993.

Thiemann, Peter. Grundlagen der funktionalen Programmierung. Leitfaden der Informatik, Stuttgart: Teubner, 1994.

Ullman, J. D. Principles of Database and Knowledgebase Systems. Vol. 1. Principles of Computer Science Series, Rockville, MD: Computer Science Press, 1988.