
IEEE SYSTEMS JOURNAL, VOL. 2, NO. 3, SEPTEMBER 2008

An Information Semantics Approach for Knowledge Management and Interoperability for the Global Earth Observation System of Systems

Surya S. Durbha, Member, IEEE, Roger L. King, Senior Member, IEEE, and Nicolas H. Younan, Senior Member, IEEE

Abstract—The Global Earth Observation System of Systems (GEOSS) is built on current international cooperation efforts among existing distributed earth observing and processing systems. The goal is to formulate an end-to-end process that enables the collection and distribution of accurate, reliable earth observation (EO) data, information, products, and services to both suppliers and consumers worldwide. EOs are obtained from a multitude of sources and require tremendous efforts and coordination among different governments and user groups to come to a shared understanding on a set of concepts involved in a domain. Semantic metadata play a crucial role in resolving the differences in meaning, interpretation, and usage of the same or related data. Also, the knowledge about the geopolitical background of the originating datasets could be encoded in the metadata that would address the diversity on a global scale. In distributed environments like GEOSS, modularization is inevitable. In this paper, we describe the need for an information semantics-based approach for knowledge management and interoperability between heterogeneous GEOSS systems. Further, considering the magnitude of concepts involved in GEOSS, we explore the possibility of using modular ontologies for formulating smaller interconnected ontologies. Index Terms—Global Earth Observation System of Systems (GEOSS), modularization, ontology, semantics.

I. INTRODUCTION

The Global Earth Observation System of Systems (GEOSS) is a distributed system of systems built on current international cooperation efforts among existing earth observing and processing systems. The goal is to formulate an end-to-end process that enables the collection and distribution of accurate, reliable earth observation (EO) data, information, products, and services to both suppliers and consumers worldwide (see Fig. 1). One of the critical components in the development of such systems is the ability to obtain seamless access to data across geopolitical boundaries. In order to gain support and willingness

Manuscript received July 1, 2007; revised April 19, 2008. First published August 1, 2008; current version published September 17, 2008. S. S. Durbha is with the Department of Electrical and Computer Engineering, and GeoResources Institute, Mississippi State University, Starkville, MS 39762 USA (e-mail: [email protected]). R. L. King is with the Bagley College of Engineering, Mississippi State University, Starkville, MS 39762 USA (e-mail: [email protected]). N. H. Younan is with the Department of Electrical and Computer Engineering, Mississippi State University, Starkville, MS 39762 USA (e-mail: younan@ece.msstate.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSYST.2008.925975

Fig. 1. Components of GEOSS.

to participate by countries around the world in such an endeavor, it is necessary to devise mechanisms whereby the data and the intellectual capital are protected through strong procedures that implement the policies specific to a country. EOs are obtained from a multitude of sources and require tremendous efforts and coordination among different agencies and user groups to come to a shared understanding on a set of concepts involved in a domain. It is envisaged that the data and information deluge in a GEOSS context would be unprecedented and that the current data archiving and delivery methods need to be transformed into ones that allow seamless interoperability. Thus, EO data integration is broadly dependent on the resolution of conflicts arising from the following [1], [2]:
• data sets stemming from the same data source with unequal updating periods;
• data sets represented in the same data model, but acquired by different operators;
• data sets which are stored in similar, but not identical, data models;
• data sets from heterogeneous sources (across geographical boundaries), which differ in data modeling, scale, thematic content, contexts, and meaning;
• data sets that are influenced by sociopolitical and cultural backgrounds.
The resolution of such conflicts depends on the reconciliation of both syntactic and semantic heterogeneities in the data. Although the metadata standards (e.g., FGDC and ISO) alleviate to a large extent the syntactic heterogeneity of the data, a problem that is still not completely solved is heterogeneity in the process of converting this data into information and actionable intelligence [3]. Semantic metadata plays a crucial role in resolving the differences in meaning, interpretation, and usage of the same or related data. Also, the knowledge about the geopolitical background of the originating datasets could be encoded in the metadata that would address the diversity on a global scale.

1932-8184/$25.00 © 2008 IEEE Authorized licensed use limited to: WASHINGTON UNIVERSITY LIBRARIES. Downloaded on October 12, 2008 at 03:16 from IEEE Xplore. Restrictions apply.

DURBHA et al.: INFORMATION SEMANTICS APPROACH FOR KNOWLEDGE MANAGEMENT AND INTEROPERABILITY FOR THE GEOSS

Formally, semantics is a branch of linguistics that deals with the study of meaning, changes in meaning, and the principles that govern the relationship between sentences or words and their meanings [4], whereas information semantics is the semantic representation (meaning) for systems, data, documents, or agents [5]. Ontologies are often used as interlinguas for developing semantic models in which relationships are explicated through naming and differentiation. They also serve as a common data format for data interchange. Ontologies help to solve the problem of implicit hidden knowledge by making the conceptualization of a domain explicit. A more formal definition of ontology is “a shared, formal conceptualization of a domain” [6]. As shown in Fig. 1, GEOSS consists of three components. This paper focuses on the Data Exchange and Dissemination component and proposes a conceptual grounding for developing knowledge-based systems for GEOSS.

This paper is organized as follows. In Section II, we discuss the need for an information semantic approach in a GEOSS context and how ontological modeling of disparate information sources allows harmonizing the data and enables seamless information flow. We also highlight the limitations of developing ontologies that are large and monolithic and why this would be harmful from a GEOSS perspective. We introduce the modularization of semantic models and how it can overcome such problems. In Section III, we present the need for modularization of the ontologies and, in Section IV, we review some of the current modularization techniques and give a framework for GEOSS modular ontologies. Finally, Section V concludes with recommendations for future semantics-driven GEOSS applications.

II. NEED FOR INFORMATION SEMANTICS IN GEOSS

The GEOSS implementation plan states that the Group on Earth Observations (GEO) will establish, within two years, a process for reaching, maintaining, and upgrading GEOSS interoperability arrangements, informed by ongoing dialogue with major international programs and consortia [7]. The creation of a system of systems, as envisioned in GEOSS, requires the harmonization of disparate data sets and information systems. This is a tremendous challenge in relation to the complex organizational structures, the size of the data repositories, and the interdependence with other government or nongovernment entities, both national and international. To overcome these challenges, better information sharing, more effective information management, more intelligent search methods, and smarter decision-making are required for GEOSS to be a viable system. The heterogeneities are mainly due to syntactic issues (e.g., logical data models, i.e., relational versus object-oriented, or geometric representations, i.e., vector/raster), schematic heterogeneity (e.g., database models, different generalization hierarchies, spatial reference systems, or cartographic standards including variable minimum mapping units and mixed units), and semantic aspects [8]. It has been noted that semantic heterogeneity is the source of most information integration problems. This occurs because of the differences in meaning, interpretation, or usage of the same or related data. The preliminary step to resolve the semantic conflicts is the development of systems that provide formal conceptualization


of the local domain entities. This can be achieved through domain-specific ontologies, which are normally built independently of each other and are highly heterogeneous. The importance of resolving semantic differences has recently gained wide attention, resulting in the next-generation semantic web [9], [10] efforts, which are driven mostly by the progress in techniques to model, capture, represent, and reason about semantics. There is a growing recognition that, at the end of the data-information channel, there are users at a variety of skill levels and backgrounds and the generated information should cater to their individual or group needs. However, usable information, defined as knowledge in this context, is rarely readily available [3]. Thus, there is a need for EO systems that address the knowledge acquisition and management aspects.

A. Semantic Heterogeneities

The three main causes of semantic heterogeneity in general are given as follows [11].
• Confounding conflicts occur when information items seem to have the same meaning, but differ in reality (e.g., due to different temporal contexts).
• Scaling and units conflicts occur when different reference systems are used to measure a value (e.g., different currencies).
• Naming conflicts occur when the naming schemes of the information differ significantly. A frequent phenomenon is the presence of homonyms and synonyms.
Below, we discuss these heterogeneities in terms of two domains (i.e., oceans and land observations).

B. Heterogeneities in Global Ocean Observing Systems (GOOS)

In the ocean domain, the global ocean observing system (GOOS) provides observations, modeling, and analysis of marine and ocean variables to support operational ocean services worldwide.¹ Several programs such as global sea-level observing systems, the global temperature and salinity profile program, and the moored and drifting buoys program form part of the GOOS.
¹ [Online]. Available: http://www.gosic.org/ios/GOOS-ios.htm

These systems serve several parameters (i.e., ocean surface physical data, ocean subsurface physical data, ocean circulation and currents, sea-level and ocean topography, surface meteorological observations, ocean chemical data, physical coastal zone data, chemical coastal zone data, biological coastal zone data, and coral reef data) that are highly heterogeneous in terms of their syntax, structure, and semantics. The investigation of confounding conflicts is significant to ocean sensor data, as the ability to collect data of the ocean parameters at different intervals of time is one of its most important functions. Instruments sample the environment at various intervals or during various time intervals, such as one hour or one day. Later, the data are averaged over a longer time period, such as five days or a month. Also, the instruments measure at specific locations. Temporal and spatial gaps are thus introduced into the data, and it is up to the analyst to resolve these conflicts manually as they are associated with an application-specific context for which the data are acquired. Hence, it is


necessary to identify whether a value is an intrinsic and permanent property of some instance or whether it depends on some evaluation context (e.g., storm surge or coastal ecology); in the latter case, associating this value with its context makes it possible to achieve interoperability. Scaling conflicts frequently occur in marine observing systems. For example, Coastal-Marine Automated Network (C-MAN)² station data typically include barometric pressure, wind direction, speed and gust, and air temperature; however, some C-MAN stations are designed to also measure sea-water temperature, water level, waves, relative humidity, precipitation, and visibility.³ The granularity of parameters served by certain other stations⁴ is much finer. Also, in addition to the above parameters, the Gulf of Maine Ocean Observing System (GoMOOS)⁵ meteorological sensor measures water temperature (1 m), water temperature (2 m), water temperature (20 m), current direction, current speed, dissolved oxygen, salinity (1 m), salinity (20 m), density (1 m), and density (20 m). Each of these meteorological and oceanographic parameters from different sensors differs in its purpose and level of aggregation. The interpretation of temperature (AirTemperature) is defined as a measurement 1 m above sea level for GoMOOS buoy data, whereas it is 3 m above the sea surface for the tropical atmosphere/ocean (TAO) array. Similarly, the sea surface temperature (SST) products derived from satellite-based sensors provide the data in varied scales and time periods of measurements. Below are some examples of data and products at different scales.
• The National Center for Environmental Prediction (NCEP) Reynolds Optimally Interpolated (OI) SST product consists of weekly and monthly global sea-surface temperature fields gridded at a 1 degree × 1 degree resolution.
• SST data from the NOAA Geostationary Operational Environmental Satellites are available at 6-km spatial resolution and the data are available as 1-, 3-, and 24-h gridded files.
• The multichannel sea-surface temperature (MCSST) product is gridded at a resolution of 18 km and 7-day time periods and is available at a one-week lag from data acquisition.
• SST derived from the moderate-resolution imaging spectroradiometer (MODIS) data are currently available globally at 4 km, 36 km, and 1 degree resolutions with daily, weekly, and monthly time intervals.
Thus, the data are available in multiple scales (coarse or finer scales).

A naming conflict is a commonly observed conflict in coastal sensor data. For example, the parameter windspeed differs in naming convention between distributed ocean data system (DODS) served data and NDBC, as shown in Fig. 2(A). Thus, it can be seen that there exists a semantic translation problem for integration of information sources.

² [Online]. Available: http://www.ndbc.noaa.gov/cman.php
³ [Online]. Available: http://www.ndbc.noaa.gov/
⁴ [Online]. Available: www.gomoos.org
⁵ [Online]. Available: http://www.gomoos.org/

C. Heterogeneities in Global Land-Cover Thematic Data Sets

In the land-cover domain, several global and regional land-cover datasets were derived that are used as inputs to a variety


of models [e.g., land information systems (LIS), land biosphere models, general circulation models (GCMs)] and as an information source in several decision-making scenarios where the need for understanding the land-cover dynamics is critical. The development of thematic land-cover data sets was driven by different national or international initiatives; the subsequent mapping standards adopted reflect the varied interests, requirements, and methodologies of the originating programs. Some of the currently available data products include IGBP DISCOVER, the MODIS land-cover product, the University of Maryland (UMD) land-cover product, GLC 2000, CORINE land cover 1990 and 2000, and AFRICOVER [12]. The intent of the different classifications is mainly to reduce the high amounts of information by abstracting from details. However, there is only limited compatibility and comparability between the data sets generated by different organizations. Fig. 2(B) depicts the semantic conflicts between two land-cover classification systems [i.e., International Geosphere Biosphere Programme (IGBP) and Simple Biosphere (SiB)]. The United Nations Framework Convention on Climate Change (UNFCCC) emphasizes how this lack of homogeneous observations limits our capacity to monitor terrestrial changes relevant to climate as well as our ability to investigate the causes of land-surface changes [12], [13]. In addition, dissemination of information through classified thematic data is a standard methodology in several Earth science domains, such as wetland classification systems (e.g., U.S. Fish and Wildlife Service scheme, Ramsar classification system for wetland type [14], and Cowardin system [15]), soils classification, and hydrology. All of these data sets differ in their intended use, meaning, and granularity.

D. Integration Problem

The integration problem between disparate GEOSS data streams is that of finding the right data that match given criteria.
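The above problem can be formally defined as follows [16]. Let

$$IS_1 = \langle S_1, C_1, D_1, I_1 \rangle \tag{1}$$

$$IS_2 = \langle S_2, C_2, D_2, I_2 \rangle \tag{2}$$

be information sources; then, a bilateral integration problem is equivalent to finding a membership relation $M \subseteq (I_1 \cup I_2) \times (C_1 \cup C_2)$ such that, for all $x \in I_1 \cup I_2$ and $c_i \in C_i$, we have

$$(x, c_i) \in M \iff x^{\mathfrak{I}} \in c_i^{\mathfrak{I}} \text{ for all models } \mathfrak{I} \text{ of } D_i \tag{3}$$

where $S_1$ and $S_2$ are the source ontologies, $C_1$ and $C_2$ are sets of class names, $D_i$ is a mapping that assigns a class definition over the terms from $S_i$ to every class term in $C_i$, and $I_i$ is a set of information items. Consider two repositories in two different ocean sensor networks such as GoMOOS ($IS_1$) and NDBC ($IS_2$) (Fig. 2). Then, a query such as the following: Retrieve all of the data corresponding to atmospheric pressure in $IS_1$ and $IS_2$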


Fig. 2. Semantic conflicts between different information sources. (A) Ocean’s meteorological data served by two different organizations (NDBC and GoMOOS): several conflicts occur that hinder their interoperability. (B) Conflicts between two land-cover classification schemes (IGBP and SiB) in terms of intended meaning of the land cover classes, naming schemes, and granularity.

would have to result in the retrieval of barometric_pressure_counts (BPC), barometric_pressure_min_counts (BPMINC), barometric_pressure_max_counts (BPMAXC), and barometric_pressure_stddev_counts (BPSTDC) in distributed ocean data system (DODS) served GoMOOS data, while in NDBC-served data it would be only atmospheric pressure. However, with the current keyword-based search strategies, these parameters would not be retrieved correctly.
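The retrieval gap described above can be illustrated with a minimal Python sketch. The four GoMOOS parameter names are those listed in the text; the NDBC name `atmospheric_pressure`, the concept label `AtmosphericPressure`, and both helper functions are hypothetical stand-ins for what an ontology-backed mapping might provide, not part of either data service.

```python
# GoMOOS parameter names as listed in the text.
GOMOOS_PRESSURE = [
    "barometric_pressure_counts",
    "barometric_pressure_min_counts",
    "barometric_pressure_max_counts",
    "barometric_pressure_stddev_counts",
]

# Parameter names served by each source. The NDBC name is an assumed placeholder.
SOURCES = {
    "GoMOOS": GOMOOS_PRESSURE,
    "NDBC": ["atmospheric_pressure"],
}

# Hypothetical concept-to-parameter mapping: the kind of knowledge that an
# ontology would make explicit for both sources.
CONCEPT_MAP = {
    "AtmosphericPressure": {
        "GoMOOS": GOMOOS_PRESSURE,
        "NDBC": ["atmospheric_pressure"],
    },
}


def keyword_search(keyword):
    """Return, per source, the parameter names containing the keyword."""
    return {
        source: [name for name in names if keyword in name]
        for source, names in SOURCES.items()
    }


def concept_search(concept):
    """Return, per source, the parameter names mapped to a shared concept."""
    per_source = CONCEPT_MAP.get(concept, {})
    return {source: list(per_source.get(source, [])) for source in SOURCES}


if __name__ == "__main__":
    print(keyword_search("atmospheric"))          # misses the GoMOOS parameters
    print(concept_search("AtmosphericPressure"))  # finds parameters in both sources
```

In this sketch the keyword search finds nothing in the GoMOOS source because none of its names contain the query term, whereas the concept-based lookup returns the synonymous parameters from both sources.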

Thus, such a query can be efficiently answered only if the semantics of both the information sources are well understood. The Web Ontology Language (OWL) is a W3C [17] standard to represent semantics of information sources through the development of ontologies. Domain-specific ontologies help to define concepts in a finer granularity. These fine-grained concepts then allow a user to determine specific relationships among features that may be used to classify them. Exploration of ontologies at various levels of granularity necessitates defining classes by restricting their property values. Then, by a combination of various restrictions, they are inherited into subclasses. The combinations of these restrictions define all conditions that must hold for individuals of the given class [3]. Classes can be either primitive or defined. Primitive classes are those which are defined by only necessary conditions. In contrast, defined classes have at least one set of necessary and sufficient conditions (see Fig. 3). Thus, by building OWL statements about resources, knowledge representation of disparate GEOSS resources can be accomplished.

As shown in Fig. 4, the data in GEOSS could be available in multiple forms (i.e., texts, spreadsheets, relational databases, or XML web services); wrappers are developed to instantiate the concepts in the ontology with the data, thus developing a knowledge repository. An ontology plus its instance data constitutes a knowledge base. The combination of knowledge bases from different information sources would enable generation of new knowledge and hence provide a repository for inference with description logic (DL) [18] based reasoning engines such as RACER [19] and Pellet [20]. In addition, it would permit the combining of resource description framework (RDF) data and querying using RDF query languages such as SPARQL [20].

Fig. 3. Providing the necessary and sufficient conditions allows formulating a defined concept.

Fig. 4. Knowledge representation about different data sources through OWL constructs and knowledge acquisition by instantiating the semantic model with data from different sources.

III. MODULARIZATION OF GEOSS ONTOLOGIES

In a spatially heterogeneous and distributed environment like GEOSS, modularization provides the necessary backbone to build integrated systems while enabling the necessary loose coupling of the systems. The goal is to provide seamless access to data at distributed locations while maintaining the integrity of the systems, without compromising the security, local laws, and international treaties (e.g., permission to obtain measurements in national economic zones of countries), and to use the information for decision making. The foundation of a knowledge-based system for GEOSS requires a paradigm shift: “acquire knowledge locally, but manage it globally.” However, it is envisioned that the GEOSS ontologies would contain a huge number of concepts and that modularization is inevitable. Modularization would facilitate the development of a GEOSS framework by:
• putting knowledge in its proper context;
• giving a handle to perceive a knowledge entity in relation to other interconnected knowledge sources;
• enabling retrieval of only specific parts of the knowledge base for a particular decision-making process.
Modularization is significant in that it addresses copyright, privacy, and/or security concerns. It does so by limiting access to only a partial ontology for certain subsets of users. This is particularly relevant in a GEOSS context where data providers from different countries may need to conform to the laws of their country while still sharing knowledge. The module of an entity is the minimal subset of axioms in the ontology that capture its meaning precisely enough. Thus, achieving some sort of modularity is a key requirement to facilitate collaborative, large-scale, long-term development. We address the issue of representing the subsets in the ontology that capture the meaning of an entity. A specific problem that occurs in the case of distributed ontologies, as well as very large models, is the problem of efficient reasoning.
Hidden dependencies and cyclic references can cause serious problems in a distributed setting [16]. Modules with local semantics and interfaces will facilitate the development of methods for local inference mechanisms. Bao et al. provide the following set of minimal requirements that modular ontologies should possess [22]:
• Localized semantics (modularity in terms of both syntactic and semantic representations).
• Exact reasoning (reasoning over modules should be equivalent to the reasoning on an integrated ontology).
• Directed semantic relations from a source module to a target module.
• Transitive reusability (knowledge should be reusable directly or indirectly).
Fig. 5 depicts independent domain-specific ontologies in a GOOS. It is obvious that modules would be developed by people across geographical boundaries and would conform to the geopolitical and socioeconomic conditions of the respective countries. Hence, some level of standardization at the metadata level is one of the prime needs of GEOSS. It takes tremendous efforts to standardize elements in syntactic metadata, which normally are collections of terms that cannot be completely defined at the syntactic level and leave room for ambiguities. For example, the global resource catalog could be developed with ISO/IEC 11179 [23], Metadata Registries, which provides a standardized terminology for representing data in a common registry. However, such a standard currently operates only at the syntactic level, while the semantic part is crucial for proper exchange of information; we intend to develop the semantic metadata based on ISO/IEC 11179, which would form a top-level ontology, the concepts from which could be extended based on the specific data requirements relating to different countries. Similarly, such ontologies are required for different broad domain representations (e.g., coastal zone). Defining a module as a subontology translates to the fact that an ontology is turned into a module when considering it in a wider framework where the targeted service is to be provided by a collection of modules. Modules can be independently developed ontologies that are combined to form a collection providing some new services. This enables the importing of new ontologies into existing ones, thus facilitating access and knowledge extraction from remote systems while conforming to the specific rules pertaining to a region. The next section summarizes some of the current approaches of modularization and describes a GEOSS modularization scheme.

Fig. 5. Need for modularization in GOOS.

IV. MODULARIZATION APPROACHES

Modular ontologies provide support for processing heterogeneous and huge data sets, dynamic updating of interconnected modules when change happens in one or more of the modules, and efficient reasoning over connected modules. Some of the current approaches that provide support for some or all of the above requirements are summarized as follows.

A. OWL Imports

OWL provides the owl:imports construct, which allows including by reference all of the axioms contained in (and imported by) the remote ontology. An owl:imports statement allows both ontologies to stay in different files. This certainly provides some syntactic modularity, but not a logical modularity, which would be more desirable [24]. Below is an example construct of the owl:imports statement.
1. owl:imports
2. external resources
3. namespaces: xmlns:geossresource=“http://geoss.org/geossresourcecatalog.owl#”

In the above example, an external ontology file was imported that could belong to a different organization (or country). It could also be a different domain representation in which we are interested in certain subsets of the concepts to develop the local application-specific ontology model. Therefore, it is necessary to import the ontology into the local application-domain representation to access the subsets of concepts. However, it is unclear how much of the referred entity needs to be imported to make the local domain conceptualization complete. The use of owl:imports results in a completely flat ontology (i.e., none of the imported axioms or facts retain their context). While it is possible to track down the originator(s) of some assertions by inspecting the imported ontology, OWL reasoning does not take such context into account.

B. Normalization

To represent ontologies as modules, it is required that the domain ontologies be represented in a normalized form [25]. This means that the distinct modules must be represented as disjoint classification trees and binary relations between the classes in the distinct modules must be established (e.g., hasLocation, hasMeasurement, hasSensor, and OwnedAndMaintainedBy). The approach is based on a set of orthogonal taxonomies that provide a basis for defining more complex concepts. Rector [25] argues for the benefits of this strategy in terms of easier creation and reuse of ontological knowledge.

C. View-Based Approaches

In this type of approach, the ontology modules are connected by conjunctive queries [16], [26]. This connects concepts from different modules better than a simple one-to-one mapping, but does not take full advantage of the logical representation of the concept as provided by the language (e.g., DL). This method uses a knowledge compilation approach in which the result of each mapping query is computed offline and the result is added as an axiom in the current ontology.
An internal concept definition is specified using regular DL-based concept expressions of the form $C \sqsubseteq D$ or $C \equiv D$, where $C$ and $D$ are atomic and complex concepts, respectively. An external concept definition is an axiom of the form $C \equiv M : Q$, where $M$ is a module and $Q$ is a conjunctive query over $M$. This enables queries on the remote concepts, which are present as logical entities in the local ontology. Thus, it is not necessary to import the full ontology from remote sites; only those concepts that match the query are added as axioms.

Fig. 6. Ontology that conceptualizes the information derived from coastal buoy sensors. A concept called WaterTemperature has several properties, for example, hasStationID, which is restricted to values from a class called StationID, which has properties such as OwnedAndMaintainedBy and hasMeasurement.

D. $\mathcal{E}$-Connections

An $\mathcal{E}$-Connection is a set of connected ontologies. An $\mathcal{E}$-Connected ontology typically contains information about classes, properties, and their instances, as in OWL, but also about a new kind of property (link property), which is somewhat similar in spirit to data-type properties [4]. $\mathcal{E}$-Connections between DLs divide roles into disjoint sets of local roles (connecting concepts in one module) and links (connecting inter-module concepts).

E. Modular GEOSS Ontologies

An emerging framework for an integrated ocean observing system is being proposed in the form of satellite remote sensing (NASA, Ocean Biology and Biogeochemistry; NOAA, CoastWatch) and in situ measurement programs (e.g., NAWQA, PORTS, NOS tide gauge network, the COE network of wave gauging stations, and the NDBC network of meteorological buoys including C-MAN sites) [27]. Such a framework in a GEOSS context needs to handle information at a much larger geographical scale. Several top-level and domain-specific ontologies need to be developed that take into consideration the independent nature of the measurements. For example, satellite sea surface temperature can be defined as a knowledge entity that has the following:
• temporal resolution;
• spatial resolution;
• at least one infrared wavelength;
• a geolocation;
• representative temperature of the skin of the ocean.
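The satellite sea surface temperature entity listed above can be sketched as a small Python data structure. This is only an illustration of the five listed properties: the class name, field names, units, and the sample values are assumptions, and a real GEOSS ontology would express the same constraints as OWL property restrictions rather than as runtime checks.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SatelliteSST:
    """Hypothetical record mirroring the five listed properties."""

    temporal_resolution_hours: float              # temporal resolution
    spatial_resolution_km: float                  # spatial resolution
    infrared_wavelengths_um: Tuple[float, ...]    # at least one infrared wavelength
    geolocation: Tuple[float, float]              # (latitude, longitude) in degrees
    skin_temperature_kelvin: float                # temperature of the ocean skin

    def __post_init__(self):
        # "At least one infrared wavelength" is a cardinality constraint.
        if len(self.infrared_wavelengths_um) < 1:
            raise ValueError("at least one infrared wavelength is required")
        latitude, longitude = self.geolocation
        if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
            raise ValueError("geolocation must be a valid (latitude, longitude)")


if __name__ == "__main__":
    # Illustrative values only; they do not describe a real product.
    sample = SatelliteSST(24.0, 4.0, (11.0,), (30.0, -88.0), 295.0)
    print(sample)
```

The cardinality check in `__post_init__` plays the role that a minimum-cardinality restriction would play in an ontology language.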

If there are two ontologies, one that describes coastal marine measurements with a concept called SeaSurfaceTemperature and the other representing geographic locations with a concept called GeoLocations, then the restriction on the property hasGeolocation of the concept SeaSurfaceTemperature in the first ontology might have someValueFrom {GeographicLocations} where GeographicLocations is a class in the second ontology (this ontology might belong to some other country in a GEOSS context). The property hasGeolocation will be defined as a property in the coastal marine measurements ontology with target ontology GeoLocations. Similarly, several modules have to be linked in this manner and they would ultimately enable answers to queries such as follows: Find all data corresponding to water temperature in geographical regions X and Y at time resolution T and spatial resolution S. As shown in Fig. 6, a buoy ontology could be used for such a query where the class StationID provides the necessary information about the spatial and temporal resolution. However, a single ontology like the one shown in Fig. 6 does not permit answers to queries at a GEOSS level, since the locations and ownership of the buoys change geographically and the information on access restrictions may also change. Hence, following the view-based approach described in the previous section, it is required to formulate concepts that are conjunctive queries on a remote ontology and add the results as axioms in the local ontology. For example, a new concept called SensorDetails in a local ontology could be formulated based on the classes StationID, GeoLocations, TempRes, and SpatialRes in the remote ontology, by mixing elements from different ontologies in the local definition of concepts: SensorDetails [and StationID (not local) (some hasLocation GeoLocations) (some hasTemporalResolution TempRes) (some hasSpatialresoultion SpatialRes)]. 
Thus, the development of modular ontologies requires a different formulation of the current ontology representation standard (e.g., OWL).
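The separation of local roles from links, and the cross-module restriction on hasGeolocation, can also be sketched in Python. This is an assumption-laden illustration and not an OWL or E-Connection implementation: the `Module` class, the `classify_restriction` helper, and the module names are hypothetical. The sketch only checks that a property is either local or a link, and that the filler class of a someValuesFrom restriction lives in the expected module.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Module:
    """Hypothetical ontology module in an E-Connection-style setting."""
    name: str
    classes: Set[str] = field(default_factory=set)
    local_roles: Set[str] = field(default_factory=set)   # stay inside this module
    links: Dict[str, str] = field(default_factory=dict)  # link name -> target module


def classify_restriction(source: Module, prop: str, filler: str,
                         modules: Dict[str, Module]) -> str:
    """Return 'local' or 'link' for the restriction `prop someValuesFrom filler`."""
    # Local roles and links must form disjoint sets.
    if prop in source.local_roles and prop in source.links:
        raise ValueError(f"{prop} cannot be both a local role and a link")
    if prop in source.local_roles:
        if filler not in source.classes:
            raise ValueError(f"{filler} is not a class of {source.name}")
        return "local"
    if prop in source.links:
        target = modules[source.links[prop]]
        if filler not in target.classes:
            raise ValueError(f"{filler} is not a class of {target.name}")
        return "link"
    raise ValueError(f"{prop} is not declared in {source.name}")


marine = Module(
    name="CoastalMarineMeasurements",
    classes={"SeaSurfaceTemperature", "TempRes"},
    local_roles={"hasTemporalResolution"},
    links={"hasGeolocation": "GeographicLocationsModule"},
)
geo = Module(name="GeographicLocationsModule", classes={"GeoLocations"})
modules = {m.name: m for m in (marine, geo)}

# hasGeolocation points into the other module, so it is classified as a link.
kind = classify_restriction(marine, "hasGeolocation", "GeoLocations", modules)
```

Here `kind` is `"link"`, while `hasTemporalResolution` with filler `TempRes` would be classified as `"local"`. A reasoner for modular ontologies would enforce the same distinction at the level of the logic and not through explicit checks like these.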


V. CONCLUSION

The data exchange and dissemination component of GEOSS requires methodologies for accessing and retrieving geographically distributed resources that are highly heterogeneous in syntactic, structural, and semantic representations. Semantic reconciliation establishes an information resource in its proper context, thus enabling the provision of core data that will be required by several end users regardless of location or region. Modularization of the ontological models provides a certain level of flexibility for regional data providers to represent their data, while conforming to a broader, scalable framework that encapsulates crucial components of GEOSS that range from inter-governmental interoperability to the core measurements.

REFERENCES

[1] V. Walter and D. Fritsch, “Matching spatial data sets: Statistical approach,” Int. J. Geograph. Inf. Sci., vol. 13, pp. 445–473, 1999.
[2] S. S. Durbha and R. L. King, “Interoperability in coastal zone monitoring systems: Resolving semantic heterogeneities through ontology driven middleware,” in Proc. IEEE Geosci. Remote Sensing Symp., 2005, pp. 4236–4239.
[3] S. S. Durbha and R. L. King, “Semantics enabled framework for knowledge discovery from earth observation data archives,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 12, pp. 2563–2572, Dec. 2005.
[4] D. Bedford, “Charter Statement of Taxonomy and Semantics Special Interest Group,” 2004 [Online]. Available: http://www.km.gov
[5] L. Obrst, “Ontologies and the semantic web: An overview,” MITRE, Center for Innovative Computing & Informatics, 2004.
[6] T. R. Gruber, “A translation approach to portable ontology specifications,” Knowledge Acquisition, vol. 5, pp. 199–220, 1993.
[7] [Online]. Available: http://www.earthobservations.org/docs/10Year%20Implementation%20Plan.pdf
[8] Y. Bishr, “Overcoming the semantic and other barriers to GIS interoperability,” Int. J. Geograph. Inf. Sci., vol. 12, pp. 299–314, 1998.
[9] T. Berners-Lee, J. Hendler, and O. Lassila, “The semantic web,” Scientif. Amer., pp. 29–37, 2001.
[10] N. Shadbolt, T. Berners-Lee, and W. Hall, “The semantic web revisited,” IEEE Intell. Syst., vol. 21, no. 1, pp. 96–101, Jan. 2006.
[11] G. H. Cheng, “Representing and reasoning about semantic conflicts in heterogeneous information sources,” Ph.D. dissertation, Sloan Sch. Management, Mass. Inst. Technol., Cambridge, 1997.
[12] M. Herold, C. Woodcock, A. Di Gregorio, P. Mayaux, A. Belward, J. Latham, and C. C. Schmullius, “A joint initiative for harmonization and validation of land cover datasets,” IEEE Trans. Geosci. Remote Sens., vol. 44, pp. 1719–1727, 2006.
[13] “GCOS Implementation Plan for the Global Observing System for Climate in Support of the UNFCCC,” GCOS-92, WMO Technical Document No. 1219, Oct. 2004 [Online]. Available: http://www.wmo.ch/web/gcos/gcoshome.html
[14] S. Frazier, An Overview of the World’s Ramsar Sites. Slimbridge, U.K.: Wetlands Int., 1996, vol. 29.
[15] L. M. Cowardin, V. Carter, F. C. Golet, and E. T. LaRoe, “Classification of Wetlands and Deepwater Habitats of the United States,” U.S. Department of the Interior, Fish and Wildlife Service, 1979 [Online]. Available: http://www.npwrc.usgs.gov/resource/wetlands/classwet/index.htm
[16] H. Stuckenschmidt and F. V. Harmelen, Information Sharing on the Web. Berlin, Germany: Springer-Verlag, 2005.
[17] D. McGuinness and F. V. Harmelen, “Web Ontology Language (OWL) overview,” 2004 [Online]. Available: http://www.w3.org/TR/owl-features/
[18] D. Calvanese, G. D. Giacomo, D. Nardi, and M. Lenzerini, “Reasoning in expressive description logics,” in Handbook of Automated Reasoning. Amsterdam, The Netherlands: Elsevier, 2001, pp. 3–12.
[19] V. Haarslev and R. Möller, “RACER system description,” in Proc. Int. Joint Conf. Autom. Reasoning (IJCAR 2001), 2001, vol. 2083, Lecture Notes in Artificial Intelligence, pp. 701–705.
[20] B. Parsia and E. Sirin, “Pellet: An OWL DL reasoner,” in Proc. Int. Workshop Description Logics (DL2004), R. Möller and V. Haarslev, Eds., 2004.
[21] E. Prud’hommeaux and A. Seaborne, “SPARQL query language for RDF,” W3C Candidate Recommendation, 2006 [Online]. Available: http://www.w3.org/TR/rdf-sparql-query/
[22] J. Bao, D. Caragea, and V. Honavar, “Modular ontologies - A formal investigation of semantics and expressivity,” in Proc. Asian Semantic Web Conf., R. Mizoguchi, Z. Shi, and F. Giunchiglia, Eds., 2006, vol. LNCS 4185, pp. 616–631.
[23] [Online]. Available: http://www.metadata-standards.org/11179/
[24] B. C. Grau, B. Parsia, and E. Sirin, “Combining OWL ontologies using E-Connections,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 4, pp. 40–59, 2006.
[25] A. Rector, “Modularization of domain ontologies implemented in description logics and related formalisms including OWL,” in Proc. Int. Conf. Knowledge Capture, 2003, pp. 121–128.
[26] M. Klein and H. Stuckenschmidt, “Evolution Management for Interconnected Ontologies,” [Online]. Available: http://www.citeseer.ist.psu.edu/672047.html
[27] T. Malone, N. Andersen, P. Brewer, E. Buckley, H. Frey, F. Grassle, G. Gross, K. Tenore, L. Walstad, C. Woody, and J. Yoder, “An Ocean Observing System for US Coastal Waters First Steps,” [Online]. Available: www.csc.noaa.gov/coos/docs/CGOOS_firststeps_wrkrpt.pdf

Surya S. Durbha (M’06) received the B.S. degree in civil-environmental engineering and the M.S. degree in remote sensing from Andhra University, Visakhapatnam, India, in 1994 and 1997, respectively, and the Ph.D. degree in computer engineering from Mississippi State University (MSU), Starkville, in 2006. He was an Application Scientist with the Indian Institute of Remote Sensing, Department of Space, India, from 1998 to 2001. He is currently an Assistant Research Professor with the GeoResources Institute and the Department of Electrical and Computer Engineering at MSU. His current research interests are information and knowledge-based systems, semantic web, web services, and multi-angle satellite remote sensing.

Roger L. King (M’73–SM’95) received the B.S. degree from West Virginia University, Morgantown, in 1973, the M.S. degree in electrical engineering from the University of Pittsburgh, Pittsburgh, PA, in 1978, and the Ph.D. degree in engineering from the University of Wales, Cardiff, U.K., in 1988. He began his career with Westinghouse Electric Corporation, but soon moved to the U.S. Bureau of Mines Pittsburgh Mining and Safety Research Center. Upon receiving his Ph.D. degree in 1988, he accepted a position with the Department of Electrical and Computer Engineering, Mississippi State University (MSU), Starkville, where he now holds the position of Giles Distinguished Professor. At MSU, he presently serves as the Associate Dean for Research and Graduate Studies in the Bagley College of Engineering. Dr. King has received numerous awards for his research, including the Department of Interior’s Meritorious Service Medal. He is a Registered Professional Engineer in the state of Mississippi. Over the last 30 years, he has served in a variety of leadership roles with the IEEE Industry Applications Society, the IEEE Power Engineering Society, and the IEEE Geoscience and Remote Sensing Society. He is presently a member of the IEEE GRSS AdCom.

Nicolas H. Younan (S’87–M’88–SM’99) received the B.S.E.E. and M.S.E.E. degrees from Mississippi State University (MSU), Starkville, in 1982 and 1984, respectively, and the Ph.D. degree in electrical engineering from The Ohio University, Athens, in 1988. He joined the faculty of the Department of Electrical and Computer Engineering, MSU, as a Visiting Assistant Professor in 1988, and he is currently a Professor and the Graduate Program Director in the department. His current research activities include signal and image analysis with applications to remote sensing, image information mining, automated target recognition, data fusion, and estimation/detection. Dr. Younan is a member of Sigma Xi, Tau Beta Pi, Eta Kappa Nu, and Phi Kappa Phi.
