Distributed Emergent Semantics in P2P Networks

Paul Fergus, Anirach Mingkhwan, Madjid Merabti, Martin Hanneghan
Network Appliances Laboratory, School of Computing and Mathematical Sciences
Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, UK
Email: {cmppferg, A.Mingkhwan, M.Merabti, M.B.Hanneghan}@livjm.ac.uk

Abstract

Distributing semantic information in P2P networks enables knowledge management solutions to move away from centralised knowledge representations reliant on global consensus. This allows distributed peers to create and share knowledge with other peers connected in the network and to self-organise knowledge automatically based on structural similarities. To achieve this, peers must collaborate and share information with each other; toolsets that merge and categorise semantic information conceptually are therefore paramount. Our work highlights a key solution for discovering and merging semantic information distributed in peer-to-peer network environments. This paper shows how peers discover and merge semantic structures using evolutionary programming techniques, which capture the general consensus within the peer network. We describe the Distributed Emergent Semantics (DistrES) protocol and the functionality of the prototype system we developed.

Key Words: Evolutionary Programming, Peer-to-Peer, Knowledge Management, Ontology, JXTA, General Consensus.

1. Introduction

The plethora of explicit knowledge possessed by large enterprise organisations brings added value to the organisation and can be considered one of its main assets. This knowledge is distributed throughout the organisation and exists explicitly in databases, knowledge bases and several other digital formats that capture organisation-related information. The storage of this knowledge is inherently centralised, and formalising the capturing process itself restricts the true potential of knowledge mining. The organisation must enable knowledge to emerge in a completely distributed environment in which each user within the organisation is treated as a self-governing knowledge node. Within this environment the user is free to share or discover knowledge contained within the peer-to-peer network. The challenge is to enable a distributed environment that provides the following services:

• Services that enable the representation and discovery of semantic information.
• Services that capture the general consensus within peer responses in terms of semantic categorisation.
• Services that evolve and merge semantic knowledge over time.

In this world of distributed system technology, distributed knowledge management has been brought to the attention of numerous researchers as a way forward for future intelligent information gathering, providing services and self-learning capabilities [6, 4, 13, 14]. Sharing knowledge between peers in a distributed network allows information exchange analogous to the way we exchange information in our daily lives. Codifying this human activity will enable knowledge management solutions to distance themselves from centralised knowledge representations reliant on global consensus and allow peers to evolve local knowledge structures to conceptually understand and discover services. Rich information structures will become an emergent property as fragmented knowledge structures are discovered and merged by peers within the network. Our research focuses on semantically discovering and evolving distributed conceptual knowledge for service location in mobile ad hoc networks. Semantic knowledge is distributed throughout the peer network and locally merged by peers over time. The key technique our solution focuses on is to merge information structures based on a general consensus found within all responses received from the P2P network. Determining the common consensus is achieved by applying evolutionary programming techniques [7], which allow patterns within information structures to be extracted. Evolutionary programming techniques implement principles found within biological evolution to evolve knowledge and ensure that the fittest structures progress through subsequent generations. This technique creates new information structures that contain only the common structures found within the peer’s current information structure and information structures found within all responses received. Our research highlights a key solution for collaboratively discovering and merging semantic information distributed in a mobile ad hoc P2P network. 
We show how mobile ad hoc systems can construct semantic knowledge structures and discover services through peer collaboration. We describe the Distributed Emergent Semantics (DistrES) protocol used and describe the functionality of the prototype system we developed.

Fergus, P., Mingkhwan, A., Merabti, M., Hanneghan, M., "Distributed Emergent Semantics in P2P Networks," (IKS'2003) Information and Knowledge Sharing, Scottsdale, Arizona, USA, 17th - 19th November, 2003, pp. 75-82.

2. Background and Related Work

The concept of “emerging knowledge” is currently gathering pace as a research field. For example, [3] focuses on sharing information within and across organisations. Their solution aims to overcome the limitations inherent in traditional centralised solutions, which experience bottleneck problems and high maintenance costs, by distributing information sources across a network of interconnected peers. An interesting feature that will be used in our research is the provision of a semantic layer enabling concept-based processing.

The approach used in [1] captures knowledge through gossiping. It aims to interconnect peers within a P2P network via user-defined schemas to share and incrementally evolve the search capabilities within the network. The approach assumes that a large amount of data exists and that the data has been organised and annotated according to local schemas, which is not always the case in distributed networks.

PROMPT [10] is an algorithm that provides a semi-automatic approach to ontology merging and alignment. It performs some tasks automatically and guides the user through other tasks by taking two simple ontologies as input and attempting to merge them into a single ontology. The algorithm merges the ontologies based on similarities between classes, slots and bindings between slots. This is an interesting solution; however, the merging process is based on the subjective opinions of the user merging the two ontologies.

The approach in [8] provides assistance with the task of merging knowledge bases (KBs) produced by multiple authors. This is achieved using Chimaera, a web-based ontology editor. The approach merges two or more ontologies based on identical terms and subsumption relationships between terms. It shares the shortcomings of PROMPT in that it assumes that experienced ontology engineers will carry out the merging process and that we have control over how knowledge is structured.

ONION [9] combines two separate ontologies to form an articulated ontology. Rather than merging, ONION aligns two ontologies by capturing the semantic gap between them. This is similar to Aberer’s approach in that the technique acts like a mapping between two different representations. The process of creating the semantic gap involves semantically relating classes and creating and managing semantic bridges. ONION uses a semi-automatic approach, which relieves the user from having to maintain the bridges; however, the approach assumes that a domain expert, knowledgeable about both structures, creates the semantic bridges.

3. System Requirements

Applying knowledge engineering methods to decentralised peer-to-peer networks proves more difficult than applying them to traditional centralised solutions. The primary reasons can be attributed directly to the following factors: reliability, device capability, platform and software heterogeneity, and non-standardised knowledge representations. This section describes the requirements paramount to overcoming these limitations, using the scenario described in Section 3.1.

3.1 Scenarios

The idea of gathering knowledge and implementing the ability to learn new things in a P2P network is modelled on how humans learn. In nature, a baby is born with innate knowledge structures, which enable it to suckle when placed on the mother’s breast. As the baby progresses through childhood and adolescence it learns from its environment and continually builds on the information structures it already has. Our research aims to capture this human ability to learn within a distributed computer environment.

Figure 1 illustrates that our peer begins with a limited amount of information, represented as KA0, and evolves its information structure by interacting with neighbouring peers within the environment over time. Three information structures are presented to KA, labelled KX, KY and KZ. At time 0, KA0 determines that KY is a knowledge structure that matches a query it has propagated within the peer network. The KY structure is identified as the most successful structure based on several responses received from the peer network. The success of this structure is determined by evolving all response knowledge structures received at time 0 and extracting the common patterns found within those responses to produce the KY structure. This new structure is merged with KA0 and becomes KA1. At time 1, KA1 propagates a query to the peer network.
This time KZ has information that matches the query, and again this structure is the result of the evolutionary process described above. In this case, KZ is identified as the best information structure based on the number of common patterns found in all the responses received at time 1. This new structure is merged with KA1 and becomes KA2.

This is the basic operation and motivation behind our research, and there are a number of points to note. It is possible that this process leads to isolated information structures within the peer’s knowledge base that are detached from the root node. Over time, however, these structures will form connections to other knowledge structures as the peer’s information evolves. This is illustrated at time 2 in Figure 1. When KZ is merged with the current information structure, a relationship is found between the information structure we had at time 0 and the information structure we merged at time 1. As a result, this technique is able to determine relationships between fragmented information structures and perform appropriate merges to connect them to the main structure.
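The way a detached fragment later becomes connected to the root can be sketched in a few lines of Python. This is a minimal illustration rather than the prototype's code: structures are modelled as sets of (parent, child) pairs, merging is approximated by plain set union, and the concept names `Thing`, `Takeaway` and `PizzaShop` as well as both helper functions are hypothetical.

```python
from collections import defaultdict


def reachable(edges, root):
    """Return the set of nodes reachable from root via parent -> child pairs."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def detached(edges, root):
    """Return nodes held in the knowledge base that are not connected to root."""
    all_nodes = {node for edge in edges for node in edge}
    return all_nodes - reachable(edges, root)


# Hypothetical structures; only the labels KA0, KY, KZ come from the scenario.
KA0 = {("Thing", "PlaceToEat")}
KY = {("Takeaway", "PizzaShop")}   # fragment with no link to the root yet
KZ = {("PlaceToEat", "Takeaway")}  # supplies the missing link

KA1 = KA0 | KY  # after the first merge the fragment is isolated
KA2 = KA1 | KZ  # after the second merge it joins the main structure
```

Here `detached(KA1, "Thing")` reports the isolated fragment, while `detached(KA2, "Thing")` is empty once the linking relationship has been merged.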

[Figure 1: knowledge structures KX, KY and KZ presented to the peer, whose own structure evolves from KA0 through KA1 to KA2 along a time axis T0, T1, T2, ..., TN.]

Fig. 1 Knowledge Learning Diagram

This scenario illustrates that a peer’s information structure will evolve as the peer moves through time and interacts with other peers within its environment.

3.2 DistrES Protocol Requirements

Conceptually merging information structures based on a common consensus requires an algorithm capable of conducting several processes. We believe that distributing, extracting and merging information within an ad hoc network environment must meet the following requirements.

− Knowledge Structure: Knowledge structures must be nodes sub-classed taxonomically from a root node. Fragmented structures may exist, but they must be merged into existing structures as the peer’s information evolves over time. The structure of information must be represented using XML and stored within a localised knowledge base on the peer. The peer must return information to requesting peers using standardised message formats.

− Targeted Knowledge Discovery: Peer nodes must have the ability to evolve existing information structures by propagating queries within the peer network about the subsections of their information structures they wish to extend, for example “Fast Food Takeaways”. They must have the ability to select the knowledge to extend and prompt the peer network to send any information structures it has regarding the propagated query.

− Extraction Engine: When a peer processes a query and determines that it has relevant information structures, it has to extract this information from the knowledge base and return it to the querying peer. The peer must extract information structures and semantically convert the representations to XML.

− Evolutionary Pattern Extraction: Within the P2P network a querying peer will receive several responses, and the structure of the information within these responses will differ. There is no centralised control, and no assumptions can be made about the level of expertise that creators of knowledge have. Information structures need to be evolved based on general consensus, which must be determined by evaluating the information structures in all responses received. This is achieved using evolutionary techniques [7], which extract structural patterns from several information structures. These information structures must be bred together for a number of generations until the most optimal information structure is produced.

− Merge Engine: When a peer receives an optimised response from the Evolutionary Pattern Extraction Engine, this information needs to be merged with the peer’s existing knowledge structures. The peer must extract the conceptual information from an XML representation and convert it into the representation used by the target knowledge base. This process is the reverse of the process executed by the Extraction Engine.
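The Knowledge Structure, Extraction Engine and Merge Engine requirements above can be sketched as a round trip between a taxonomic structure and XML. The sketch below is only an illustration under stated assumptions: the element and attribute names (`structure`, `isa`, `parent`, `child`), the function names and the sample concepts are hypothetical and are not the message format used by DistrES.

```python
import xml.etree.ElementTree as ET


def extract_subtree(kb_edges, concept):
    """Collect every (parent, child) pair that lies taxonomically under concept."""
    found, frontier = set(), [concept]
    while frontier:
        parent = frontier.pop()
        for edge in kb_edges:
            if edge[0] == parent and edge not in found:
                found.add(edge)
                frontier.append(edge[1])
    return found


def to_xml(concept, edges):
    """Serialise the extracted pairs as XML (hypothetical element names)."""
    root = ET.Element("structure", {"concept": concept})
    for parent, child in sorted(edges):
        ET.SubElement(root, "isa", {"parent": parent, "child": child})
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Reverse of to_xml: recover the concept name and its pairs."""
    root = ET.fromstring(text)
    edges = {(e.get("parent"), e.get("child")) for e in root.findall("isa")}
    return root.get("concept"), edges


# Hypothetical local knowledge base held as (parent, child) pairs.
KB = {
    ("Thing", "PlaceToEat"),
    ("PlaceToEat", "FastFoodTakeaway"),
    ("FastFoodTakeaway", "PizzaShop"),
    ("Thing", "Transport"),
}
response = to_xml("PlaceToEat", extract_subtree(KB, "PlaceToEat"))
```

Calling `from_xml(response)` on the receiving side recovers the queried concept and the pairs beneath it, which is the direction a merge step would work in.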

4. DistrES Protocol

This section describes how the DistrES protocol discovers semantic information from within the peer network and merges the results with existing knowledge structures. It describes the query structures sent between peers, the process of extracting information from knowledge structures, the process of evolutionary pattern extraction and the process of merging information structures within existing knowledge structures.

4.1 Protocol Design

The DistrES protocol extracts information from knowledge bases, evolves information structures to produce optimal solutions based on a general consensus, and merges new information structures with existing structures. This section describes the capabilities of all algorithms used.

4.1.1 Extraction Engine

Listening peers process queries propagated within the peer network by extracting the concept node name from the XML structure contained in a standardised message. This concept node name is used to query the peer’s knowledge base to see if the concept exists. If the concept exists, the process begins by extracting all the dependents of the node; for each dependent found, the peer extracts all the associated ‘isa’ and generalisation relationships.

4.1.2 Evolutionary Pattern Extraction Engine

Peers propagate queries containing the information structures they wish to evolve to neighbouring peers within the peer network. The peer may receive several responses which contain information structures that are subjective and based on the creator’s own point of view. This leads to structural and possibly lexical variation between the responses received. Our research aims to address this problem using the Evolutionary Pattern Extraction (EPE) engine. The EPE engine extracts structural patterns based on commonalities found within all responses and produces an optimal information structure.

[Figure 2 panels: Information Structures (C1, R1, R2, R3); Fitness Functions; Result after EPE (Travel Itinerary, Transport, Accommodation, Mobile Caravan).]

Fitness function 1 (node occurrence):

Node                Occurrence
Travel Itinerary    4
Mobile Caravan      2
Transport           4
Entertainment       1
Insurance           1
Car Rental          1
Location            1
Accommodation       3

Fitness function 2 (node relationship occurrence):

Node Relationship                  Occurrence
Travel Itinerary, Mobile Caravan   0
Travel Itinerary, Transport        4
Travel Itinerary, Accommodation    2
Mobile Caravan, Transport          2
Mobile Caravan, Accommodation      1
Transport, Accommodation           1

Fig. 2 Evolutionary Pattern Extraction Engine

Figure 2 illustrates the peer’s current structure (C1) and three responses (R1 to R3) representing the results the peer has received from the peer network. Although R1 to R3 differ structurally, there are commonalities that are apparent in them all. For example, the nodes “Travel Itinerary” and “Transport” have a direct relationship in C1 and in R1 to R3. The EPE engine determines that this is a common pattern by evolving the structures and calculating the fitness value of each structure as it progresses through each generation. In contrast, Figure 2 illustrates that the nodes “Entertainment”, “Insurance”, “Car Rental” and “Location” are low-scoring nodes because each appears in only one structure. The EPE engine classes these low-scoring nodes as uncommon, which lessens their chance of evolving to subsequent generations and greatly reduces the probability of their appearing within the optimal structure.

Initially the current structure (C1) and all responses (R1 to R3) are crossed over to produce 24 offspring. Crossover swaps structural information between two structures to produce a new structure. The EPE engine crosses over structures based on a random split; for example, the split may be 40% of the first structure and 60% of the second structure. This crossover procedure produces two new structures, more formally known as offspring. Using the two pre-defined fitness functions shown in Figure 2, each offspring is evaluated to determine its fitness value, and this value is used to decide whether the structure is fit enough to progress to the next generation.

The first fitness function places all the unique nodes found within all structures into an array. Each node is given a fitness value based on the number of times it appears within all structures, which we call term frequency. For example, the node “Travel Itinerary” is given a fitness value of four because it appears once in each of the four structures. The second fitness function places all the possible relationships that may exist between any two nodes into an array. The fitness value of each relationship is calculated based on the number of times the relationship appears within the structures, which we call relationship frequency. For example, the relationship between “Travel Itinerary” and “Transport” is given a fitness value of four because the relationship appears within each of the four structures.
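The two fitness functions described above, term frequency and relationship frequency, amount to counting in how many structures each node and each node pair occurs. The Python sketch below illustrates this under stated assumptions: C1 follows the paper's example, but the exact pairs in R1 to R3 are hypothetical, chosen only so that the counts agree with the occurrences listed for Figure 2, and the function names are illustrative.

```python
from collections import Counter

# Each structure is a set of (node, node) relationships.
# C1 follows the example in the text; R1-R3 are assumed for illustration.
C1 = {("Travel Itinerary", "Transport"), ("Travel Itinerary", "Accommodation")}
R1 = {("Travel Itinerary", "Transport"), ("Travel Itinerary", "Accommodation"),
      ("Transport", "Mobile Caravan"), ("Mobile Caravan", "Accommodation")}
R2 = {("Travel Itinerary", "Transport"), ("Transport", "Accommodation"),
      ("Travel Itinerary", "Entertainment"), ("Travel Itinerary", "Insurance"),
      ("Travel Itinerary", "Location")}
R3 = {("Travel Itinerary", "Transport"), ("Transport", "Mobile Caravan"),
      ("Transport", "Car Rental")}
STRUCTURES = [C1, R1, R2, R3]


def nodes_of(edges):
    """Return the unique nodes mentioned by a structure."""
    return {node for edge in edges for node in edge}


def term_frequency(structures):
    """Count, for each node, the number of structures it appears in."""
    counts = Counter()
    for edges in structures:
        counts.update(nodes_of(edges))
    return counts


def relationship_frequency(structures):
    """Count, for each unordered node pair, the structures it appears in."""
    counts = Counter()
    for edges in structures:
        counts.update({frozenset(edge) for edge in edges})
    return counts


node_scores = term_frequency(STRUCTURES)
pair_scores = relationship_frequency(STRUCTURES)
```

With these assumed inputs, `node_scores["Travel Itinerary"]` and `pair_scores[frozenset({"Travel Itinerary", "Transport"})]` both evaluate to four, matching the worked example in the text. Pairs are stored as unordered sets so that a relationship is counted regardless of the direction in which a peer recorded it.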

Once we have a list of ranked nodes and relationships, we can use these to produce a binary vector, with each position in the vector representing a gene (node or relationship). Once the layout of the binary vector is determined, we can convert every structure into a binary vector in which a value of one marks the presence of a node in the structure or the presence of a relationship between two nodes. The most optimal structure will be a vector that contains ones and zeros in the most optimal configuration. We can determine the optimal solution by specifying that the optimal binary vector must have the top n nodes and the top n relationships. For example, if we convert the nodes and the relationships contained in the C1 structure in Figure 2 into a binary vector, we obtain the following vector:

[1,0,1,0,0,0,0,1,0,1,1,0,0,0]

whereby the first eight bits (genes) represent the nodes and the last six bits (genes) represent the relationships. The position of the gene is important because it corresponds to a particular word. For example, a word version of the above vector may be as follows:

[Travel Itinerary, Mobile Caravan, Transport, Entertainment, Insurance, Car Rental, Location, Accommodation, Travel Itinerary|Mobile Caravan, Travel Itinerary|Transport, Travel Itinerary|Accommodation, Mobile Caravan|Transport, Mobile Caravan|Accommodation, Transport|Accommodation]

If a value of one is in the third position, the structure has a node called ‘Transport’; furthermore, if a value of one is in the fourteenth position, the structure has a relationship between the node ‘Transport’ and the node ‘Accommodation’. For illustration purposes the ‘Result’ structure in Figure 2 is generated by using the top four nodes within the array of nodes found and the top three relationships within the array of relationships found. This means that the most optimal structure would be represented by the following binary vector:

[1,1,1,0,0,0,0,1,0,1,1,1,0,0]

This structure specifies that the nodes ‘Travel Itinerary’, ‘Mobile Caravan’, ‘Transport’ and ‘Accommodation’, together with the relationships ‘Travel Itinerary|Transport’, ‘Travel Itinerary|Accommodation’ and ‘Mobile Caravan|Transport’, must appear in the optimal solution. Each structure is evaluated to determine how many genes match the value of the genes in the target structure. The structure’s fitness value is incremented by one for each matching gene found. For the target structure illustrated in the binary vector above, we can say that the EPE engine has converged on the most optimal structure when it has a fitness value of 7.

4.1.3 Merge Engine

The peer iterates through the structure produced by the EPE engine and attempts to merge the knowledge with existing knowledge structures. The Merge Engine first iterates through the binary vector and resolves the bit values by converting them into corresponding knowledge base structures. It then iterates through all the nodes found within the structure and determines whether each node already exists in the knowledge base; if a node does not exist, a new node representing it is created. Once this is complete, all the nodes that did not previously exist have been inserted into the knowledge base. The final stage merges all the ‘isa’ and ‘generalisation’ relationships associated with the current node being processed.

4.2 DistrES Algorithm

This section describes the systematic interactions between subsystems in the DistrES protocol. It shows how knowledge is discovered, evolved and merged.

The sequence diagram in Figure 3 illustrates the sequence of events performed to discover, evolve and merge semantic structures. Initially a peer queries its local knowledge base to see if it has the knowledge it requires. If the knowledge exists, the peer uses the information it already has. If the peer is unable to discover the information from its local knowledge base, or it has the information but needs to evolve the knowledge structure, it propagates the query within the P2P network and waits for responses. We pre-configured this waiting period to one minute; however, it can be changed to accommodate implementation requirements.

Peers in the network listen for requests and try to return information structures that relate to the queries received. When a peer receives a request for information, it extracts the query from the message and determines whether it holds information that matches the conceptual meaning of the query. The query is represented using XML and is created automatically after being submitted via the user interface. The peer uses the extracted query to determine whether its local knowledge base contains relevant information. If the peer has the required knowledge structures, it begins to extract the information and represent it taxonomically as XML. The peer begins by determining whether a ‘Node’ within the knowledge base structure matches the query received. If a node is found, the process continues by extracting all the dependents of that node. For example, if the query is “PlaceToEat”, the peer finds all the nodes that appear taxonomically under the root node “PlaceToEat”. The peer iterates through the extracted nodes and determines what relationships exist between them. After the extraction process is complete, the information is converted to XML and returned to the querying peer.

A peer may receive several responses from peers within the peer network; all responses are therefore processed by the EPE Engine, which computes the most optimal solution based on the general consensus found within the peer network. Once an optimal structure has been found, it is passed to the Merge Engine and the information is merged with the peer’s existing information structure.
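The evolve-and-merge steps at the end of this sequence can be sketched on the binary gene vectors introduced in Section 4.1.2. The Python below is a hedged illustration, not the prototype's Java implementation. It makes three assumptions: fitness counts only genes that are set to one in both the candidate and the target, which is the reading under which the target itself scores 7; crossover is a single-point split applied directly to the gene vector; and the merge is a set union on a toy knowledge base. All function names are hypothetical.

```python
import random

# Gene layout taken from the word version of the vector in Section 4.1.2.
NODES = ["Travel Itinerary", "Mobile Caravan", "Transport", "Entertainment",
         "Insurance", "Car Rental", "Location", "Accommodation"]
RELATIONS = [("Travel Itinerary", "Mobile Caravan"),
             ("Travel Itinerary", "Transport"),
             ("Travel Itinerary", "Accommodation"),
             ("Mobile Caravan", "Transport"),
             ("Mobile Caravan", "Accommodation"),
             ("Transport", "Accommodation")]


def encode(nodes, relations):
    """Convert a structure into a 14-gene binary vector."""
    pairs = {frozenset(r) for r in relations}
    node_genes = [int(n in nodes) for n in NODES]
    relation_genes = [int(frozenset(r) in pairs) for r in RELATIONS]
    return node_genes + relation_genes


def decode(vector):
    """Resolve gene values back into node and relationship sets."""
    nodes = {n for n, bit in zip(NODES, vector) if bit}
    relations = {r for r, bit in zip(RELATIONS, vector[len(NODES):]) if bit}
    return nodes, relations


def fitness(vector, target):
    """Assumed scoring: one point per gene set to one in both vectors."""
    return sum(1 for v, t in zip(vector, target) if v == 1 and t == 1)


def crossover(first, second, rng):
    """Single-point crossover at a random split, producing two offspring."""
    point = rng.randrange(1, len(first))
    return first[:point] + second[point:], second[:point] + first[point:]


def merge(kb_nodes, kb_relations, vector):
    """Add any nodes and relationships that the knowledge base lacks."""
    nodes, relations = decode(vector)
    return kb_nodes | nodes, kb_relations | relations


C1 = encode({"Travel Itinerary", "Transport", "Accommodation"},
            [("Travel Itinerary", "Transport"),
             ("Travel Itinerary", "Accommodation")])
TARGET = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0]

child_a, child_b = crossover(C1, TARGET, random.Random(0))
kb_nodes, kb_relations = merge({"Travel Itinerary"}, set(), TARGET)
```

Encoding C1 this way reproduces the vector quoted in Section 4.1.2, and merging the target vector into a knowledge base that holds only ‘Travel Itinerary’ adds the remaining three nodes and the three target relationships.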

[Figure 3: sequence diagram with lifelines :PeerService-A, :PeerService-B, :ExtractorEngine, :EPE Engine, :MergeEngine, :KnowledgeBase-B and :KnowledgeBase-A, and the message sequence 1. Query(), 2. Query(), 3. Execute(), 4. Response, 5. ConstructXML(), 6. Response, 7. Response, 8. EvolveStructure(), 9. Response, 10. MergeStructure(), 11. Execute(), 12. Response.]

Fig. 3 Evolutionary Pattern Extraction Engine

4.3 DistrES Prototype

In our previous work, we developed the Distributed Semantic Unstructured Services (DiSUS) framework [5], which implements our P2P network and uses the OpenCyc [11] knowledge base to store knowledge structures and make inferences on those knowledge structures. Within the DiSUS framework we use the JXTA Resolver Service [15] to propagate XML strings, encoded as FIPA-ACL [12], through the peer network. In this section we describe how the response XML strings returned to a querying peer have been changed to capture XML representations of conceptual information represented within the OpenCyc knowledge base [11] and DAML+OIL knowledge structures located on the peer machine. We have achieved this by creating the DistrES protocol, which constitutes a plug-in for the DiSUS framework and provides an extra level of functionality that extracts, evolves and merges information structures.

The DistrES prototype was developed using Java, based on the 1.4 JRE. Several Java objects were created to extract information from OpenCyc knowledge bases and DAML+OIL ontology structures located on the peer machine. We also developed tools to extract common patterns from several information structures and evolve them based on general consensus, as well as tools to merge information into localised knowledge bases.

The test environment used two notebooks and two PCs. To test the DistrES protocol we hand-crafted four information structures in two different formats: OpenCyc and DAML+OIL. The OpenCyc information structures were inserted into the OpenCyc knowledge bases and the DAML+OIL structures were stored locally. We standardised the messaging between peers in the peer network by wrapping XML structures inside a FIPA-ACL [12] content object. When a peer receives a FIPA-ACL response, the XML structures are extracted from the content object for use.

The XML format used by the OpenCyc servers is an initial representation we developed, which allows us to capture the main semantics associated with CycL [11]. The XML format used for the DAML+OIL knowledge base is represented as DAML+OIL. At the time of our experimentation the DAML+OIL import and export tools had not been developed enough to provide the level of functionality we required, and a decision was made to implement this solution using our own predefined knowledge representation and separate DAML+OIL knowledge structures. However, we envisage that we will base our end solution on DAML+OIL when the OpenCyc 1.0 version is finally released.

The Extraction engine is a simple Java object that makes full use of the OpenCyc API and in part provides wrapper methods around existing OpenCyc methods. The only additional functionality added to the prototype was the transformation mechanism used to convert CycL representations into XML. The Extraction engine also processes DAML+OIL structures and has built-in functionality to retrieve sections of DAML+OIL knowledge.

The EPE Engine is a Java object we developed, and it provides a substantial part of the coding for the DistrES protocol. Within this object we implement the genetic programming techniques [7] that allow us to extract common patterns from several information structures. This object was custom developed for the DistrES protocol.

The Merge engine is a Java object which again makes full use of the OpenCyc API and in part provides wrapper methods around existing OpenCyc methods. The only additional functionality added to the prototype was the transformation of the XML representations into CycL. The Merge engine also processes DAML+OIL structures and has built-in functionality to merge sections of DAML+OIL knowledge.

5. Results

Firstly, our initial results have proved that we can merge information structures within a distributed ad hoc network environment. Secondly, we have proved that we can evolve a peer’s knowledge by merging information structures based on common pattern extraction from several information structures using evolutionary techniques. One of the main issues we uncovered was the time it takes to evolve information structures and the fact that the algorithm could get stuck and repeatedly breed structures that do not appear to evolve any further. In this situation we believed there to be a better structure available; however, after evolving structures for several hours we could not create it. We have attributed this to the limited capabilities we programmatically applied to the EPE engine. Future work aims to look at more advanced operators that take into account the semantics of knowledge. Furthermore, we intend to implement other fitness functions to carry out more sophisticated structural analysis.

The DistrES protocol was tried and tested on simple information structures, and in most cases it could find the optimal solution. As structures become more complex, the processing requirements increase and it takes longer to produce the optimal solution. We tried this on different structures and proved that, although we were not able to extract exact optimisations in a short period of time, we were able to provide solutions that extract most of the common patterns found within all responses. We believe that with a more sophisticated algorithm, highly optimal solutions can be found for structures that are more complex. The DistrES protocol was designed to capture a common way of representing information structures based on general consensus. We believe we have achieved this and that our solution could aid the construction of information structures by providing back-end consensus processing within a larger application.

6. Conclusion and Future Work

Our Distributed Emergent Semantics (DistrES) protocol enables several information structures to be collected and evolved over a period of time in order to extract the common patterns found within those structures. The structure that proves to be the best solution is selected and merged with existing knowledge structures. The DistrES prototype plugs into the DiSUS [5] framework we developed in our previous work and consists of three main components: the extraction engine, the evolutionary pattern extractor engine and the merging engine. Using the DiSUS framework we were able to propagate queries within the peer network requesting information regarding a specified concept. Peers answering these queries successfully used the DistrES protocol to extract information from local knowledge bases and convert the information into XML. We further enhanced the peers within the DiSUS framework, allowing them to process several responses from the peer network and evolve these responses in order to extract information structures that conform to the general consensus evident in all responses. Once the optimal information structure was found, the peer successfully converted the XML structures and merged them with existing structures. Through the simple simulations we developed, we showed that we could evolve the knowledge structures of a peer and, furthermore, capture the general consensus on how these structures should be represented.

Fergus, P., Mingkhwan, A., Merabti, M., Hanneghan, M., "Distributed Emergent Semantics in P2P Networks," (IKS'2003) Information and Knowledge Sharing, Scottsdale, Arizona, USA, 17th - 19th November, 2003, pp. 75-82.

This research provides a novel solution to ontology evolution using genetic programming techniques, which to the best of our knowledge is an unexplored area of research. Our approach ensures that it is the general consensus (user interaction over time) found within the peer network that determines how semantic structures are evolved, and not experienced knowledge engineers. This technique does not rely on any form of centralised control and continually maintains knowledge structures automatically based on user interaction. This differs from a number of other approaches [10, 2], which suffer from several drawbacks. Such systems are inherently centralised and rely on explicit global agreements to determine how conceptual information should be represented. Obtaining explicit global agreements regarding the way information should be structured is difficult, if not impossible. Another fundamental problem is the overhead of maintaining large centralised knowledge base systems. Structural changes need to be implemented, conflicts need to be resolved and the overall size of the knowledge base needs to be controlled. This requires time and a great deal of effort.

In future work we will look at the co-existence of correct and incorrect information within the information structures we are processing. We do not assume that the information created by peers in the peer network will be correct or consistently represented in a pre-determined knowledge structure. We also expect that people will name things differently and spell things incorrectly. A further area of work is to understand and develop a means of capturing the representations within the distributed peer network; at present this is done manually in order to test the DistrES protocol.

References

[1] K. Aberer, Cudre-Mauroux, P., Hauswirth, M. The Chatty Web: Emergent Semantics Through Gossiping, Twelfth International World Wide Web Conference. 2003. Budapest, Hungary: Springer.
[2] M. Arumugam, Sheth, A., Arpinar, I. B. Towards Peer-to-Peer Semantic Web: A Distributed Environment for Sharing Semantic Knowledge on the Web, Proceedings of the Eleventh International Conference on World Wide Web Workshop. 2002. Honolulu, Hawaii, USA, p. 1 - 9.
[3] SWAP: Ontology-based Knowledge Management with Peer-to-Peer Technology. 2002, SWAP, Accessed: 05-03-2003. http://swap.semanticweb.org/public/publications/ehrig03swapb_wiamis-3.pdf.
[4] D. Fensel, Staab, S., Studer, R., van Harmelen, F., Peer-2-Peer enabled Semantic Web for Knowledge Management, in Towards the Semantic Web: Ontology-Driven Knowledge Management, J. Davies, Fensel, D., van Harmelen, F., Editor. 2002, John Wiley & Sons, Ltd: Chichester, West Sussex, PO19 8SQ, England. p. 310.
[5] P. Fergus, Mingkhwan, A., Merabti, M., Hanneghan, M. DiSUS: Mobile Ad Hoc Network Unstructured Services, presented at Personal Wireless Communications (PWC2003). 2003. Venice, Italy: Springer.
[6] J. Heflin, Hendler, J. Dynamic Ontologies on the Web, Seventeenth National Conference on Artificial Intelligence (AAAI-2000). 2000. Austin, Texas, USA: AAAI/MIT, p. 443 - 449.
[7] C. G. Langton, Artificial Life, in The Philosophy of Artificial Life, M. A. Boden, Editor. 1996, Oxford University Press Inc.: New York. p. 39 - 93.
[8] D. L. McGuinness, Fikes, R., Rice, J., Wilder, S. An Environment for Merging and Testing Large Ontologies, Proceedings of the Seventh International Conference on Principles of Knowledge Representation and Reasoning (KR2000). 2000. Breckenridge, Colorado, USA: Morgan Kaufmann Publishers, p. 483 - 493.
[9] P. Mitra, Wiederhold, G., Kersten, M. L. A Graph-Oriented Model for Articulation of Ontology Interdependencies, 7th International Conference on Extending Database Technology. 2000. Konstanz, Germany: Springer, p. 80 - 100.
[10] N. F. Noy, Musen, M. A. PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment, The Seventeenth National Conference on Artificial Intelligence (AAAI'00). 2000. Austin, Texas, USA: AAAI Press/The MIT Press, p. 450 - 455.
[11] OpenCyc Project. 2002, Cycorp, Inc., Austin, TX, USA, Accessed: 5-02-03, http://www.opencyc.org.
[12] M. Schalk, Liebig, T., Illmann, T., Kargl, F. Combining FIPA ACL With DAML+OIL - A Case Study, Proceedings of the Second International Workshop on Ontologies in Agent Systems. 2002. Bologna, Italy.
[13] R. Siebes, van Harmelen, F. Ranking Agent Statements for Building Evolving Ontologies, Workshop on Meaning Negotiation, in conjunction with the Eighteenth National Conference on Artificial Intelligence. 2002. Edmonton, Alberta, Canada.
[14] L. M. Stephens, Gangam, A. K., Huhns, M. N., Constructing Consensus Ontologies for the Semantic Web. Under consideration for publication in Knowledge and Information Systems, 2002.
[15] B. J. Wilson, JXTA. First ed. 2002, 201 West 103rd Street, Indianapolis, Indiana 46290: New Riders Publishing. Pages: 350.
