A Platform for Peer-to-Peer Communications and its Relation to Semantic Web Applications

Norihiro Ishikawa1, Takeshi Kato1, Hiromitsu Sumino1, Johan Hjelm2, Ye Yu3, and Zhongwu Zhu3

1 NTT DoCoMo Inc, 3-5 Hikarino-oka, Yokosuka, Kanagawa, Japan {t_kato, ishikawa, sumino}@mml.yrp.nttdocomo.co.jp http://www.nttdocomo.co.jp/English/
2 Ericsson Research, Torshamnsgatan 23, Kista SE-16480, Stockholm, Sweden [email protected]
3 Ericsson Research Japan, 3-5 Hikarino-oka, Yokosuka, Kanagawa, Japan {Ye.Yu, Zhong-Wu.Zhu}@nrj.ericsson.se

Abstract. Peer-to-peer services have entered the public limelight over the last few years. Several research projects on peer-to-peer technologies are underway, but no definitive conclusions are available yet. Compared to traditional Internet technologies, peer-to-peer has the potential to realize highly scalable, extensible, and efficient distributed applications. This is because of its basic functions: resource discovery, resource sharing, and load balancing in a highly distributed manner. It is easy to predict the emergence of an environment in which many sensors, people, and many different kinds of objects exist, move, and communicate with one another; this is called the "ubiquitous communication environment". Peer-to-peer is one of the most important and suitable technologies for ubiquitous networking since it supports discovery mechanisms, simple one-to-one communication between devices, free and extensible distribution of resources, and the distributed search needed to handle enormous quantities of resources. Peer-to-peer communication raises issues such as efficient resource discovery mechanisms and interoperation. The Semantic Web technology is one key to resolving these issues because it can adequately describe resource information. In particular, RDF (Resource Description Framework) is attracting attention for its flexibility. One of the purposes of this study is to explore a ubiquitous peer-to-peer network architecture that will allow various devices to communicate with one another across various networks. The other is to explore the applicability of the Semantic Web technology to peer-to-peer systems. We have been designing an architecture and protocols for realizing peer-to-peer networking among various devices. We are currently designing APIs for peer-to-peer applications and implementing a prototype peer-to-peer networking infrastructure for the ubiquitous communication environment. We are also considering semantic peer-to-peer applications and developing prototypes over our ubiquitous peer-to-peer network.

1 Introduction

From its beginning the Internet has used conventional cooperative technologies such as the server-client approach to handle network resources and provide Internet services. This offers the advantage that Internet services can be regulated by just a few central servers, but some major concerns have been raised, such as server overload and cost, and the demand for mutual and direct communication between many clients. As a result, peer-to-peer technology has become popular and has been used in networks that handle vast amounts of data daily. It can balance the load over a large number of nodes and so is being used for applications such as distributed search [1], file sharing [2] [3], distributed storage [4] and groupware [5]. Additionally, a generalized platform for peer-to-peer applications has been proposed [6] and developed.

At the same time, various devices have recently been extended by the addition of communication abilities. In the near future, an environment where many sensors, persons and different kinds of objects exist, move, and communicate with one another, called the "ubiquitous communication environment", will appear. Peer-to-peer communication is one of the most important and suitable networking technologies for ubiquitous networking since it easily supports one-to-one communication between devices, the free and extensible distribution of resources, and the distributed search needed given the enormous amount of resources available on the Internet.

Peer-to-peer communication also raises issues such as efficient resource discovery mechanisms and interoperation. These issues can be resolved with common semantics, a useful query language, an efficient search mechanism and so on. The Semantic Web technology provides “RDF” [7], common semantics for adequately describing resources such as information, services, and content. Several RDF query languages are being developed. The Semantic Web technology is therefore seen as a key to realizing peer-to-peer resource discovery and service combination in the ubiquitous communication environment.

One of the principal goals of our work is to design a ubiquitous peer-to-peer architecture and a general peer-to-peer platform that enhances the communication capabilities of various devices and various networks by utilizing network resources efficiently and supporting mobility in an integrated and practical way. The other is to explore the possibility of applying the Semantic Web technology to peer-to-peer systems. We are considering peer-to-peer applications that use the Semantic Web technologies and developing prototypes for our ubiquitous peer-to-peer network.

The rest of this paper is organized as follows. Section 2 overviews our peer-to-peer architecture. Section 3 describes the design of the proposed protocol. We report the current status of our prototype in Section 4. We briefly discuss related work in Section 5. In Section 6, we report the prototype semantic peer-to-peer applications and give some consideration to semantic peer-to-peer search. Finally, a conclusion is given in Section 7.

2 Peer-to-peer Architecture Overview

The proposed peer-to-peer architecture is shown in Figure 1. This paper discusses peer-to-peer communication between peer-to-peer nodes. A peer-to-peer node belongs to one or more communities defined as follows.

Community: The term “community” means a logical collection of peer-to-peer nodes that have a common interest and obey a common set of policies. A community is identified by a community ID.

For clarity, the following description considers only one peer-to-peer community. This architecture consists of the following basic components:

Peer-to-peer node: The peer-to-peer node is an independent, bidirectional communication entity in the peer-to-peer network. In our architecture, it can be a mobile device, a PDA, a personal computer, a server, a workstation, or any of a variety of devices.

Fig. 1. The mobile peer-to-peer architecture (a peer-to-peer community in which a hybrid peer-to-peer network, with a control node and peer-to-peer nodes, is linked through gateway nodes to pure peer-to-peer networks of peer-to-peer nodes)

Beyond these basic components, the hybrid peer-to-peer network uses a control node (central point) to manage the pure peer-to-peer network.

Pure peer-to-peer architecture: There are only peer-to-peer nodes in the pure peer-to-peer architecture; see Figure 2 (a). The connection between peer-to-peer nodes is established on mutual trust. Each peer-to-peer node is an independent entity and can enter or depart the peer-to-peer network at its convenience. Messages are sent from one peer-to-peer node to another directly or via some intermediary peer-to-peer nodes. Routing information is discovered by broadcasting an inquiry message to the network.

Hybrid peer-to-peer architecture: The hybrid peer-to-peer architecture addresses the weaknesses of the pure peer-to-peer architecture, such as inefficient routing, splits in the network, and insufficient security, by introducing a control node. The proposed hybrid peer-to-peer architecture is shown in Figure 2 (b). In our architecture, the control node provides the functions needed for discovering routing information to the destinations, identifying the adjoining peer-to-peer node, recovering from splits in the peer-to-peer network, and improving the network topology and security.

Fig. 2. Pure peer-to-peer and hybrid peer-to-peer: (a) pure peer-to-peer architecture, consisting of peer-to-peer nodes only; (b) hybrid peer-to-peer architecture, consisting of peer-to-peer nodes and a control node

To realize the hybrid peer-to-peer architecture, the control node and the gateway node are introduced and defined below.

Control node: The control node is an administration entity that manages the peer-to-peer community forming the peer-to-peer network. It provides several functions independent of any particular application such as name resolution, routing information provision, identifying the adjoining peer-to-peer node, network topology optimization, node authentication, and multicast group management.

Gateway node: The gateway node is a connection entity linking a pure peer-to-peer network to a hybrid network. It provides several proxy functions for nodes in the pure peer-to-peer network such as routing information provision, node authentication, and multicast group management.

The control node receives a request from a peer-to-peer node and provides it with routing information, topology optimization or security functions. The gateway node collects topology information on the pure peer-to-peer network and reports it to the control node. A peer-to-peer node in a hybrid peer-to-peer network reports its existence and its adjacent nodes to the control node; the nodes can then communicate efficiently with each other by using the routing information provided by the control node. A gateway node supports seamless communication between peer-to-peer nodes in a pure peer-to-peer network and those in a hybrid peer-to-peer network.
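The interaction between reporting nodes and the control node can be illustrated with a minimal sketch. It assumes that the control node keeps the reported adjacency as an undirected graph and answers route requests with a shortest hop-count path found by breadth-first search; the class name `ControlNode`, its methods, and the node IDs are hypothetical and are not part of the prototype API.

```python
from collections import deque


class ControlNode:
    """Hypothetical control node: stores reported adjacency, answers route requests."""

    def __init__(self):
        self.adjacency = {}  # node ID -> set of adjacent node IDs

    def report(self, node_id, adjacent_ids):
        """A peer-to-peer node reports its existence and its adjacent nodes."""
        self.adjacency.setdefault(node_id, set()).update(adjacent_ids)
        for other in adjacent_ids:
            # Assumption: links are bidirectional.
            self.adjacency.setdefault(other, set()).add(node_id)

    def route(self, source, destination):
        """Return a shortest hop-count route, or None if the network is split."""
        if source not in self.adjacency or destination not in self.adjacency:
            return None
        previous = {source: None}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            if current == destination:
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for neighbor in sorted(self.adjacency[current]):
                if neighbor not in previous:
                    previous[neighbor] = current
                    queue.append(neighbor)
        return None


control = ControlNode()
control.report("nd1", ["nd2"])
control.report("nd2", ["nd3"])
control.report("nd4", [])  # nd4 is isolated: a split in the network
route_found = control.route("nd1", "nd3")
route_split = control.route("nd1", "nd4")
```

Because the control node sees the whole reported topology, a missing route (`route_split`) directly signals a split that it could then try to repair, which is the recovery role described above.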

3 Protocol Design

3.1 Protocol Overview

As shown in Figure 3, the proposed protocols are composed of the P2P Core Protocol, which is independent of any particular transport mechanism (currently implemented over TCP and Bluetooth), and six upper layer protocols on top of it. The P2P Core Protocol takes fundamental roles such as sending, receiving or forwarding peer-to-peer messages in the manner of the peer-to-peer communication model, whereas the six upper layer protocols provide additional services such as multicasting or error reporting. With these existing protocols, a user can easily implement new applications that require peer-to-peer features.

Fig. 3. Protocol stack (upper layer: P2P Basic Communication Protocol, P2P Basic Service Protocol, P2P Multicast Communication Protocol, P2P Multicast Service Protocol, P2P Control Message Protocol and P2P Application Protocols; below them the P2P Core Protocol; at the bottom TCP over an IP network, or a non-IP transport (Bluetooth))

Details of the protocol components are given in the next section.

3.2 P2P Protocols

The P2P Core Protocol has been designed to process peer-to-peer messages according to the peer-to-peer communication model. We defined three message types to realize reactive and proactive communication modes. Request and response messages are defined for the reactive communication mode while an advertise message is defined for the proactive communication mode. Additionally, we defined three communication types: unicast, multicast and broadcast. A unicast message is sent to the destination node either directly or using multi-hop unicast or multi-destination unicast. When a node receives a multicast message from a multicast member node, it forwards the received message to the remaining adjacent member nodes using multi-hop unicast. A node sends a broadcast message to all adjacent nodes, and the forwarding of a broadcast message is controlled by its hop count. Since the naming and the message routing mechanisms of the P2P Core Protocol are defined to be independent of transport protocols, the P2P Core Protocol can be implemented over any transport protocol. Currently, we are designing it over TCP and Bluetooth.

Table 1 shows the protocols defined over the P2P Core Protocol. Details of those protocols are given below. The P2P Basic Communication Protocol is defined to realize the establishment and the release of a peer-to-peer session. The P2P Multicast Communication Protocol is defined to construct a multicast distribution tree among multicast member nodes and to forward multicast messages over it. In the hybrid peer-to-peer architecture, the P2P Basic Service Protocol is used between a peer-to-peer node and a control node to efficiently enhance communications between peer-to-peer nodes. Additionally, the P2P Multicast Service Protocol is used between a peer-to-peer node and a control node to efficiently enhance multicast communication between peer-to-peer nodes. The P2P Control Message Protocol is defined to provide ancillary functions such as notification of a message forwarding error, keep-alive for peer-to-peer sessions, and peer-to-peer node discovery.
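How a hop count bounds the forwarding of a broadcast message can be sketched as follows. This is an illustration under assumptions rather than the protocol's actual procedure: the function name `broadcast`, the adjacency-list representation, and the rule that a node ignores a message it has already received are all hypothetical.

```python
def broadcast(adjacency, origin, hop_count):
    """Flood a message from origin; return {node: hop at which it first arrived}.

    Forwarding stops once the message has travelled hop_count hops.
    Assumption: each node drops duplicates of a message it has already seen.
    """
    received = {origin: 0}
    frontier = [origin]
    hops = 0
    while frontier and hops < hop_count:
        hops += 1
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency[node]:
                if neighbor not in received:
                    received[neighbor] = hops
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return received


# A small example topology: A-B, A-C, B-D, D-E.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}
reach_two = broadcast(adjacency, "A", 2)
reach_three = broadcast(adjacency, "A", 3)
```

With a hop count of 2 the message from A stops at D, so E is never reached; raising the hop count to 3 extends the search radius at the cost of more forwarded messages.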

Table 1. Methods and Messages

Protocol                              Method                                 Message
P2P Basic Communication Protocol      Hello method                           Hello, HelloResponse
                                      Bye method                             Bye
                                      Resource Information Exchange method   ResourceInformationAdvertise, ResourceInformationRequest, ResourceInformationResponse
P2P Basic Service Protocol            Service Provide method                 ServiceAdvertise, ServiceRequest, ServiceResponse
P2P Multicast Communication Protocol  Join method                            Join, JoinResponse
                                      Leave method                           Leave
P2P Multicast Service Protocol        Multicast Service Provide method       MulticastServiceAdvertise, MulticastServiceRequest, MulticastServiceResponse
P2P Control Message Protocol          Error Report method                    ErrorReport
                                      Diagnose method                        Diagnose, DiagnoseResponse
                                      Lookfor method                         Lookfor, LookforResponse

We have designed these protocols using XML. Since XML supports the design of general tree-structured data, it is quite appropriate for designing the complicated protocol messages required by peer-to-peer applications. The layered P2P protocols are defined independently of one another using XML Namespaces, which makes XML well suited to designing application protocols. As shown in Fig. 4, the Core element identifies a peer-to-peer message and an upper layer protocol message is encapsulated in the MsgBody element. Fig. 5 shows an example of a peer-to-peer protocol message. This message is the Hello message of the P2P Basic Communication Protocol.

Fig. 4. Peer-to-peer message structure (elements for the parameters of the P2P Core Protocol, enclosing elements for the parameters of one upper layer protocol: P2P Basic Communication Protocol, P2P Basic Service Protocol, P2P Multicast Communication Protocol, P2P Multicast Service Protocol, P2P Control Message Protocol or P2P Application Protocols)

Fig. 5. Example of peer-to-peer message (element values only: urn:ED:Community:DoCoMo, Request, 123456-200201241600-nd1, nd2, nd1, Unicast, MulticastCommunication, Pure)
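The encapsulation described above, in which a Core element carries an upper layer message inside a MsgBody element, can be sketched with Python's standard `xml.etree.ElementTree` module. Only the element names `Core` and `MsgBody` come from the text; every child element name below (`Community`, `MessageType`, `MessageID`, `Destination`, `Source`, `CommunicationType`, `Hello`) is a hypothetical stand-in, the reading of nd1 as source and nd2 as destination is an assumption, and XML namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_hello(community, message_id, source, destination):
    """Build an illustrative Hello request wrapped in a Core element."""
    core = ET.Element("Core")
    # Hypothetical parameter elements of the P2P Core Protocol.
    ET.SubElement(core, "Community").text = community
    ET.SubElement(core, "MessageType").text = "Request"
    ET.SubElement(core, "MessageID").text = message_id
    ET.SubElement(core, "Destination").text = destination
    ET.SubElement(core, "Source").text = source
    ET.SubElement(core, "CommunicationType").text = "Unicast"
    # The upper layer protocol message is encapsulated in MsgBody.
    body = ET.SubElement(core, "MsgBody")
    ET.SubElement(body, "Hello")
    return core


hello = build_hello(
    "urn:ED:Community:DoCoMo", "123456-200201241600-nd1", "nd1", "nd2"
)
hello_xml = ET.tostring(hello, encoding="unicode")
parsed = ET.fromstring(hello_xml)  # round trip through a generic XML parser
```

Because the upper layer message sits in its own subtree, a node that only forwards messages needs to read the Core parameters and can pass the MsgBody content through untouched.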

3.3 Protocol Sequences

Fig. 6 shows a message exchange sequence when a peer-to-peer node participates in a peer-to-peer network and searches for particular content. Node A broadcasts a Lookfor message and receives the corresponding LookforResponse messages. Node A then sends a Hello message to one of the nodes from which a LookforResponse message was received. If Node A sends the Hello message to Node B, Node A establishes a peer-to-peer session with Node B and the two nodes exchange their own resource information. Node A then broadcasts a Query message to search for the desired metadata. If Node B has the desired metadata, it sends a QueryResponse message to Node A directly or indirectly. Finally, Node A gets the desired metadata. More details of the distributed metadata search application are given in Section 6.1.

Fig. 6. Example of message exchange sequences (participants: Node A, Node B, Node C, ..., Node X; phases: looking for a peer node with Lookfor and LookforResponse, connecting to the first peer node with Hello and HelloResponse, exchanging resource information with ResourceInformationAdvertise, and distributed semantic search with Query and QueryResponse, including a further Hello and HelloResponse connection establishment before a QueryResponse is returned)

4 Prototype Implementation

We are currently prototyping the peer-to-peer node, the gateway node and the control node. All nodes are implemented in Java (J2SE 1.3.1) on Microsoft Windows 2000 and Red Hat Linux 7.2. Protocol APIs (Application Programming Interfaces) have been provided for application developers. This prototype provides the basic functions of the mobile peer-to-peer protocols and some utility tools (Fig. 7). All of the APIs are designed as Java APIs. Fig. 8 shows the API design within a peer-to-peer node. We prepared three levels of API. Application developers can easily design a new protocol for a particular application using the P2P Core API, and implement functions such as multicast communication using the P2P Communication Service API and the P2P Application Service API.

Fig. 7. Utility tool for mobile peer-to-peer platform

Fig. 8. Software architecture (from top to bottom: application environments for instant messaging, metadata search and other applications; P2P Application Service API; P2P Application Services, comprising general functions (participation in the peer-to-peer network, peer discovery and connection, multicast group setting), mobile proxy functions (user profile management, transcoding, protocol conversion), instant messaging functions, metadata search functions and other functions; P2P Communication Service API; P2P Communication Services (P2P Basic Communication Protocol, P2P Basic Service Protocol, P2P Multicast Communication Protocol, P2P Multicast Service Protocol, P2P Control Message Protocol) alongside P2P Application Protocols (Instant Messaging Protocol, Metadata Search Protocol, other protocols); P2P Core API; P2P Core Protocol)

5 Related Work

JXTA [6] is probably the most widely deployed general-purpose peer-to-peer platform in peer-to-peer communities, and we have design goals similar to those of JXTA. Both aim at a general peer-to-peer platform for peer-to-peer user applications that provides the set of functionalities the applications require and is independent of particular user applications, programming languages, operating systems or network devices (PC, PDA, etc.).

Compared with the JXTA platform, we introduce a control node component in our architecture to realize a hybrid peer-to-peer network. The main role of the control node is to administer the peer-to-peer network topology in order to efficiently detect topological changes and recover from network splits. In both systems, each peer-to-peer node has a cache of routes to the destination nodes. However, when the network topology changes or a node failure occurs along the route, JXTA cannot recover until the broadcast-based route query procedure completes. In our system, since the control node holds up-to-date topology information, a peer-to-peer node can be informed either by receiving a network improvement notification from the control node before it sends a message to the destination peer, or by obtaining new route information from the control node. This approach consumes fewer network resources than the broadcast-based route query approach.

The second difference is peer-to-peer multicasting. Our system provides peer-to-peer level multicasting that enables a peer to send a message to all peers belonging to a multicast group in an efficient manner rather than by broadcasting. JXTA proposes the concept of a "Group" for group membership and a "Pipe" as a communication model inside a group. However, it does not provide multicasting.

The third difference is the treatment of security in the two approaches. JXTA provides an access control framework based on the concept of a "Group". Under this concept, any service belongs to a specific group. Peer-to-peer nodes that want to access the service must be given the membership right of the group and follow the necessary authentication procedures. The benefit of this idea is that the peer-to-peer platform provides an access control framework to service developers. However, this approach seems complicated because a peer-to-peer node has to belong to multiple groups if it wants to access multiple services provided by different groups, and each group has to prepare its own access control mechanism for this purpose. In our approach, we propose a simple authentication mechanism for peer-to-peer nodes. When a peer-to-peer node participates in the hybrid peer-to-peer network, it must be authenticated and authorized by the control node. When a peer-to-peer node starts communication with another peer-to-peer node, both nodes should be mutually authenticated. At the moment we do not provide security functions to individual peer-to-peer applications, in order to keep our approach simple.

6 Consideration of Semantic Web Applications on P2P Platform

Peer-to-peer technology in association with the Semantic Web might bring appreciable benefits. In this framework, we have researched the topics listed below.

Management of Distributed Metadata: As the number of mobile phone users who use multimedia content is increasing rapidly, the demand for mobile multimedia content search is growing. It is essential to provide an effective search mechanism to meet users’ needs. Using metadata is one way of improving the efficiency of multimedia content search, and it has the potential to make the search more intelligent and precise. This is known as one of the benefits of the Semantic Web technology. We use RDF as the syntax of metadata. Additionally, we applied peer-to-peer to a metadata search system to handle the increase in metadata content.

Section 6.1 describes a multimedia content search prototype system that uses metadata for multimedia content search. The system is implemented over a peer-to-peer network as a distributed search system which consists of distributed search nodes.

Query Language and database: Several RDF query languages have been proposed (RQL [8], SquishQL, RDFPath, etc.). However, most of them suffer from deficiencies such as non-XML syntax and limited functionality for operating on distributed metadata. We defined a new RDF query language based on XML syntax with enhanced manipulations for better handling of RDF metadata. Its functionalities, for example updating and deleting metadata, changing schemas and inquiring about schemas, are important for implementing distributed metadata systems. Additionally, we are developing a lightweight database system to support this language. More details are given in Section 6.2.

Query Routing: Efficiency in searching and data retrieval is a key issue in peer-to-peer networks. We have analyzed semantic routing and proposed a new approach based on Semantic Web technologies and on RDF to achieve better retrieval performance. The main idea behind semantic routing is to use the content of queries to drive routing decisions. More details are given in Section 6.3.

6.1 Mobile Multimedia Content Search using RDF [9]

In this section, we describe a system architecture and a prototype implementation that realize distributed multimedia content search for mobile phones. Our prototype system is based on RDF to make multimedia content searches more intelligent and precise. Furthermore, it is developed over the peer-to-peer network. We adopt peer-to-peer technology for search because peer-to-peer has better scalability and robustness than a traditional server-client system.

Fig. 9. Metadata Definition for Mobile Multimedia Content (a general Content class with Music, Markup, Movie, Picture and Java content classes linked to it by subClassOf; the values of the “creator”, “device” and “rights” properties are defined as resources of the Person, Device and Rights classes)

We have designed a set of metadata using RDF schema. Though this metadata set is intended to be used for the NTT DoCoMo i-mode service [10], it can be applied generally to mobile multimedia content as well [11]. As shown in Fig. 9, our definition is derived from the “Dublin Core” metadata definition [12]. While the “Dublin Core” metadata was originally defined for document content, it is also applicable to any kind of content. We have defined a few new elements (e.g. … and …) to extend the 16 basic elements of “Dublin Core”. In our metadata definition, 5 classes of multimedia content (e.g. music content class and Java content class) are derived from a general content class by using a “subClassOf” property. The property value of a resource class is either a literal (e.g. instance) or a resource class. For example, the property value of a “creator” property belonging to a “Content” class is a “Person” class.

Fig. 10. An example of metadata (element values only: Othello game, puzzle, easy puzzle game; Thu,01Feb2001 05:15:52; 7556; CLDC-1.0; 100Yen; DoCoMo; Hikarinooka Yokosuka; [email protected]; x503)
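The class derivation and typed property values described for Fig. 9 can be sketched with two small lookup tables. The class identifiers below (`MusicContent`, `JavaContent` and so on) are hypothetical spellings of the classes named in the figure, and the inheritance of property ranges by subclasses is an assumption made for the illustration; an actual system would read these relations from the RDF schema.

```python
# subClassOf relations: each content class is derived from the general Content class.
SUBCLASS_OF = {
    "MusicContent": "Content",
    "MarkupContent": "Content",
    "MovieContent": "Content",
    "PictureContent": "Content",
    "JavaContent": "Content",
}

# Properties of Content whose values are resources of another class.
PROPERTY_RANGE = {
    ("Content", "creator"): "Person",
    ("Content", "device"): "Device",
    ("Content", "rights"): "Rights",
}


def is_subclass(cls, ancestor):
    """Follow subClassOf links upward from cls looking for ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def property_range(cls, prop):
    """Return the class of the property value, searching cls and its superclasses."""
    while cls is not None:
        if (cls, prop) in PROPERTY_RANGE:
            return PROPERTY_RANGE[(cls, prop)]
        cls = SUBCLASS_OF.get(cls)
    return None


java_is_content = is_subclass("JavaContent", "Content")
creator_class = property_range("JavaContent", "creator")
```

Under these assumptions a query for any Content can also match Java content, and the value of its "creator" property is known to be a Person resource rather than a plain string.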

In Fig. 10, we give an example of the metadata of Java content. It is defined according to our metadata definition shown in Fig. 9.

In current systems, a Web content search service such as Google is logically provided by a single Web site on the Internet. However, distributed architectures such as the peer-to-peer network are very attractive for content search since they provide enhanced scalability. By adopting the peer-to-peer architecture, the rapid update of search indexes becomes possible in a distributed manner. Our search system adopts a distributed architecture based on a peer-to-peer network. The architecture of our search system is shown in Fig. 11. In a distributed system, the metadata that corresponds to the indexes of Web content is distributed across Web servers. In our system, the metadata of content is distributed among peer-to-peer nodes, but the actual content need not be stored together with it. The metadata of content is uploaded to the database of a search engine on a peer-to-peer node. The search engine searches its database, and the location of each hit, identified by a URI, is indicated in the search result.

In some existing mobile systems such as i-mode, mobile terminals use HTTP to access the Internet and so cannot participate in a peer-to-peer network directly. To solve this problem, our prototype provides a proxy node for mobile phones, as shown in Fig. 11. The proxy node provides an HTML/HTTP interface for mobile phones and sends a web page for multimedia content search when a mobile phone user accesses it. At the same time, the proxy node communicates with other peer-to-peer nodes using the peer-to-peer protocol, so it also acts as an ordinary peer-to-peer node.

Fig. 11. Distributed metadata search application: (a) system architecture (mobile phone (i-mode), mobile network, mobile proxy, peer-to-peer network with metadata files, and content server with i-appli files (contents); flows labeled Search, Query and Download); (b) transition of display

In Fig. 12, we show the sequence of mobile content search. When a mobile phone user requests a mobile content search, the proxy node extracts the search information, translates it into a query message, and sends it to the connected peer-to-peer nodes. Upon receiving a query message, the search engine on a peer-to-peer node accesses its database to find the metadata that satisfies the search condition. At the same time, the peer-to-peer node relays the query message to neighboring peer-to-peer nodes.

Fig. 12. Search Sequences (participants: Node A, Node B, Node C, ..., Node X; after connection establishment, Query messages are relayed from node to node and QueryResponse messages are returned, with a Hello and HelloResponse connection establishment preceding one of the QueryResponse messages)

6.2 An XML-based RDF Query Language On the distributed metadata-based search system, metadata will be distributed among peer-to-peer nodes and maintained at each peer-to-peer node. To distribute metadata, the RDF query language acts as the critical interface between peer-to-peer nodes; there is no practical query language for RDF. We need to standardize a query language and protocol to access RDF metadata, which would promote the Semantic Web. The interface for the metadata database is one of the important points in designing the distributed metadata-based search system. In peer-to-peer environment, it is especially important to define the format of metadata query/response for communication between peer-to-peer nodes. Furthermore, a query for the metadata search should be described in detail in order to get precise results. However, there are some deficiencies with existing RDF query languages as follows. XML Syntax: Since existing RDF query languages use SQL-like syntax to describe an RDF query, a special parser other than an XML parser is required. The capability of SQL-like syntax cannot fully express complicated RDF queries, compared to XML-based syntax. RDF Metadata Operations: The existing RDF query languages only support RDF metadata search operations. In order to satisfy different needs, some other functions such as creation, modification and deletion of RDF metadata and schema are required. Response format: The existing RDF query languages have no ability to specify the format of result response. The ability for specifying the format of a response is an important issue for interoperability of peer-to-peer system. Based on the above issues, we have been designing an XML-based RDF query language called xRQL. xRQL solves these issues by defining a RDF query language based on XML syntax with enhanced manipulations for RDF metadata. A basic xRQL query consists of an operation element, a location element, an object element, a

condition element, a result element and a namespace element. The operation element is used to declare an xRQL operation. xRQL operations can be classified into creation, search, modification and deletion of RDF metadata. For a search operation, a result element is used to define the result format of a RDF query, which is based on XML syntax, so that users can get a RDF query result in their favorite format without additional processing. In addition to operations for RDF metadata, xRQL also defines RDF schema operations to retrieve the domain of a property or the sub-class of a resource class, and so on. Fig. 13 shows examples of a query request and a query result respectively. select Javacontent:a{title}STRING:b, Javacontent{description}STRING:c, Javacontent{date}STRING:d, Javacontent{device}Device:e, Javacontent{device}Device{devicetype}STRING:f, Javacontent{creator}Person:g, Javacontent{creator}Person{name}STRING:h b=”othello” h=”DoCoMo” a typeof(a) b c d g h e f ns=www.mml.yrp.nttdocomo.co.jp/Java

(a) Query

http://.../othello.jam
http://.../schema#JAVAContent
Othello
puzzle game
Thu, 01 Feb 2001 05:15:52
http://.../Creator/s/001
DoCoMo
http://.../Device/503x
x503

(b) Result

Fig. 13. An example of query and result
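The six-part query structure described above can be illustrated with a short sketch. The element names used here (`xrql`, `operation`, `location`, `object`, `path`, `condition`, `equals`, `result`, `namespace`) and the `location` value are assumptions made for illustration only; the concrete xRQL tag names are not shown in this text. The sketch only demonstrates the stated design goal that a query built from these elements can be produced and read back with an ordinary XML parser.

```python
import xml.etree.ElementTree as ET


def build_query(operation, location, objects, conditions, results, namespace):
    """Assemble a query with the six parts named in the text.

    All tag names below are hypothetical, not the actual xRQL vocabulary.
    """
    root = ET.Element("xrql")
    ET.SubElement(root, "operation").text = operation
    ET.SubElement(root, "location").text = location
    obj = ET.SubElement(root, "object")
    for path in objects:
        ET.SubElement(obj, "path").text = path
    cond = ET.SubElement(root, "condition")
    for var, value in conditions.items():
        ET.SubElement(cond, "equals", {"var": var}).text = value
    ET.SubElement(root, "result").text = " ".join(results)
    ET.SubElement(root, "namespace").text = namespace
    return root


query = build_query(
    operation="select",
    location="local",  # assumed value; the example query does not show one
    objects=[
        "Javacontent:a{title}STRING:b",
        "Javacontent{creator}Person{name}STRING:h",
    ],
    conditions={"b": "othello", "h": "DoCoMo"},
    results=["a", "typeof(a)", "b", "h"],
    namespace="www.mml.yrp.nttdocomo.co.jp/Java",
)

# Serialize the query; any standard XML parser can read it back.
xml_text = ET.tostring(query, encoding="unicode")
```

Because the query is plain XML, a receiving node could call `ET.fromstring(xml_text)` and walk the six child elements without a dedicated query-language parser.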

6.3 Semantic Search in peer-to-peer networks based on RDF schema

In the peer-to-peer metadata search system, query messages are forwarded across the peer-to-peer network. Efficient query routing is important for avoiding message congestion. For this reason, we need to consider efficient query routing based on the RDF query language, which will promote semantic search on the peer-to-peer network.

Although existing peer-to-peer systems provide good properties such as self-organization, sharing of large amounts of resources and fault tolerance, key issues such as efficient search and retrieval of data remain open and require improvement. Several search techniques have recently been developed for that purpose, each with its own advantages and drawbacks. The approach we propose tries to improve the efficiency of search applications built on top of a peer-to-peer architecture by exploiting a technique known as semantic routing.

Semantic routing, as the name suggests, is a technique in which queries are routed according to their content. Each node builds and maintains a routing table (or knowledge base) in which the most significant data of the queries are associated with specific nodes of the peer-to-peer network. This association expresses, from the node's perspective, the ability of these peers to satisfy certain types of queries, and it may become stronger or weaker over time. In this framework the node learns from its past "experience" which "topics" the peers in the network are good at; in this way it dynamically develops knowledge of how content is distributed in the network. Given a query, the node forwards it only to the subset of peers listed in the knowledge base (KB) that have the best credentials to provide appropriate responses. The nodes that receive the query check in turn whether they can answer it and then apply the same procedure iteratively, subject to a propagation control. The expectation is that this selective and "intelligent" routing avoids overloading networks and nodes without decreasing the quality of results. Response time and accuracy of results are indeed fundamental parameters for a satisfactory search service.
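The forwarding procedure described above can be sketched as follows. This is a minimal in-memory illustration, not the implemented protocol: the `Node` class, the `route` function, the use of a hop counter (`ttl`) as the propagation control, and the `fanout` parameter are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    topics: set                                 # topics this node can answer locally
    kb: dict = field(default_factory=dict)      # topic -> {peer name: score}


def route(network, start, topic, ttl=2, fanout=1, seen=None):
    """Return the names of nodes that answered the query.

    Each node answers locally if it can, then forwards the query only to the
    `fanout` best-rated peers for the topic. `ttl` (an assumed hop limit)
    plays the role of the propagation control.
    """
    seen = set() if seen is None else seen
    if start in seen:
        return []
    seen.add(start)
    node = network[start]
    hits = [start] if topic in node.topics else []
    if ttl == 0:
        return hits
    ranked = sorted(node.kb.get(topic, {}).items(),
                    key=lambda kv: kv[1], reverse=True)
    for peer, _score in ranked[:fanout]:
        hits += route(network, peer, topic, ttl - 1, fanout, seen)
    return hits


network = {
    "A": Node("A", set(), {"Museum": {"B": 12.0, "C": 1.0}}),
    "B": Node("B", {"Museum"}, {"Museum": {"D": 3.0}}),
    "C": Node("C", set()),
    "D": Node("D", {"Museum"}),
}
answered = route(network, "A", "Museum", ttl=2, fanout=1)
```

With a fan-out of one, node A contacts only its best-rated peer for the topic and the poorly rated peer C is never queried, which is the traffic saving that semantic routing aims for compared to flooding.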
To achieve higher-quality results, we describe the features and relationships of the resources through metadata and RDF schema. Content can therefore be precisely defined, and queries can be expressed with structured information. Besides leading to more accurate search results, these conditions also allow the content of the queries to be exploited more effectively for routing. The approach explained below shows how RDF schema information can be exploited in the context of semantic routing. We focus mainly on the design of the KB, since this is the part that reflects RDF most directly.

In the peer-to-peer network, each node is assumed to manage a repository of content metadata. More precisely, given a defined RDF schema, every node owns RDF data models in which the classes and property values of the resources are specified. As a first step in the KB design, nodes are grouped by the classes and subclasses of the RDF schema. For simplicity of explanation, let us take as an example the RDF schema for cultural resources introduced in [13]. The schema, modified for our needs, is shown in Fig. 14. According to the schema, the KB of each node may keep node-Museum associations, node-Artist associations and so on. The KB also contains an indicator expressing the ability of these nodes to satisfy queries on such particular content (Museum, Artist, etc.). This indicator is modified search by search, depending on the behavior of the nodes whenever they are queried. Furthermore, the values of the properties listed in the query condition are tracked. From the number of results a specific node provides, an estimate of the minimum number of its resources having those property values is also computed.

Fig. 14. Example of RDF schema

Class      NodeId  Quality  Property   Value       Min.n. resource
Museum     Node B  12       location   Tokyo       10
                            location   Osaka       5
                            type       futuristic  3
Sculpture  Node B  25       material   wood        2
                            material   metal       6
                            exhibited              4
Painter    Node C  7 12 13  lName      Picasso     1
                            creates    -           12

Fig. 15. Knowledge base representation
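A KB of this shape could be held in memory as sketched below. The class name `KnowledgeBase`, its fields and the `record` method are hypothetical. The rule used to maintain the minimum number of resources (keep the largest match count ever observed for a property-value pair) is an assumption: a node that returned a given number of matches must hold at least that many resources with each queried property value.

```python
from collections import defaultdict


class KnowledgeBase:
    """Per-node routing knowledge, keyed by (RDF class, peer node id)."""

    def __init__(self):
        # (class name, node id) -> Quality indicator
        self.quality = defaultdict(float)
        # (class name, node id) -> {(property, value): minimum resources}
        self.min_resources = defaultdict(dict)

    def record(self, cls, node_id, conditions, matches):
        """Log one answered query; `conditions` maps property -> value."""
        entry = self.min_resources[(cls, node_id)]
        for pair in conditions.items():
            # Assumed rule: the lower bound never decreases, so keep the
            # largest number of matches seen for this property-value pair.
            entry[pair] = max(entry.get(pair, 0), matches)


kb = KnowledgeBase()
kb.record("Museum", "Node B", {"location": "Tokyo"}, 10)
kb.record("Museum", "Node B", {"location": "Tokyo", "type": "futuristic"}, 3)
```

After these two queries the entry for Node B under Museum holds 10 for ('location', 'Tokyo') and 3 for ('type', 'futuristic'), the same figures used for Node B in the example KB.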

Fig. 15 is a KB representation of a fictitious node A that has received results from the peer nodes B and C for given queries. Node B has, for instance, answered queries whose conditions were related to Museum content and contained some combination of 'location' and 'type' data. These data ('Tokyo', 'futuristic', etc.) have been logged. The associated values (10, 3, etc.) express the minimum number of resources with these specific 'location' and 'type' characteristics that node B can contain. They are derived from the number of matches node B returned to each query.

The Quality field is a general measure of the goodness of a node in the given classes. It expresses the ability of a node to reply on a certain topic on the basis of its observed behavior. Quality is computed as follows:

$$Q_{j,i+1} = Q_{j,i} + \frac{r_j}{\max_{z=1,\dots,n} r_z}$$

Here, $Q_{j,i+1}$ is the new Quality of the $j$-th node after answering the $(i+1)$-th query, $Q_{j,i}$ is its current Quality in the KB, $r_j$ is the number of matches the $j$-th node has provided, and $\max_{z=1,\dots,n} r_z$ is the maximum number of results provided by any of the $n$ queried nodes. As the formula shows, nodes are compared query by query on the number of results returned, and the quality of each node is increased by a value that is proportional to the number of matches it provided, relative to the best-performing node on that specific query.
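As a worked illustration of the Quality update rule above, the following sketch applies it to one query answered by two nodes. The function name `update_quality` and the handling of the case in which no node returns any result (Quality left unchanged) are assumptions; the text does not specify that case.

```python
def update_quality(quality, matches):
    """Apply the Quality update for one query.

    quality: {node: current Quality}
    matches: {node: number of matches returned for this query}
    Returns a new dictionary; the input is not modified.
    """
    best = max(matches.values(), default=0)
    updated = dict(quality)
    if best == 0:
        # Assumed behavior: no node returned results, so nothing changes.
        return updated
    for node, r in matches.items():
        # Each node gains r_j divided by the best result count of the query.
        updated[node] = updated.get(node, 0.0) + r / best
    return updated


q = update_quality({"B": 12.0, "C": 7.0}, {"B": 10, "C": 5})
```

Here node B returned the most matches and gains a full point (12.0 to 13.0), while node C returned half as many and gains 0.5 (7.0 to 7.5). The increment is always between 0 and 1, so no single query can dominate the accumulated Quality.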

The computed data stored in the KB are then combined into a global node evaluation, and the results are used to rank nodes whenever a search has to be processed. The nodes are ordered according to their estimated goodness, and a subset of the best candidates is selected for query forwarding. Node selection, however, should include a factor of randomness so that new nodes, whose quality value is still low, also have the chance to prove their goodness. In this way, possible wrong evaluations can also be corrected.

The node learning strategy is fundamental in semantic routing, but there are other related issues worth stressing. Knowledge base size management is one example. The knowledge base presented above keeps detailed and useful information, but storing all the property-value pairs specified in the query conditions inevitably leads to a rapid increase in size. Another delicate point, which concerns the semantic routing approach in general, is the initialization phase, when the KB of the node is empty. In this case query flooding or random neighbor selection might be used or, in order to speed up the learning process, neighboring nodes might exchange content information to initialize their KBs. In all these scenarios we should consider that peer-to-peer networks are dynamic in terms of content and node mobility: nodes can join and leave the system at any time. Node stability may also be considered an index of goodness that could be reflected in the KB. If a node is not available when it is contacted, its quality value could be decreased. Finally, user feedback, such as selecting a result from a node for downloading, might also be positively reflected in the computation of the quality value.
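The ranking with a factor of randomness, and the penalty for unavailable nodes, might look like the sketch below. The strategy shown (with probability `explore`, replace the lowest-ranked selected node by a random unselected one) is one possible choice introduced for illustration, and the names `select_peers`, `penalize_unavailable`, `explore` and `penalty` are hypothetical.

```python
import random


def select_peers(scores, k, explore=0.2, rng=random):
    """Pick k peers for forwarding, mostly by rank, sometimes at random.

    scores: {node: combined evaluation from the KB}
    explore: assumed probability of giving a low-ranked node a chance.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    chosen = ranked[:k]
    rest = ranked[k:]
    if chosen and rest and rng.random() < explore:
        # Swap the weakest selected node for a randomly chosen outsider.
        chosen[-1] = rng.choice(rest)
    return chosen


def penalize_unavailable(scores, node, penalty=1.0):
    """Lower the score of a node that could not be contacted (assumed rule)."""
    scores[node] = max(0.0, scores.get(node, 0.0) - penalty)


scores = {"B": 25.0, "C": 7.0, "D": 0.5, "E": 0.0}
greedy = select_peers(scores, 2, explore=0.0)
exploring = select_peers(scores, 2, explore=1.0, rng=random.Random(0))
penalize_unavailable(scores, "C", penalty=2.0)
```

With `explore=0.0` the selection is purely by rank; with a positive value, newly joined nodes such as D or E occasionally receive queries and can build up their quality value.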

7 Conclusions

In this paper, we first presented a platform for the development of mobile peer-to-peer systems and applications. The platform includes peer-to-peer system models and peer-to-peer communication protocols. We then introduced an RDF-format metadata schema that is used to describe multimedia content, and an XML-based RDF query language (xRQL) that serves as the critical interface through which a user extracts target metadata from peer-to-peer nodes. On the proposed platform, we also presented a metadata-based search application that realizes advanced multimedia content search for mobile phones by using a semantic routing approach. In the application, semantic routing techniques based on RDF schema are used as a possible solution to improve efficiency in peer-to-peer resource discovery.

We are implementing an xRQL processor on top of a general RDF engine. xRQL has the ability to update and delete metadata, to change and inquire about schemas, to indicate the result format, and so on. These abilities are essential in a semantic peer-to-peer system.

This paper briefly analyzed semantic routing techniques and presented a proposal based on RDF schema. The adoption of RDF leaves large margins for improvement in search applications. RDF is, furthermore, a standard, and anyone may define their own model for resource description, with the advantage of providing machine-understandable information. Our proposal thus benefits from this flexibility and generality and is open to future extensions.

As future work, we plan to evaluate the system platform and the semantic content search application. We will also continue to examine the issues concerning Semantic Web technology and peer-to-peer systems.

References

1. Gnutella, http://gnutella.wego.com/
2. NAPSTER, http://www.napster.com/
3. WinMX, http://www.frontcode.com/
4. FREENET, http://freenet.sourceforge.net/
5. GROOVE, http://www.groove.net/
6. Sun Microsystems, Project JXTA, http://www.jxta.org/
7. Resource Description Framework (RDF), http://www.w3.org/RDF/
8. G. Karvounarakis, S. Alexaki, V. Christophides, D. Plexousakis, M. Scholl, RQL: A Declarative Query Language for RDF, The 11th World Wide Web Conference (2002)
9. Hiromitsu Sumino, Norihiro Ishikawa, Takeshi Kato, Johan Hjelm, Ye Yu, Zhongwu Zhu, Mobile Multimedia Content Search using RDF, Mobile Data Management (2003)
10. i-mode, http://www.nttdocomo.co.jp/p_s/imode/
11. Johan Hjelm et al., Towards Mobile MetaSearch, The 11th World Wide Web Workshop on Mobile Search (2002)
12. Dublin Core Metadata Initiative, http://dublincore.org/
13. Gregory Karvounarakis, Vassilis Christophides, Dimitris Plexousakis, Sofia Alexaki, "Querying RDF Descriptions for Community Web Portals", 17ièmes Journées Bases de Données Avancées (BDA'01), pp 113-144, Agadir, Maroc