Trust-Aware In-Network Aggregation for Wireless Sensor Networks

Hongmei Deng1, Guang Jin1, Kun Sun1, Roger Xu1, Margaret Lyell1, Jahn A Luke2
1 Intelligent Automation Inc. {hdeng, gjin, ksun, hgxu, mlyell}@i-a-i.com
2 AFRL/RYTC [email protected]

Abstract—In this paper we present an efficient trust-aware in-network aggregation approach for resilient wireless sensor networks. The work is motivated by the well-studied reputation and trust relations in the field of social sciences. In our approach, the trust evaluation mechanism is applied to identify the trustworthiness of sensor nodes, distinguish illegal/misbehaving nodes, and filter out bogus data in the aggregation process. The objective of this effort is to return the highest-fidelity response possible to the user, while monitoring the health of the network by flagging suspected compromised nodes. The experimental results demonstrate the effectiveness of the proposed approach.

Keywords-Security; Wireless Sensor Networks; Aggregation; Trust

I. INTRODUCTION

With the increasing use of Wireless Sensor Networks (WSNs) for daily operations in both the commercial and defense sectors, designing an efficient and secure query processing mechanism becomes critical. The aggregation query [1][2][3][4], one of the primitive query types for collecting and evaluating sensor readings, fits the resource-constrained nature of wireless sensor nodes well. For example, in the commonly used tree-based topology, an aggregation node computes a partial aggregation result based on its local sensor reading and the readings reported by its children, and then sends the aggregation result to its parent node at a higher level. During this in-network aggregation processing, each node only needs to transmit one constant-sized message to its parent, which saves precious bandwidth in the constrained WSN.

Most existing sensor querying approaches do not address security; they rest on the fundamental assumption that sensor nodes cooperate and do not cheat. In reality, WSNs are expected to be deployed in various hostile environments, for instance the battlefield, and these unattended wireless sensor nodes face a variety of risks and attacks. The assumption of always-cooperative nodes is invalid when a node is hijacked or compromised. In addition, traditional data encryption and authentication mechanisms can only provide a certain level of security, but are not sufficient to provide a complete solution due to the unique characteristics and new misbehaviors encountered in WSNs. For example, if a node is compromised with valid encryption keys, it can easily insert bogus sensor readings or change the aggregation result. Through message encryption and verification, a recipient node can check that a message comes from a particular node and was not modified during transmission; however, it cannot tell whether the received sensor reading is real. The issue becomes even more serious when in-network aggregation is applied, as every node needs to do local aggregation based on the received sensor readings. If an aggregation node intentionally modifies the aggregation result and passes the manipulated information into the network, the recipient node cannot tell. A compromised aggregator usually causes more severe security impacts than forged sensor readings.

Outlier detection [5] has been applied to check whether a raw sensor reading makes sense by comparing it against a set of values, using prior domain knowledge about the measured physical phenomena. Without prior domain knowledge, however, it is impossible to directly detect a sensor reading forged by the data generator. If domain knowledge is given, abnormal sensor readings can be defined and detected automatically. For example, for spatially continuous phenomena, such as temperature, spatially neighboring sensor readings can be used to detect abnormal (forged) readings. For temporally continuous phenomena, such as humidity, temporally neighboring readings can be used. In some situations, an outlier detector based solely on spatially or temporally neighboring readings may mark an authentic reading as an outlier. Take the example of using a WSN to track a fast-moving vehicle. If a sensor node A detects the vehicle, neither the current readings from nodes near A nor the historical readings from A can provide similar vehicle detection results. Spatiotemporally neighboring readings are useful in this case: if A detects a vehicle, it is very likely that a node near A has detected the vehicle previously. Using this kind of prior knowledge, forged sensor readings can be detected. Fortunately, most physical phenomena monitored by a WSN can be considered temporally, spatially, or spatiotemporally continuous; thus, outlier detection algorithms can be applied.
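As a rough illustration of the spatial case described above, the following sketch flags a reading that deviates strongly from the readings of its spatial neighbors. It is not the LOF detector of [5]; it substitutes a simple median-absolute-deviation test, and the function name `is_spatial_outlier` and the default `threshold` are assumptions made only for this example.

```python
from statistics import median


def is_spatial_outlier(reading, neighbor_readings, threshold=3.0):
    """Flag a reading that deviates strongly from its spatial neighbors.

    Hypothetical helper: a median/MAD test standing in for a real
    outlier detector; `threshold` is an illustrative default.
    """
    if not neighbor_readings:
        return False  # nothing to compare against
    center = median(neighbor_readings)
    # Median absolute deviation: a spread estimate robust to a few bad values.
    mad = median(abs(r - center) for r in neighbor_readings)
    spread = mad if mad > 0 else 1e-9  # avoid division by zero
    return abs(reading - center) / spread > threshold


neighbors = [20.1, 20.4, 19.8, 20.0]  # e.g. nearby temperature readings
print(is_spatial_outlier(35.0, neighbors))  # forged-looking reading
print(is_spatial_outlier(20.2, neighbors))  # plausible reading
```

A temporal variant would pass a node's own recent readings as `neighbor_readings` instead of its neighbors' current readings.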
While an outlier detector has the potential to detect bogus sensor data, it can hardly tell whether a partial aggregation result is forged, as partial aggregation results from different sub-networks naturally differ more than nearby sensor readings do. A naïve approach would simply collect raw readings and analyze them at a central base. The naïve approach, however, can significantly increase wireless communication overhead and sacrifice the benefit of in-network aggregation processing.

In this paper, we present a trust-aware in-network aggregation approach for resilient WSNs, in which the trust evaluation mechanism is applied to identify the trustworthiness of sensor nodes, distinguish illegal/misbehaving nodes from normal ones, and filter out bogus data in the aggregation process. The work is motivated by the well-studied reputation and trust relations in the field of social sciences. The objective of this effort is to return the highest-fidelity response possible to the user, while monitoring the health of the network by flagging suspected compromised nodes. It is noted that we do not aim to provide a complete security solution for WSNs; rather, our focus is to provide the user with the most-trusted reply to a query given the imperfect WSNs of today.

The remainder of this paper is organized as follows: Section II presents our proposed trust-aware in-network aggregation approach, including three models, namely, inconsistency checking, trust evaluation, and resilient aggregation; Section III provides some performance evaluation results; Section IV discusses the robustness of the approach and some extensions; Section V presents related work; and conclusions are summarized in Section VI.

978-1-4244-4148-8/09/$25.00 ©2009 This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE "GLOBECOM" 2009 proceedings.

II. TRUST-AWARE IN-NETWORK AGGREGATION APPROACH

From our observations, security breaches in data aggregation arise from information asymmetry. When a node receives a partial aggregation result without knowing the input values of the previous aggregator, the recipient cannot judge the validity of the aggregation result. The core idea here is to require additional evidence for each partial aggregation result and perform consistency checking. In other words, we try to turn the information asymmetry into information symmetry. The proposed trust-aware in-network aggregation mechanism is shown in Figure 1, which includes three models, namely, inconsistency checking, trust evaluation, and resilient aggregation. The approach takes advantage of the inherent redundancy of WSNs and adds the trust evaluation model into the data aggregation process.

Figure 1: Trust-Aware In-Network Aggregation

In our approach, first, the commonly used single-parent tree topology is extended to a multi-parent tree topology. Each node selects multiple nodes as its parents instead of only one, and broadcasts its partial aggregation results to its parents. In this way, the partial results from a particular node are received and processed by multiple parents. If a partial aggregation result is modified by a compromised node, it will be inconsistent with the results from its sibling nodes because of the network redundancy. Second, a set of rules is defined in the inconsistency checking model to detect the inconsistencies. Based on the results of inconsistency checking, the trust of each individual node can be built up using the trust evaluation model. Third, the established trust information is used to identify compromised nodes and assist the in-network aggregation process, thereby improving the reliability of sensor querying.

It is worth noting that the approach does not introduce extra communication overhead into the aggregation process, which makes it resource-efficient for energy-constrained WSNs. In the current single-parent tree topology, a node chooses only one parent even if it is within range of multiple parent-level nodes. The parent only processes the messages received from its children and ignores others, even if it can successfully receive them thanks to the broadcast nature of wireless channels. Our approach simply takes advantage of the wireless channel and fully exploits the inherent connectivity within the network. The experimental results demonstrate the effectiveness of our approach.

A. Network Model

The major security breach we address here is the bogus partial aggregation result in the in-network aggregation process. It can be generated by compromised sensor nodes or by the failure of system components such as radios or sensors. Whether a node produces a wrong partial aggregation result after being compromised or because of a system failure, the result is equally detrimental to the functioning of the network. Security mechanisms based on cryptographic techniques cannot prevent this type of misleading data, which results in inaccurate query results. Since our main focus is in-network aggregation, some inessential features are abstracted away to simplify the network model. We assume there are n associated sensor nodes for each aggregation point. Each sensor node takes a measurement and reports the observed value x_i to the aggregation point. The aggregation node's goal is to compute an aggregation value y that summarizes the sensor readings x_1, …, x_n using the aggregation function f; thus, y = f(x_1, …, x_n). We assume there is a secure transmission link between sensor nodes, such that data can be securely transmitted. We also assume that efficient outlier detectors are applied at the aggregation nodes to detect and remove forged sensor readings. As a consequence of these assumptions, we need not worry about security breaches in data generation, or about spoofing or interception of data in transit. This leaves only the question of whether the aggregation node is trustworthy.

B. Inconsistency Checking

For inconsistency checking, we need additional evidence for the partial aggregation result. As mentioned earlier, we extend the commonly used single-parent tree topology to a multi-parent tree topology. A multi-parent tree contains one root, which serves as a gateway between the base station and the WSN. The level of the root node is 0. The level of the nodes that can directly communicate (hear and send) with the root is 1. The level of the nodes that can directly communicate with any level 1 node is 2, and so on. A level i node can choose as parents all level (i-1) nodes with which it can directly communicate. When processing aggregation queries, a level i node processes all messages from its children and sends the partial results to its parents through broadcasting. Because the message is sent by broadcast, no additional energy is consumed compared with the traditional single-parent tree approach.
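The level assignment and parent selection described above can be sketched as a breadth-first traversal from the root. This is a minimal illustration under the assumption that radio connectivity is symmetric and given as an adjacency map; the function name `build_multi_parent_tree` and the input format are hypothetical, not taken from the TinyOS implementation.

```python
from collections import deque


def build_multi_parent_tree(neighbors, root):
    """Return (level, parents) for a connectivity graph.

    neighbors: dict mapping a node ID to the set of node IDs it can
    directly hear and send to (assumed symmetric for this sketch).
    """
    level = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in level:  # first time reached: one hop deeper
                level[nb] = level[node] + 1
                queue.append(nb)
    # A level-i node takes every directly reachable level-(i-1) node as a parent.
    parents = {
        node: sorted(nb for nb in neighbors[node] if level.get(nb) == lvl - 1)
        for node, lvl in level.items()
    }
    return level, parents


links = {
    "R": {"A", "B"},
    "A": {"R", "B", "C"},
    "B": {"R", "A", "C"},
    "C": {"A", "B"},
}
level, parents = build_multi_parent_tree(links, "R")
print(level["C"], parents["C"])  # node C sits at level 2 with parents A and B
```

In a single-parent tree, node C would keep only one of A and B; here it keeps both, which is what later gives the parents overlapping evidence to compare.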


We define the partial aggregation message as follows: (AggValue; Base; Support; LocalValue). The AggValue is the local aggregation result computed by an aggregation node based on its children's readings. In this paper, we mainly consider two duplicate-insensitive aggregation types, MAX() and MIN(), and will explain how our approach can be extended to other types of aggregation operations. For the aggregation function MAX(), the AggValue equals the local maximal value, while it is the local minimal value if MIN() is used. For a duplicate-insensitive aggregation operation, the AggValue can be provided directly by one child. We define the Base as the ID of the child node that provides the local AggValue. The Support is an ID set indicating the providers on whose values the node computes the aggregation. If we define the local family of a node s as the node plus all its children, then, if no child fails to report its result, the Support is the set of IDs of all its local family members. For a static WSN, the Support might be removed from the aggregation message, since it becomes fixed after the multi-parent tree is built. The LocalValue is the node's own sensor reading. Based on the LocalValue, we can apply outlier detectors to check whether a node is faking its readings.

To check inconsistencies, in each epoch a sensor node S receives n messages from its children, m1, m2, …, mn. If one child node forges its reported partial aggregation result, the sensor node S detects the inconsistencies using the rules shown in Table 1.

TABLE 1: RULES FOR INCONSISTENCY CHECKING [rule expressions for MIN() and for MAX()]
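To make the message format concrete, the sketch below builds the (AggValue; Base; Support; LocalValue) tuple for a MAX() query from a node's own reading and its children's messages. The class `PartialResult` and the function `aggregate_max` are hypothetical names for this illustration, and letting a node name itself as Base when its own reading is the maximum is an assumption read off the leaf messages in Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialResult:
    agg_value: float      # AggValue: local aggregation result
    base: str             # Base: ID of the family member providing AggValue
    support: frozenset    # Support: IDs whose values were aggregated
    local_value: float    # LocalValue: the node's own sensor reading


def aggregate_max(node_id, local_value, child_results):
    """Build a node's MAX() message from its children's messages.

    child_results: dict mapping a child ID to the PartialResult it reported.
    """
    # Candidates are the node's own reading plus each child's AggValue.
    candidates = {node_id: local_value}
    candidates.update({cid: m.agg_value for cid, m in child_results.items()})
    base = max(candidates, key=candidates.get)
    return PartialResult(candidates[base], base, frozenset(candidates), local_value)


g = aggregate_max("G", 9, {})
h = aggregate_max("H", 8, {})
d = aggregate_max("D", 5, {"G": g, "H": h})
print(d.agg_value, d.base, sorted(d.support), d.local_value)
```

For MIN(), the same construction applies with `min` in place of `max`.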

Figure 2 illustrates an example of how the inconsistency checking works. In this example, a MAX() query is being processed. The ellipses indicate a small subset of nodes in a large WSN. The letters inside the ellipses represent the node IDs, while the numbers inside are the local sensor readings. The rectangles represent the partial aggregation results, which contain the AggValue, Base, Support and LocalValue as defined above. The nodes D, E and F are located at depth n. The arrows indicate the communication links from children to parents. For nodes that are not shown in Figure 2, dashed arrows are used. In Figure 2, we assume node E is a compromised node that fakes its AggValue as a large value to mislead the network. Node E could instead forge the AggValue as a smaller value; however, this has almost no impact on the final aggregation result of a MAX() query, so we ignore this case in the further discussion.

When node E reports the forged partial aggregation result to its parents (i.e., the nodes A, B and C), first, the LocalValue and AggValue from node E are not consistent. This type of inconsistency can be easily captured by the parent nodes. If node E also fakes its LocalValue and makes it consistent with its AggValue, the partial result from E still conflicts with the partial results from its sibling nodes. For example, as E's Base is within D's Support, E's AggValue cannot be larger than D's AggValue. Also, as both E and F share the same Base, their AggValues should be the same. In this way, the nodes A, B or C can detect this type of inconsistency.

[Figure 2 content: nodes A:1, B:3, C:5 at depth n-1; nodes D:5, E:6, F:3 at depth n; nodes G:9, H:8, I:6 at depth n+1. Partial aggregation results (AggValue, Base, Support, LocalValue): D (9, G, DGH⋯, 5); E (10, H, EHGI, 6); F (8, H, FHI⋯, 3); G (9, G, G⋯, 9); H (8, H, H⋯, 8); I (7, X, IX⋯, 6).]

Figure 2: Example of Inconsistency Checking
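The checks walked through for Figure 2 can be sketched as follows for a MAX() query. The rules coded here are inferred from the worked example in the text rather than copied from Table 1, so they should be read as an assumption; the function name `check_max_messages` and the tuple layout are likewise hypothetical. Only the Support members visible in the figure are used.

```python
def check_max_messages(messages):
    """Return (suspect, witness, reason) triples for a MAX() query.

    messages maps a child ID to (agg_value, base, support, local_value).
    """
    issues = []
    for i, (agg_i, base_i, _, local_i) in messages.items():
        # A node's maximum can never be below its own reading, and if the
        # node names itself as Base the two values must coincide.
        if agg_i < local_i or (base_i == i and agg_i != local_i):
            issues.append((i, i, "AggValue disagrees with LocalValue"))
        for j, (agg_j, _, support_j, _) in messages.items():
            # If i's Base also reported to j, j's maximum already covers that
            # value, so i's AggValue cannot exceed j's AggValue.
            if i != j and base_i in support_j and agg_i > agg_j:
                issues.append((i, j, "AggValue exceeds sibling covering same Base"))
    return issues


# Messages from D, E and F as received by a parent in Figure 2.
figure2 = {
    "D": (9, "G", {"D", "G", "H"}, 5),
    "E": (10, "H", {"E", "H", "G", "I"}, 6),  # forged AggValue
    "F": (8, "H", {"F", "H", "I"}, 3),
}
for suspect, witness, reason in check_max_messages(figure2):
    print(suspect, "vs", witness, "->", reason)
```

A violated pair only shows that the two messages cannot both be true: either the first node inflated its value or the second one suppressed it. Deciding which of the two to believe is the role of the trust evaluation model.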

C. Trust Evaluation

The above inconsistency checking rules can tell whether a partial result is forged. However, they cannot tell what the true partial value is. If we simply exclude all inconsistent partial results from further processing, the network may lose a number of true sensor data, which also makes the final aggregation results unreliable. We need a way to keep the trustworthy results. To achieve this, we introduce the concept of trust into the in-network aggregation process and build the trust evaluation model on the inconsistency checking results. We use the range [-1, 1] to map trustworthiness: -1 indicates an untrustworthy value and bad reputation, while 1 means a trustworthy value and good reputation. Initially, every node gives the trust value 0 to all of its children. If a node detects an inconsistent result from one of its children, the node penalizes the child and sets the new trust value according to Equation (1). If the partial aggregation result passes inconsistency checking, the child is awarded a new trust value according to Equation (2). In both equations, T_new is the updated trust value, T_old is the previous trust value, and the parameter w controls the balance between them. Here we use two different values of w, with the consideration that trustworthiness is gained slowly but can be ruined easily by intense negative behaviors. One related issue is the stability of the trust evaluation results. In the literature, S. Ganeriwal et al. have studied and demonstrated the stability of the trust value update [17][18]. We have also performed some tests and obtained similar results. Due to space limitations, we do not discuss this issue in detail here.
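The following sketch shows one update rule that is compatible with the description above: trust stays in [-1, 1], starts at 0, falls quickly on an inconsistency, and recovers slowly on consistent reports. The weighted-average form and the two weights `w_reward` and `w_penalty` are assumptions chosen for illustration; they are not the paper's Equations (1) and (2) or its parameter values.

```python
def update_trust(t_old, consistent, w_reward=0.05, w_penalty=0.3):
    """Move the trust value toward +1 (reward) or -1 (penalty).

    Assumed form: T_new = (1 - w) * T_old + w * target, with a larger
    weight for penalties so that trust is lost faster than it is gained.
    """
    target, w = (1.0, w_reward) if consistent else (-1.0, w_penalty)
    return (1.0 - w) * t_old + w * target


trust = 0.0  # initial trust a parent assigns to each child
for outcome in [True, True, False, True]:
    trust = update_trust(trust, outcome)
    print(round(trust, 4))
```

Because each update is a convex combination of the old value and a target in {-1, 1}, the trust value can never leave the range [-1, 1] under this assumed rule.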


D. Trust-Aware Aggregation

The data aggregation operation is guided by the trustworthiness computed from the trust evaluation model. There are three ways (or methods) to perform trust-aware data aggregation.

1) Method 1: If an inconsistency is found, we exclude both values from aggregation. The trust value is not used.
2) Method 2: If an inconsistency is found, select the node with the higher trust value for aggregation. The trust is built only on the node's local observation.
3) Method 3: If an inconsistency is found, ask for a recommendation from a sibling node, then select the node with the higher trust value. The trust is built on the node's local observation and the recommendations.

III. PERFORMANCE EVALUATION

We implemented our approach in TinyOS [6] and extended TinyDB [7] to support the new trust-aware in-network aggregation operations. Figure 3 illustrates the multi-parent tree topology used in our experiments, in which the root node is located at the network center. The arrows in Figure 3 indicate the child-to-parent relations. Except for the root node, each node has more than two parents.
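The three methods of Section II-D can be sketched for a MAX() query as below. The function name `trust_aware_max`, the representation of conflicts as pairs of child IDs, and in particular the way Method 3 averages a parent's own trust with a sibling's recommendation are assumptions for illustration; the paper does not specify how recommendations are combined. The trust values in the usage lines are made up.

```python
def trust_aware_max(local_value, child_values, conflicts, trust, method,
                    sibling_trust=None):
    """Aggregate MAX() over children, resolving conflicting pairs.

    child_values: child ID -> reported AggValue
    conflicts: list of (a, b) child-ID pairs found inconsistent
    trust: this parent's own trust value for each child
    sibling_trust: trust values recommended by a sibling (Method 3 only)
    """
    sibling_trust = sibling_trust or {}
    excluded = set()
    for a, b in conflicts:
        if method == 1:
            excluded.update((a, b))  # drop both, trust is not consulted
            continue
        if method == 2:
            score_a, score_b = trust[a], trust[b]
        elif method == 3:
            # Assumed combination rule: average own and recommended trust.
            score_a = (trust[a] + sibling_trust.get(a, trust[a])) / 2
            score_b = (trust[b] + sibling_trust.get(b, trust[b])) / 2
        else:
            raise ValueError("method must be 1, 2, or 3")
        excluded.add(b if score_a >= score_b else a)  # drop the less trusted
    kept = [v for node, v in child_values.items() if node not in excluded]
    return max([local_value] + kept)


values = {"D": 9, "E": 10, "F": 8}        # E's value is forged
conflicts = [("E", "D"), ("E", "F")]
own_trust = {"D": 0.1, "E": 0.2, "F": 0.1}  # hypothetical local view
recommended = {"E": -0.8}                   # hypothetical sibling view
for m in (1, 2, 3):
    print(m, trust_aware_max(1, values, conflicts, own_trust, m, recommended))
```

In this made-up scenario Method 1 discards the true maximum along with the forged one, Method 2 is misled because the parent's local trust happens to favor E, and Method 3 recovers the true value once the sibling's recommendation is taken into account.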

Figure 3: Multi-Parent Tree Structure for Experiments

In the following tests, the MAX() aggregation operation is used, and the local sensor readings are in the range (0, 1). The compromised nodes forge the partial aggregation result by setting the AggValue to 1. We compare the performance of data aggregation with and without our proposed trust-aware in-network aggregation approach. We plot the trust evaluation results based on the inconsistency check and the aggregation results using the three methods defined in Section II-D.

Test 1: Five randomly chosen compromised nodes working independently

The first set of tests is based on five randomly chosen compromised nodes, which work independently. Figures 4(a), (b) and (c) illustrate the aggregation results using the methods defined in Section II-D. The red line indicates the compromised MAX() value over time with five randomly chosen compromised nodes. Since the compromised nodes always forge the partial aggregation result as 1, which is larger than the real sensor readings, the final aggregation result is always 1. The green line, as a baseline, shows the true maximal value over time if no compromised node exists in the network, and the black line is the final aggregation result using our proposed trust-aware aggregation approach. From these figures, we can see that all three proposed trust-aware aggregation methods produce better aggregation results than aggregation without security considerations. It is noted that the inconsistency check still might not be able to fully recover the real partial aggregation results. Figure 4(a) shows that Method 1 achieves lower performance than Methods 2 and 3, shown in Figure 4(b) and Figure 4(c), since some uncompromised results are excluded from the aggregation processing. By using the trust value to assist the aggregation operation, we can achieve better performance. Among the three methods, Method 3 (shown in Figure 4(c)) achieves the most trustworthy result (almost perfectly matching the true maximal values), as the trust is built upon more complete knowledge by taking both self-observation and recommendation into account.

Figure 4(d) illustrates the global trust evaluation result, which is a contour map of the trust values based on the trust information collected from distributed nodes. The arrows indicate the trust direction given by a parent to a child. A node with a lower trust value is less trusted by its parents. We also mark the real locations of the compromised nodes as solid circles. We can see that the trust contour map matches well with the locations of the compromised nodes. Thus, it is very helpful for identifying the compromised nodes. It is noted that the trust contour map is a by-product of our trust-aware data aggregation mechanism.

Test 2: Five randomly chosen compromised nodes working in a collusive way

The second set of tests addresses a more realistic scenario in which the attacker compromises multiple sensor nodes in a nearby region and makes the nodes collude with each other. First, we let a pair of parent-and-child nodes collude and always report their AggValue as 1. The other compromised nodes also report the AggValue as 1, but at scattered geo-locations. Similar results are obtained, as shown in Figure 5. Our approach can successfully detect a compromised value. The final aggregation results, based on the different methods of handling inconsistent results, are similar to the results from the above test, where all compromised nodes are geographically isolated from each other. For handling inconsistent partial aggregation results, Method 3 is still the best among the three methods. We also plotted the global trust contour map in Figure 5(d), in which the real locations of the compromised nodes are marked as solid circles. Again, this result demonstrates that the proposed approach is effective in dealing with the parent-and-child colluding attack. We also performed a test simulating a sibling-node colluding attack, in which similar results were obtained.


Figure 4: Aggregation and Trust Evaluation Results in Test 1. Panels: (a) Aggregation results using Method 1; (b) Aggregation results using Method 2; (c) Aggregation results using Method 3; (d) Trust contour map (trust based on inconsistency). Panels (a) to (c) plot the max value against time for the real max, compromised max, and trustful max; panel (d) plots trust over the x-y plane.

Figure 5: Aggregation and Trust Evaluation Results in Test 2. Panels: (a) Aggregation results using Method 1; (b) Aggregation results using Method 2; (c) Aggregation results using Method 3; (d) Trust evaluation results (trust based on inconsistency). Panels (a) to (c) plot the max value against time for the real max, compromised max, and trustful max; panel (d) plots trust over the x-y plane.


Figure 6: Aggregation and Trust Evaluation Results in Test 3. Panels: (a) Aggregation results using Method 1; (b) Aggregation results using Method 2; (c) Aggregation results using Method 3; (d) Trust contour map (trust based on inconsistency). Panels (a) to (c) plot the max value against time for the real max, compromised max, and trustful max; panel (d) plots trust over the x-y plane.

Test 3: Five randomly chosen compromised nodes with three of them colluding

To further study the performance of the proposed approach under colluding attacks, we ran another test in which three colluding nodes report the AggValue as 1. The aggregation results and trust contour map are shown in Figure 6. We can see that the performance is worse than in the previous tests when three sensor nodes are colluding. Method 1 performs slightly better than Methods 2 and 3, as it excludes all inconsistent results from the aggregation processing, but it also sacrifices some true partial aggregation values. Methods 2 and 3 could not perform well because a majority of the sensor nodes in the local area were compromised and their results misled the network.

IV. DISCUSSIONS

A. Robustness

From our evaluation results, we can see that the designed trust-aware in-network aggregation mechanism is effective in dealing with the security breach of faked partial aggregation results. Compromised aggregation results can be detected through local inconsistency checking. In this way, our approach can successfully identify the compromised nodes and exclude the compromised partial aggregation results from the in-network processing.

Besides faking the AggValue, a compromised node may also choose to forge other components of the aggregation message. For example, the node may fake the Support. This issue can be handled based on the following observations: 1) if the network is static, nodes can acquire their neighbors' local family information when the network starts up; 2) a compromised node cannot add faked child IDs to its Support, since the Support of a node can only be a subset of the node's local family; 3) by intentionally reducing its Support set, a compromised node can make its result locally unverifiable by its parents. Since unverifiable results are excluded from further processing, an attacker gains little by forging the Support. Thus, our approach can be considered robust against Support forgery. The compromised node may also choose to forge the Base in the aggregation message. In fact, this is equivalent to the case of forging the AggValue.

By testing collusive attacks, we have shown that our approach can also handle a certain level of collusion. Its robustness to collusive attacks depends highly on the network density and connectivity. If a majority of the sensor nodes in a local area are compromised, the true aggregation value in this local area cannot be recovered.

B. Optimization

As discussed above, in our approach each node receives constant-sized partial results from its children. After inconsistency checking and local aggregation, each node broadcasts a new constant-sized partial aggregation result to its parents. Our approach does not require more messages than current in-network aggregation approaches, which makes it resource-efficient for energy-constrained WSNs. We also note that the approach increases the message length by introducing the Support and Base. Further optimization can be done to reduce the message length and improve performance. Both the Support and the Base consist of node IDs from the local family. A subnet name, which can be represented by fewer bits, can be used to represent the IDs in the Support and Base. If the network is static, nodes can cache their neighbors' family information when the network starts. A node then only needs to provide the IDs whose results have been excluded from its local processing (e.g., a child providing unverifiable or inconsistent results). That is, we can redefine the Support as the difference between the local family and the actual current Support. If the communication quality is good and only a few compromised nodes exist, the data size required to represent the Support can be reduced.

C. Extension for Other Aggregation Types

So far, we have only discussed our approach for the duplicate-insensitive aggregation types MAX() and MIN(). It cannot be directly applied to duplicate-sensitive aggregation types, such as SUM(), COUNT() or AVG(). Fortunately, we can use the Flajolet and Martin (FM) algorithm [8] to convert a duplicate-sensitive aggregation type into a duplicate-insensitive type. Using the FM algorithm, each local partial aggregation value can be converted into a binary number by a hash function. The binary number represents an estimate of the input partial aggregation value. More importantly, the aggregation can then be done through a bitwise union, which makes the aggregation duplicate-insensitive.
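A minimal sketch of the FM idea for COUNT() is given below: each item sets one bit of a bitmap chosen by a hash, sketches are merged with a bitwise OR, and the count is estimated from the position of the lowest unset bit. The function names, the use of SHA-256 as the hash, and the single-bitmap estimator are assumptions for illustration; a practical deployment would average several bitmaps to reduce the estimation error.

```python
import hashlib

PHI = 0.77351  # correction constant from the Flajolet-Martin analysis


def fm_bit(item, width=32):
    """Map an item to a single bit; bit k is chosen with probability ~2^-(k+1)."""
    digest = hashlib.sha256(str(item).encode()).digest()
    h = int.from_bytes(digest[:8], "big")
    # Index of the least-significant 1 bit of the hash value.
    pos = (h & -h).bit_length() - 1 if h else width - 1
    return 1 << min(pos, width - 1)


def fm_sketch(items, width=32):
    """Bitmap summarizing a set of items."""
    sketch = 0
    for item in items:
        sketch |= fm_bit(item, width)
    return sketch


def fm_merge(a, b):
    """Bitwise union: merging the same data twice changes nothing."""
    return a | b


def fm_estimate(sketch):
    """Estimate the number of distinct items from the lowest unset bit."""
    r = 0
    while sketch & (1 << r):
        r += 1
    return (2 ** r) / PHI


left = fm_sketch(f"node-{i}" for i in range(0, 100))
right = fm_sketch(f"node-{i}" for i in range(50, 150))  # overlaps with left
merged = fm_merge(left, right)
print(merged == fm_sketch(f"node-{i}" for i in range(150)))  # True
```

Because the merge is an OR, a partial result that reaches the root through several parents in the multi-parent tree is counted only once, which is the property the consistency checks for MAX() and MIN() rely on.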
Through the FM algorithm, our approach can be extended to support the other aggregation types. Due to the space limitation, this paper cannot discuss it in detail. V. RELATED WORK In existing in-network aggregation approaches, such as TAG [1], sensor nodes are connected through a single-parent tree structure. In each epoch, each node collects partial results from its children, generates new partial result and sends the result to its parent. This in-network aggregation processing suffers from the security issues of partial aggregation result forging. Du et al [9] proposed a witness-based approach for innetwork aggregation. In short, when a sensor node does its local aggregation, one of its neighbors is chosen as a witness to verify the partial aggregation result. In practice, however, the wireless radio has limited communication range. The chosen witness may not be able to directly receive all input partial results of the node which the witness monitors. This approach requires additional communication cost for witness nodes to collect all necessary information, which also increases the system vulnerability. For example, the witness

may receives a modified partial results, and make false assurance. Different sampling schemas have been used to detect compromised aggregation results [2][10][11]. The key idea is to sub-sample a small number of raw sensor readings. Based on the samples, it is possible to statisically detect if the aggregation result is significanly abnormal. To reduce the overhead, differnet sampling techniques are used to reduce the sample size. For example, FM algorithm is used in [11]. The method provides some help, but choosing a suitable sampling strategy to ensure sensitivity and security is a hard issue. In [12], Yang et al uses a dynamic tree structure to run innetwork aggregation. The approach is based on the assumption that a false aggregation result exhibits abnormality. If a node sends suspicious aggregation results, the approach prunes the sub-tree rooted at the node. In this way, the compromised node has less number of children and less power to affect the final aggregation result. To detect the abnormality, a partial aggregation result needs to be compared against other partial aggregation results from different locations. In practice, however it is hard to tell the abnormality of partial aggregation results, especially if the underlying phenomenon is spatiotemporally heterogeneous. Recently, several approaches are proposed to decompose the aggregation integrity checking [4][13]. In these approaches, each node is required to check if its local result is correctly aggregated by its parent. Due to the limited wireless communication range, a node needs its parent to send the results from the node’s siblings, which requires additional communication overhead and also faces some security challenges. The trust and reputation management has been proven to be useful in other domains, such as E-Commerce, peer-to-peer environments [14][15][16]. The trust management has also been applied to WSN to check the integrity of sensor readings [17][18], and communication routing [19]. 
The success of trust management inspires us to apply trust to in-network aggregation processing.

VI. CONCLUSIONS

In conclusion, we present an efficient trust-aware in-network aggregation approach that can successfully identify compromised nodes and exclude compromised partial aggregation results from the in-network aggregation processing. The aggregation result obtained from our approach is closer to, or even matches, the true aggregation value. The approach provides a certain level of robustness to collusive attacks. Another attractive feature of the approach is that it does not require additional messages, which makes it resource efficient for energy-constrained WSNs. For future work, we will further study how to optimize the approach and make it more robust against collusive attacks.

ACKNOWLEDGMENT

Thanks to OSD for funding this research under Air Force Research Laboratory (AFRL) contract FA8650-08-M-1437.

978-1-4244-4148-8/09/$25.00 ©2009 This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE "GLOBECOM" 2009 proceedings.

REFERENCES

[1] S. Madden, M. J. Franklin, J. Hellerstein, and W. Hong, "TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks," SIGOPS Oper. Syst. Rev., vol. 36, pp. 131-146, 2002.
[2] H. Chan, A. Perrig, B. Przydatek, and D. Song, "SIA: Secure information aggregation in sensor networks," Journal of Computer Security, vol. 15, pp. 69-102, 2007.
[3] Y. Hida, P. Huang, and R. Nishtala, "Aggregation query under uncertainty in sensor networks," University of California, Berkeley, Technical Report, 2004.
[4] H. Chan, A. Perrig, and D. Song, "Secure Hierarchical In-network Aggregation for Sensor Networks," In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS 2006), 2006.
[5] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, "LOF: Identifying Density-Based Local Outliers," Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16-18, 2000, Dallas, Texas, USA, pp. 93-104, 2000.
[6] TinyOS: http://www.tinyos.net/
[7] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TinyDB: An Acquisitional Query Processing System for Sensor Networks," ACM TODS, 2005.
[8] P. Flajolet and G. N. Martin, "Probabilistic counting algorithms for data base applications," J. Comput. Syst. Sci., vol. 31, pp. 182-209, 1985.
[9] W. Du, Y. S. Han, and P. Varshney, "A witness-based approach for data fusion assurance in wireless sensor networks," In Proceedings of the IEEE Global Telecommunications Conference, pp. 1435-1439, 2003.
[10] B. Przydatek, D. Song, and A. Perrig, "SIA: secure information aggregation in sensor networks," SenSys '03: Proceedings of the 1st international conference on Embedded networked sensor systems, pp. 255-265, 2003.
[11] P. M. M. Garofalakis and J. Hellerstein, "Proof sketches: Verifiable in-network aggregation," In Proceedings of the IEEE 23rd International Conference on Data Engineering, 2007.
[12] Y. Yang, X. Wang, S. Zhu, and G. Cao, "SDAP: A Secure Hop-by-Hop Data Aggregation Protocol for Sensor Networks," MobiHoc, 2006.
[13] K. B. Frikken and I. J. A., "An efficient integrity-preserving scheme for hierarchical sensor aggregation," WiSec '08: Proceedings of the first ACM conference on Wireless network security, pp. 68-76, 2008.
[14] Z. Despotovic and K. Aberer, "P2P reputation management: probabilistic estimation vs. social networks," Computer Networks: The International Journal of Computer and Telecommunications Networking, vol. 50, pp. 485-500, 2006.
[15] L. Xiong and L. Liu, "Building trust in decentralized peer-to-peer electronic communities," In The 5th International Conference on Electronic Commerce Research (ICECR), 2002.
[16] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, "Eigenrep: Reputation management in p2p networks," the 12th International World Wide Web Conference (WWW 2003), 2003.
[17] S. Ganeriwal and M. B. Srivastava, "Reputation-based framework for high integrity sensor networks," SASN '04: Proceedings of the 2nd ACM workshop on Security of ad hoc and sensor networks, pp. 66-77, 2004.
[18] S. Ganeriwal, L. K. Balzano, and M. B. Srivastava, "Reputation-based framework for high integrity sensor networks," ACM Transactions on Sensor Networks, vol. 4, pp. 1-37, 2008.
[19] T. Ghosh, N. Pissinou, and K. Makki, "Collaborative Trust-based Secure Routing Against Colluding Malicious Nodes in Multi-hop Ad Hoc Networks," LCN '04: Proceedings of the 29th Annual IEEE International Conference on Local Computer Networks, pp. 224-231, 2004.
