
Ann. Telecommun. (2012) 67:455–469 DOI 10.1007/s12243-011-0278-3

A group-based security scheme for wireless sensor networks
Md. Abdul Hamid & A. M. Jehad Sarkar

Received: 26 April 2011 / Accepted: 4 November 2011 / Published online: 22 November 2011 © Institut Télécom and Springer-Verlag 2011

Abstract In recent years, wireless sensor networks have been a very popular research topic, offering a treasure trove of systems, networking, hardware, security, and application-related problems. Owing to their distributed nature and their deployment in remote areas, these networks are vulnerable to numerous security threats that can adversely affect their proper functioning. The problem is more critical when the network serves a mission-critical application, such as a tactical battlefield. This paper presents a security scheme for group-based distributed wireless sensor networks. Our first goal is to devise a group-based secure wireless sensor network. We exploit the multi-line version of the matrix key distribution technique and the Gaussian distribution to achieve this goal. Secondly, security mechanisms are proposed for such a group-based network architecture, in which sensed data collected at numerous, inexpensive sensor nodes are filtered by local processing on their way through more capable and compromise-tolerant reporting nodes. We address the upstream requirement that reporting nodes authenticate data produced by sensors before aggregating it, and the downstream requirement that sensors authenticate commands disseminated from reporting nodes. Security analysis is presented to quantify the strength of the proposed scheme against security threats. Through simulations, we validate the analytical results.

Keywords Wireless sensor networks . Security . Group-based data aggregation . Gaussian distribution . Node capture attack

M. A. Hamid (*) Department of Information & Communications Engineering, Hankuk University of Foreign Studies, Yongin, Kyounggi-do 449-791, South Korea e-mail: [email protected]
A. M. J. Sarkar Department of Digital Information Engineering, Hankuk University of Foreign Studies, Yongin, Kyounggi-do 449-791, South Korea e-mail: [email protected]

1 Introduction

New fabrication and integration technologies have reduced the cost and size of micro-sensors and wireless interfaces, making it feasible to deploy densely distributed wireless networks of sensors. Consequently, sensors and sensor networks have received a great deal of attention from diverse communities, including hardware, networking, operating systems, database, security, and various application-specific areas. These networks of sensors promise to revolutionize environmental, earth, and biological monitoring applications by gathering data at granularities unrealizable by other means. The availability of sensors and low-power wireless communications capability [1] makes wireless sensor networks (WSNs) economic solutions for many applications, such as forest fire detection, battlefield surveillance, disaster management, homeland security operations, target tracking, and border control, in soil, marine, and atmospheric contexts [2, 3]. In these networks, a distributed collection of sensors forms a network interconnected by wireless communication links, and each sensor acts as an information source, sensing and collecting data from its environment.

A new set of security challenges arises in sensor networks because current sensor nodes lack hardware support for tamper resistance and are often deployed in unattended environments where they are vulnerable to capture and compromise by an adversary. A severe consequence of node compromise is that once an adversary has obtained the cryptographic key(s) of a sensor node, it can surreptitiously breach the security of the network (e.g., insert replicas of that node within the network or learn the basic security functionality of the network).

In this paper, we design a group-based security scheme focusing on mission-critical applications of WSNs (e.g., military applications where sensor networks have the task of battlefield surveillance). However, the proposed scheme may be applied to any general sensor network application that uses group-based data aggregation. The goals of this work are to develop a group-based network and to design a set of mechanisms that secure its communications. A preliminary version of this work can be found in [4]. This version adds two sections: Section 6.3, which shows the simulation results for the secure network formation overhead, and Section 7, which outlines the discussion and further research scope. Furthermore, we have extended the proposed scheme with more simulations on the network connectivity and the impact of security strength in a wide variety of network settings.

As shown in Fig. 1, in one form of local data processing, powerful reporting nodes (RNs) aggregate sensor data collected from a group of collocated sensor nodes (SNs) before transmitting the aggregated data to a base station (BS), either directly or via other RNs. For example, in a group of sensors, the RN may collect readings from different sensors in a given time interval, locally authenticate (process) these readings, and then forward an aggregated report of these readings. In another form, the BS generates and disseminates commands to RNs only, and each RN may communicate locally with its member sensors.

Fig. 1 A group-based sensor network for ground monitoring application (e.g., battlefield surveillance). Reporting node collects the sensed data from its group members (sensor nodes) and makes an aggregated report for the destination. (Figure labels: Opponent Side, Controlling Side, BS / Sink, Adversary, Reporting Node, Sensor Node)

The benefits of such a group-based WSN are: (1) data aggregation results in better scalability and prolonged lifetime, since aggregation reduces the volume of data communicated throughout the network and interests/queries can be implemented in a localized manner, and (2) data dissemination (command/control messages) from the base station outwards through the reporting nodes and downstream towards the sensors in local groups results in a scalable and efficient means of communication.

A key advantage of WSNs is that the network can be deployed on the fly and can operate unattended, without the need for any pre-existing infrastructure and with little maintenance. Typically, sensor nodes are deployed randomly (e.g., via aerial deployment) and are expected to self-organize to form a multi-hop network. With such deployment, it is in practice usually very difficult, and sometimes impossible, to guarantee knowledge of the sensors' expected locations. Moreover, the assumption that the locations of sensor nodes can be pre-determined to a certain extent severely limits the deployment of sensor networks. Our first goal is to devise a group-based secure network by combining pre-distributed cryptographic keys with a deployment model based on the Gaussian distribution. Using the Gaussian distribution, we can achieve the desired network, where sensor nodes are deployed in groups and the nodes in the same group are close to each other after the deployment. Through extensive simulation, we show that more than 93% network connectivity can be achieved with the Gaussian (Normal) distribution technique. Secondly, security mechanisms are proposed for such a group-based network architecture, in which sensed data collected at numerous, inexpensive sensor nodes are filtered by local processing on their way through more capable and compromise-tolerant reporting nodes.

This paper presents a group-based deployment approach similar to the scheme in [5] to model the network. The work in [5] develops a group-based key pre-distribution scheme that does not require knowledge of the sensors' expected locations and greatly simplifies the deployment of sensor networks. However, our approach differs from [5] in that our primary goal is to design a group-based secure network that handles sensed information aggregated from groups of collocated sensors and to show its tolerance against node capture attacks. In particular, we have performed a security analysis to assess the impact of node capture attacks in both intra-group and network-wide scenarios. As stated earlier, a severe threat is that sensors can be physically captured or compromised and attackers can retrieve secret information (e.g., cryptographic material) from a node. Moreover, if attackers understand the behavior of the software running on the node, they can fully control the behavior of the compromised node without violating the communication protocols used by the sensor network. Analysis and simulation results show that group-based data aggregation exhibits strong security against node capture attacks. The analysis also shows that robustness can be significantly improved by increasing the deployment density of reporting nodes and/or ordinary sensor nodes.

Since the resources of a sensor node are very constrained, the key establishment protocols should be lightweight and should minimize communication overhead and energy consumption. Moreover, it should be possible to add new sensor nodes incrementally to the sensor network. The keying protocols should be scalable, i.e., the size of the sensor network should not be limited by the per-node storage and energy resources.

The main contributions of this work can be summarized as follows:

• We engineer a group-based network combining the pre-distributed cryptographic keys and deployment model using multi-line matrix key and Gaussian distribution, respectively.
• Security mechanisms are proposed for such a group-based network architecture, in which sensed data collected at numerous, inexpensive sensor nodes are filtered by local processing on their way through more capable and compromise-tolerant reporting nodes.
• Theoretical analysis (using quantitative and Poisson process analyses) is presented to quantify the strength against security threats. Through simulations, we validate the analytical results. We also present the secure network formation overhead and the network connectivity from the simulations.

The rest of the paper is organized as follows. Section 2 presents related works. Assumptions are outlined in Section 3. In Section 4, we present our security scheme in detail. A theoretical analysis of its security performance is given in Section 5. Performance evaluation through simulations is presented in Section 6. Section 7 discusses further research issues, and Section 8 concludes the paper.

2 Related works

Generally, each sensor node in the network acts as an information source, sensing and reporting data from the environment for a given task. The low-cost sensor nodes forward the relevant data to a querying sink/BS. However, reporting all sensed data is often unnecessary, because in many cases the sensor nodes in an area detect the same phenomena, so there is high redundancy in the sensed data. Moreover, it is very inefficient for every single sensor to report its data back, because every data packet traverses many hops to reach the BS and nodes are constrained in resources (e.g., memory, communication, computation, and battery). Although related works in sensor networks ignore the aspects of group-based local data processing, many data aggregation protocols [6–10] have recently been proposed to eliminate redundancy in the sensor data of the network, hence reducing the communication cost and energy expenditure of data collection. In directed diffusion [8], a data routing scheme for WSNs, local data processing reinforces routes at intermediate nodes such that sensor data is routed correctly to the sinks that express interest in those events.

However, if the aggregation process is not secured, it is an easy target for attackers: an adversary can inject false data, modify transmitted data, or, more dangerously, compromise or claim to be an aggregator in order to significantly falsify the result of aggregation. To defeat attacks against the aggregation process, several secure aggregation protocols have been proposed in the literature [11–14]. However, these protocols either introduce heavy communication or computation overheads [13], handle only a special kind of aggregation [13], provide limited resilience against aggregator node compromise [12], or require expensive interactive verification between the BS and aggregators [11]. A secure aggregation protocol that does not require trusted aggregator nodes is proposed in [14] for cluster-based WSNs. The authors of [11] addressed secure data aggregation in sensor networks from the point of view of detecting forged aggregation values. However, that work does not address how to design a secure network infrastructure to support hierarchical WSNs (i.e., how to set up trust between aggregators and sensors and how to securely disseminate commands from aggregators).

Recently, a number of security protocols have been proposed for wireless sensor networks to defend against various kinds of security threats. Okorafor et al. [15] designed a novel lightweight secure integrated routing and localization scheme that exploits the benefits of link directionality inherent to wireless optical sensor networks. It leverages the resources of the base station and a hierarchical network structure to identify topological information and detect security violations in neighborhood discovery and routing mechanisms. The authors of [16] investigated the security problems unique to unattended WSNs and proposed some simple and effective countermeasures for a certain class of attacks. They mainly focused on unattended WSNs characterized by intermittent sink presence and operation in hostile settings. Potentially lengthy intervals of sink absence offer greatly increased opportunities for attacks resulting in erasure, modification, or disclosure of sensor-collected data. Zhu et al. [17] proposed a distributed security protocol called Localized Multicast for detecting node replication attacks. They showed that Localized Multicast is more efficient in terms of communication and memory costs in large-scale sensor networks, and at the same time achieves a higher probability of detecting node replicas.

To provide communications security, the right choice of cryptographic methods and key distribution techniques is crucial. For example, public key cryptography is a possible solution for key establishment in wireless networks. Though recent work [18] demonstrates cases in which public key cryptography can be implemented on some resource-constrained devices, it is not yet feasible for all multi-hop networks. The use of public key cryptography to authenticate the sender and receiver for every packet results in additional delays due to high computational complexity. Moreover, using public keys for authentication requires signature generation and verification, which may lead to high computational overhead and DoS attacks, respectively. Therefore, we apply a deterministic symmetric key distribution technique to avoid: (1) the complex computation overhead for resource-constrained sensors (e.g., public key cryptography), (2) the problem of centralized schemes, which use a master key (a single compromised key affects the whole network) and a trusted third party, and (3) the problem of probabilistic key sharing (i.e., a large key chain is required to have a common key [19]). Furthermore, most solutions do not address node re-keying or security at different levels of granularity. Our solution, on the other hand, assigns different levels of security load according to resource capability and is applicable to members that are affiliated with subgroups of nodes.

3 Assumptions

We consider a tiered network architecture (Fig. 1) comprising three types of devices: one or more base stations/sinks (BS), a number of RNs, and SNs. The BS is considered resourceful enough in terms of residual energy, computation power and speed, and communication. An SN is simple, inexpensive, and constrained in resources, while an RN is rich in resources, more compromise-tolerant, and has a transmission range of more than $2 \times R_{SN}$, where $R_{SN}$ is the transmission range of an ordinary SN. Within each group, an RN can both aggregate data and disseminate commands, and an RN can communicate with its neighbor RN to forward its own aggregated messages, as well as messages that come from other RNs, towards the BS. RNs communicate with the base station or neighbor RNs using one channel, and each RN communicates with its member SNs using another channel. We assume that the sensor network is deployed by a single party only, so nodes deployed by multiple untrusted parties are not part of the same network, and that all SNs and RNs are static after they are deployed in the deployment field.

4 Group-based security scheme

In this section, we present the proposed security scheme in detail, i.e., we devise the group-based security scheme shown in Fig. 1. We present a sequence of procedures to accomplish this task: (1) cryptographic key pre-distribution, (2) group-based deployment, (3) secure network formation (network setup) with the pre-distributed secrets and the deployment model, (4) secure data aggregation, and (5) re-keying (periodic refreshment of secret keys).

4.1 Key pre-distribution

We exploit the matrix key distribution technique [20] in our group-based sensor networks to pre-distribute pair-wise shared keys. The matrix key distribution scheme is based on the idea of giving each node a set of keys, of which it shares a distinct subset with every other node. We apply this idea to design a key distribution mechanism for the proposed group-based security scheme. First, we take a closer look at the matrix key distribution technique and describe how the protocol works; then we apply the multi-line version of this technique to engineer the secret key distribution for our proposed scheme.

Matrix key distribution Consider Fig. 2. Suppose that there are N nodes in an m×m space, where $N = m^2$; each node is assigned a position (i, j) and is denoted $n_{ij}$. Similarly, there are N keys denoted $k_{ij}$. A key server generates the keys at random and gives node $n_{ij}$ a set of keys consisting of all the keys that are in either the same row or the same column as $n_{ij}$. Hence, $n_{ij}$ gets the keys according to Eq. 1.

$$K_{ij} = \{\, k_{xy} \mid x = i \ \text{or}\ y = j \,\} \qquad (1)$$

Fig. 2 Matrix key distribution

When node A ($n_{ij}$) wants to communicate with B ($n_{uv}$), it simply finds out B's position (u, v) and uses the keys $k_{iv}$ and $k_{uj}$ that are common between A and B to compose a session key (as shown in Fig. 2).

Multi-line key distribution The weakness of the simple matrix key distribution technique is that, if nodes A and B are on the same line or column, any node on that line or column may compromise the session, because it shares the same common keys used between A and B. When A and B are not on the same line or column, the situation is better, as two correctly positioned colluding nodes are needed to compromise the session key. To overcome this drawback, a multi-line protocol is developed in [20] that allocates more key lines to each node instead of only two, as in the basic scheme. In the multi-line protocol, the key set of node $n_{ij}$ is assigned according to Eq. 2.

$$K_{ij} = \{\, k_{xy} \mid y - j + C_l (x - i) \equiv 0 \pmod{m} \,\} \qquad (2)$$

where l = 1, 2, …, t and $C_p \neq C_q$ when p ≠ q. Here, the key set $K_{ij}$ is a set of t lines on the key map, all passing through the point (i, j). If two nodes $n_{ij}$ and $n_{uv}$ want to find a common key, they solve t(t−1) linear equation groups, each of which has the form of Eq. 3.

$$\left.\begin{aligned} y - j + C_p (x - i) &\equiv 0 \pmod{m} \\ y - v + C_q (x - u) &\equiv 0 \pmod{m} \end{aligned}\right\} \qquad (3)$$

where p, q = 1, 2, …, t and p ≠ q. The solutions (x, y) are positions on the communication map of keys that nodes $n_{ij}$ and $n_{uv}$ have in common. Each node gets $t\sqrt{N} = tm$ keys (Property 1 in [20]).
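As a minimal sketch of the multi-line assignment in Eq. 2, the following Python enumerates the key positions on a node's t lines and finds the positions two nodes share. The function names `key_set` and `common_keys` are illustrative and not taken from [20]. The sketch finds common positions by set intersection rather than by solving the equation groups of Eq. 3, and it assumes m is prime so that two lines with different coefficients meet in exactly one position. It counts distinct positions, which gives t(m−1)+1 in that case, slightly fewer than the tm quoted above, because all t lines share the point (i, j).

```python
def key_set(i, j, m, coeffs):
    """Positions (x, y) of the keys held by node n_ij under Eq. 2.

    coeffs holds the t distinct line coefficients C_1..C_t (an assumed input).
    """
    keys = set()
    for c in coeffs:
        for x in range(m):
            # Eq. 2: y - j + C_l * (x - i) = 0 (mod m)
            y = (j - c * (x - i)) % m
            keys.add((x, y))
    return keys


def common_keys(a, b, m, coeffs):
    """Key positions shared by nodes at positions a = (i, j) and b = (u, v)."""
    return key_set(a[0], a[1], m, coeffs) & key_set(b[0], b[1], m, coeffs)


if __name__ == "__main__":
    shared = common_keys((2, 3), (5, 1), m=7, coeffs=[1, 2])
    print(sorted(shared))
```

With t = 2 and two nodes that do not lie on each other's lines, the intersection contains t(t−1) = 2 positions, which matches the number of equation groups in Eq. 3.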

Key distribution for the proposed group-based scheme According to our desired network architecture in Fig. 1, each group has one powerful node (RN) with Z ordinary SNs under it. We delegate trust in a distributed manner, such that the sensors in a group trust their RN and an RN trusts its neighbor RN and/or the base station. Let us assume that there are Y RNs with Z SNs under each RN. For the Y RNs, a matrix is divided into m×m logical positions (i, j) such that $Y \le m^2$, and each position is assigned a key denoted $k_{ij}$. Secret keys are distributed to RNs and SNs such that each RN is pre-loaded with the multi-line key set $K_{ij}$ according to Eq. 2, and each sensor in this RN's group holds a distinct subset of $l_k$ keys from this key set. In this technique, each RN gets $t\sqrt{Y} = tm$ keys and each SN gets $l_k$ keys according to Eq. 4.

$$l_k = \max\{\lfloor tm/Z \rfloor,\ 1\} \qquad (4)$$

Note that, for tm ≥ Z, each of the Z SNs in a group has a distinct set of keys. For example, if Y = 900, Z = 8, and t = 2, each RN gets 60 keys and each SN gets $l_k = \max\{\lfloor tm/Z \rfloor, 1\} = 7$ keys. Apart from this, a key server (base station) generates Y individual keys and pre-distributes one to each RN, so that each RN shares this key with the base station. The key server also generates and pre-distributes to each RN and SN a unique individual key for communicating with the BS. With the pre-shared secret keys, the RNs and SNs that are to be deployed are divided into groups $\{G_i \mid i = 1, 2, \ldots, Y\}$.

This distribution has a limitation: the number of keys used throughout the network is confined by the number of RNs, Y. To overcome this, we may use more keys by making use of additional key positions [20], i.e., positions that have no RN associated with them but whose secret keys can be used by the RNs and SNs. For example, instead of $Y \le m^2$, we can use $2Y \le m^2$, $3Y \le m^2$, or even more, according to the number of RNs and SNs in the network. Such a technique can support a large number of SNs at the expense of storage overhead for the RN. Interested readers may refer to [20] for more details about the multi-line key distribution technique.
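The arithmetic of Eq. 4 and the hand-out of distinct subsets can be sketched as follows. This is only an illustration: the paper states that each SN receives a distinct subset of $l_k$ keys but does not say how the subsets are chosen, so the contiguous chunking in `split_keys` is an assumption, as are both function names. When tm < Z the sketch wraps around and reuses keys, which is why subsets are only guaranteed distinct for tm ≥ Z.

```python
def keys_per_sensor(t, m, z):
    """Eq. 4: l_k = max(floor(t * m / Z), 1)."""
    return max((t * m) // z, 1)


def split_keys(rn_keys, z):
    """Hand each of the z sensors l_k keys from the RN's key list.

    Assumed policy: consecutive chunks of the RN's list, wrapping around
    when there are fewer keys than sensors.
    """
    n = len(rn_keys)
    l_k = max(n // z, 1)
    return [[rn_keys[(s * l_k + i) % n] for i in range(l_k)] for s in range(z)]


if __name__ == "__main__":
    # Example from the text: Y = 900 gives m = 30, and t = 2, Z = 8.
    print(keys_per_sensor(t=2, m=30, z=8))
    subsets = split_keys(list(range(60)), 8)
    print([len(s) for s in subsets])
```

In the example from the text, 8 sensors with 7 keys each use 56 of the RN's 60 keys, so 4 keys remain held by the RN only.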


4.2 Group-based deployment

We assume that the groups are evenly and independently deployed on the targeted field. The nodes in the same group $G_i$ are deployed from the same place at the same time with the deployment index i. During the deployment, the resident point of any node in group $G_i$ follows a probability distribution function $f_i(x, y)$, which we call the deployment distribution of group $G_i$. The actual deployment distribution is affected by many factors (e.g., height, speed, and the nature of the deployment area/ground). For simplicity, we model the deployment distribution as a Gaussian (Normal) distribution, since it is widely studied and has proved useful in practice [5]. The mean of the Gaussian distribution μ equals $(x_i, y_i)$, and the probability distribution function for any node in group $G_i$ is the following:

$$f_i(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-[(x - x_i)^2 + (y - y_i)^2]/2\sigma^2} = f(x - x_i,\ y - y_i),$$

where $f(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}$ and σ is the standard deviation. Each RN's expected deployment point is centered at the point $(x_i, y_i)$, and sensors follow their respective RNs.

Let us assume that $V_{BS}$, $V_{RN}$, and $V_{SN}$ are the sets of BSs, RNs, and SNs, respectively, and that G(V, E) represents the entire network in the graph model, where $V = V_{BS} \cup V_{RN} \cup V_{SN}$. The network backbone $G_B(V_B, E_B) \subseteq G(V, E)$ is formed by the RNs and base stations, provided that all RNs present in the backbone have at least one path between each other (i.e., each RN has a BS within its transmission range or has another RN that can communicate with any of the BSs) and that at least one path exists between the backbone and a BS. The SNs form RN-rooted local groups (Fig. 1) by pair-wise keys. Hence, the connectivity between an SN and its parent RN depends on two basic conditions:

1. $d(x, y) \le R_{SN}$, where x is an SN, y is an RN, d(x, y) is the distance between the two nodes x and y, and $R_{SN}$ is the transmission range of an SN, and
2. $\mathit{PairwiseKey}_{RN} = \mathit{PairwiseKey}_{SN}$.

Figure 3 shows a typical group-based deployment scenario according to the proposed model using the Gaussian distribution. In the figure, RN represents the local group aggregator and SN represents the constrained sensor nodes. SNs are connected locally to their RN, and all the RNs are connected network-wide (i.e., they form the backbone network). The backbone network formed by the RNs is connected to the BS, and thus the whole network is connected.

Fig. 3 An exemplary group-based deployment scenario (simulated): observed topology from the proposed scheme using Gaussian distribution (axes: x, y)

As stated earlier, deployment may be affected by many factors. If the nodes are deployed from the air or from a ground vehicle, the distance between an RN and an SN may vary after deployment, and an SN may land so far from its parent RN that it is not connected (i.e., the RN is out of its transmission range $R_{SN}$) because of the speed of deployment, height, and weather. In such cases, the SN may either land where another RN is within its transmission range or be completely out of the network (no RN is reached by the SN's transmission range $R_{SN}$). Considering this fact, we evaluate the performance of the group-based deployment in terms of network connectivity in Section 6.

4.3 Secure network formation

After the deployment, each SN needs to discover its own RN to form the secure group. For this purpose, the SN broadcasts an encrypted SN-JOIN-REQ message to all of the nodes within its transmission range. To communicate securely with its RN, an SN may use common keys in two ways. The simplest way is to use one of the $l_k$ keys shared between this SN and the RN. Prior to sending the join request message, the SN first discovers the common keys by exchanging the IDs of the keys with the corresponding RN. Alternatively, instead of using a direct key, a link key may be generated from those $l_k$ keys to communicate with the parent RN. For example, the link key may be computed as the hash of all shared $l_k$ keys, in the order in which they are indexed in the original key pool, as follows.

$$\mathit{linkkey} = \mathrm{hash}(k_1 \,\|\, k_2 \,\|\, \cdots \,\|\, k_{l_k})$$

Although a link key computed in this way results in computation overhead for the SN and RN, an adversary needs to know all the $l_k$ shared keys between the communicating nodes to generate a valid message authentication code (MAC). According to the deployment model, there are three cases: (I) the SN is within the group, (II) the SN is out of the group, and (III) the SN is out of the network.

Case I. If the corresponding RN is within the SN's transmission range (i.e., at one-hop distance), it gets the message and decrypts it, as all the individual keys of the subordinate nodes are known to the RN. Upon successful decryption of the message, the RN sends a JOIN-APPROVAL message to the SN, encrypting it with the pair-wise key. Thus, the SN becomes a dominated (member) node of its corresponding RN.

Case II. If, for any SN, the RN assigned during pre-distribution of keys and group formation is not within one-hop distance but another RN is present, the SN needs to inform the BS to resolve the issue. On discovering that it is out of range of its own RN, the SN sends an encrypted RNERROR message. This message is simply forwarded by the RN that is present. The BS checks the message and issues a command COMMARN to that RN to be the SN's adopter. Upon receiving this command from the BS, the RN in turn sends a subset of keys from its key set to welcome this SN into its own group.

Case III. When the SN has no RN as its one-hop neighbor (i.e., the SN does not get any response from any RN after it has broadcast the join request message), this SN is considered to be out of the network and it closes all its communications.
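The deployment model of Section 4.2 and the three cases above can be sketched together in a few lines of Python. This is a simplified illustration, not the simulation used in the paper: it assumes every RN lands exactly on its deployment point $(x_i, y_i)$, it draws each coordinate independently from a normal distribution with standard deviation σ (which is the two-dimensional Gaussian given above), and it ignores the pair-wise key condition. The names `deploy_group` and `classify` are hypothetical.

```python
import math
import random


def deploy_group(center, n_sensors, sigma, rng):
    """Sample resident points of one group's SNs around its deployment point."""
    cx, cy = center
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma)) for _ in range(n_sensors)]


def classify(sn, own_rn, all_rns, r_sn):
    """Return 'I', 'II', or 'III' following the three cases of Section 4.3."""
    if math.dist(sn, own_rn) <= r_sn:
        return "I"    # own RN within one hop: SN joins its group
    if any(math.dist(sn, rn) <= r_sn for rn in all_rns):
        return "II"   # another RN within one hop: adoption via the BS
    return "III"      # no RN reachable: SN is out of the network


if __name__ == "__main__":
    rng = random.Random(1)
    rns = [(0.0, 0.0), (50.0, 0.0)]
    sensors = deploy_group(rns[0], n_sensors=8, sigma=10.0, rng=rng)
    print([classify(sn, rns[0], rns, r_sn=20.0) for sn in sensors])
```

Counting the share of SNs that fall into Case I or Case II over many runs gives a rough connectivity estimate of the kind evaluated in Section 6, under the simplifying assumptions stated above.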

4.4 Secure data aggregation and forwarding

Once the network is formed as a group-based distributed network, the sensory data can be transmitted securely to the BS. If there are Z SNs in a group, then for fidelity and correctness of data the RN waits for sensing events from at least q (q ≤ Z) of the SNs of that group, where q is the threshold value set for a particular group. Let us consider one particular group to describe our upstream data aggregation protocol. Suppose an event occurs; then q out of Z (0 ≤ q ≤ Z) SNs within the sensing area detect the event and send information to the RN as follows.

$$SN \rightarrow RN : ID_{SN} \mid ID_k \mid MAC \mid t \mid msg$$

where $MAC = MAC(k,\ t \,\|\, msg)$. Here, MAC is the message authentication code generated from the key k and the concatenated (||) value of the time stamp t of the event and the event message msg, and k is the shared key between the RN and the SN. Upon receiving the messages from the SNs, the RN checks (authenticates) every single MAC and generates an aggregated report. The report contains at least q messages, q MACs, and q IDs from a group. The RN then encrypts the report using the individual key it shares with the base station and sends the report to the BS, directly or via other RNs, in the format:

$$RN \rightarrow BS : ID_{RN} \mid E(k,\ ID_{SN_1} \mid MAC_1 \mid msg_1,\ \ldots,\ ID_{SN_q} \mid MAC_q \mid msg_q)$$

Here, k is the shared key between the RN and the BS, and E(.) is the encryption function. The BS finally extracts this report by decrypting it with the key k shared between the RN and itself. From the aggregated report, the BS can check the authenticity of the source SNs and the time at which the event occurred.

4.5 Re-keying (periodic refreshment of secret keys)

Securing downstream data dissemination requires, at the minimum, that each RN be able to trust commands sent from the BS and that each SN be able to trust commands from its parent RN in each group. In this section, we present a mechanism for disseminating control messages and use it to describe the re-keying strategy, i.e., the periodic refreshment of secret keys for the whole network (Fig. 4). The BS generates a new set of keys for the logical positions and assigns the keys to each RN according to Eq. 2. To refresh the secret keys for the whole network, the BS unicasts re-keying messages to all RNs using the individual keys common between the BS and the RNs. Then, each RN decrypts the message and stores the new keys from the message. To refresh the keys in each group, the corresponding RN sends the new subset of $l_k$ keys to each of its group members according to Eq. 4. Each SN updates its key list for future communication. Thus, the re-keying takes place in a distributed manner, which reduces the communication traffic and computation load on the base station, since the BS needs to unicast O(Y) messages and each RN needs to unicast O(Z) messages, where Y is the total number of RNs (i.e., groups) and Z is the number of SNs in a group.

Fig. 4 Control data dissemination: BS sends unicast messages to reporting nodes and each reporting node sends messages to its member sensor nodes

When a new node is deployed, it sends the JOIN_REQ_NEW message using its own individual key and, if authorized by the access list of the RN, it joins the group. Otherwise, the RN forwards the request to the BS, which authenticates the SN and informs the RN about it. Then, the RN sends keys to this SN, encrypting them with the newly added node's individual key. For example, if SN4 in Fig. 5 wants to join the existing group, the RN sends

$$RN \rightarrow SN : E_{K_{SN}}(l_k)$$

Similarly, when an SN wants to leave the group, it just sends a leave message as follows:

$$SN_4 \rightarrow RN : E_{K_{SN}}(\mathrm{leave})$$

Upon receiving the leave message, the RN deletes the leaving SN from its member list. Thus, an RN may locally manage the nodes that join or leave the group.
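The local join/leave handling and the message counts of a re-keying round can be sketched as below. The sketch leaves out all encryption and message formats and models only the bookkeeping described above. `ReportingNode`, `handle_join`, `handle_leave`, `rekey_unicasts`, and the `bs_authenticates` callback (standing in for the round trip to the BS) are hypothetical names introduced for illustration.

```python
class ReportingNode:
    """Local membership management for one group (encryption omitted)."""

    def __init__(self, access_list):
        self.access_list = set(access_list)
        self.members = set()

    def handle_join(self, sn_id, bs_authenticates):
        """Admit sn_id if it is on the access list; otherwise defer to the BS."""
        if sn_id in self.access_list or bs_authenticates(sn_id):
            self.members.add(sn_id)
            return True
        return False

    def handle_leave(self, sn_id):
        """Delete a leaving SN from the member list."""
        self.members.discard(sn_id)


def rekey_unicasts(y, z):
    """Unicasts in one re-keying round: Y from the BS, Z from each RN."""
    return {"bs": y, "per_rn": z, "total": y + y * z}


if __name__ == "__main__":
    rn = ReportingNode(access_list={"SN4"})
    print(rn.handle_join("SN4", bs_authenticates=lambda sn_id: False))
    print(rekey_unicasts(900, 8))
```

For the running example with Y = 900 and Z = 8, the BS itself sends only 900 unicasts, while the remaining 7200 messages are spread evenly over the RNs.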

5 Security analysis

In this section, we analyze the security threats posed by compromising the SNs and RNs and quantify the security strength of the proposed group-based data aggregation. An adversary may jeopardize the network in the following ways.

First, an adversary can simply deploy some malicious SNs into a group. A malicious sensor without the secret key may try to produce a false sensing result under the identity of a legitimate sensor. Since a malicious SN is not capable of producing a valid MAC for the false sensing result, the RN will identify it.

Second, an adversary can copy some SNs from one group and deploy them into another group. As each RN knows the set of SNs within its own group, an RN is able to identify an illegitimate SN ID. Moreover, a compromised SN cannot fool the RN of another group without knowing the valid key. Similarly, a compromised RN from one group cannot fool the SNs of another group without knowing the valid keys.

Third, given any group having an RN and Z SNs, an adversary may launch an attack in three ways: (a) compromise some SNs only, (b) compromise the RN only, or (c) compromise the RN and some SNs concurrently. We term this security threat intra-group node capture.

Fourth, an adversary may launch an attack in a distributed manner: (a) randomly compromise SNs throughout the entire network and/or (b) intelligently (selectively) compromise a few RNs where the maximum number of SNs has already been captured. We term this security threat network-wide node capture.

In the rest of this section, we analyze intra-group and network-wide node capture attacks.

RN

5.1 Intra-group node capture

Fig. 5 Re-keying mechanism in a local group (node join/leave)

5.1.1 Sensor nodes are captured

A compromised ordinary sensor in a particular group may produce an invalid MAC and thereby give a false guarantee for an aggregated report. The group-based scheme is robust against this kind of attack as long as no more than q sensors within a local group are compromised, because each aggregated report carries q MACs from the local group.


5.1.2 Only the reporting node is captured

When an RN is captured, it may fabricate a report. To do so, however, it needs to forge at least q MACs. The probability that at least q out of Z forged MACs are correct is given by Eq. 5:

$$p_{RN} = \sum_{j=q}^{Z} \binom{Z}{j} p^{j} (1-p)^{Z-j} \qquad (5)$$

where $p = 1/2^{L}$ and L is the MAC size in bits. This probability $p_{RN}$ is negligible for a 4-byte CBC MAC [21]; moreover, only one group out of the entire network is affected, while the other groups are not.
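As a numerical illustration of Eq. 5, the sketch below evaluates the binomial tail directly. The function name and the sample values Z = 10 and q = 3 are assumptions chosen for illustration; only the 32-bit (4-byte) MAC size comes from the text.

```python
from math import comb


def forge_probability(Z: int, q: int, mac_bits: int) -> float:
    """Eq. 5: probability that at least q of Z randomly guessed MACs are valid."""
    p = 2.0 ** -mac_bits  # chance that a single guess matches an L-bit MAC
    return sum(comb(Z, j) * p**j * (1.0 - p) ** (Z - j) for j in range(q, Z + 1))
```

For example, `forge_probability(Z=10, q=3, mac_bits=32)` is on the order of 10^-27, which is consistent with the claim that the forging probability is negligible for a 4-byte MAC.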

5.1.3 Reporting node and sensor nodes are captured

We consider the situation where an adversary has compromised an RN and x (0≤x≤q) SNs concurrently. To inject a false report, the RN needs at least q valid MACs. Since the x captured SNs already supply valid MACs, the RN has to forge (q−x) more; the probability that at least (q−x) out of (Z−x) forged MACs are valid is given by Eq. 6:

$$p_{RN}^{x} = \sum_{j=q-x}^{Z-x} \binom{Z-x}{j} p^{j} (1-p)^{Z-x-j} \qquad (6)$$

where $p = 1/2^{L}$ and L is the MAC size in bits. Again, this probability is almost negligible [21].

5.2 Network-wide node capture

According to our deployment model, SNs are deployed uniformly at random and all the SNs are grouped into Y groups having Z SNs in each group. We investigate and quantify the security strength of the proposed group-based scheme in terms of the distribution of the compromised SNs among the groups and the selective compromise of RNs in groups where the maximum number of SNs has already been captured by the adversary to maximize the security threat.

5.2.1 Quantitative analysis

We analyze the robustness of our scheme when an adversary has randomly compromised Q (0≤Q≤N) ordinary SNs and j (0≤j≤Y) RNs from the entire network. Let p(j, q) be the probability that the jth RN (i.e., the jth group) has q (0≤q≤Z) SNs compromised. Then, p(j, q) is given by Eq. 7:

$$p(j,q) = \frac{\binom{Z}{q}\binom{N-Z}{Q-q}}{\binom{N}{Q}} \qquad (7)$$

We define $g_{j,q}$ as

$$g_{j,q} = \begin{cases} 1, & \text{if } q \text{ ordinary sensor nodes are captured in the } j\text{th group} \\ 0, & \text{otherwise} \end{cases}$$

and let $G_q$ denote the number of groups having q ordinary SNs captured from the entire network, so that $G_q = \sum_{j=1}^{Y} g_{j,q}$. The expected number of groups having q ordinary SNs captured can then be calculated by Eq. 8:

$$E[G_q] = E\left[\sum_{j=1}^{Y} g_{j,q}\right] = \sum_{j=1}^{Y} E[g_{j,q}] = Y \cdot E[g_{j,q}] = Y \cdot p(j,q) = Y \cdot \frac{\binom{Z}{q}\binom{N-Z}{Q-q}}{\binom{N}{Q}} \qquad (8)$$

Next, we assume that an adversary has captured some groups in which x (x≥q) SNs are compromised; we call this situation a complete group capture. Let X be the number of completely captured groups in the entire network. We can compute E[X] according to Eq. 9:

$$E[X] = \sum_{q=x}^{Z} E[G_q] = \sum_{q=x}^{Z} Y \cdot \frac{\binom{Z}{q}\binom{N-Z}{Q-q}}{\binom{N}{Q}} \qquad (9)$$
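The hypergeometric expressions of Eqs. 7 to 9 are straightforward to evaluate numerically. The following is a minimal sketch under the stated model (Q of N SNs captured uniformly at random, Y groups of Z SNs each); the function names are hypothetical and chosen only for this illustration.

```python
from math import comb


def group_capture_prob(N: int, Z: int, Q: int, q: int) -> float:
    """Eq. 7: probability that exactly q of a group's Z SNs are among the Q captured."""
    if q > Q:
        return 0.0  # a group cannot hold more captured SNs than were captured in total
    return comb(Z, q) * comb(N - Z, Q - q) / comb(N, Q)


def expected_groups(N: int, Z: int, Y: int, Q: int, q: int) -> float:
    """Eq. 8: expected number of groups with exactly q captured SNs."""
    return Y * group_capture_prob(N, Z, Q, q)


def expected_complete_captures(N: int, Z: int, Y: int, Q: int, x: int) -> float:
    """Eq. 9: expected number of groups with at least x captured SNs."""
    return sum(expected_groups(N, Z, Y, Q, q) for q in range(x, Z + 1))
```

With the parameters of the demonstration example in this section (N = 170, Z = 10, Y = 17), `expected_groups(170, 10, 17, 20, 0)` and `expected_groups(170, 10, 17, 20, 3)` round to 5 and 1 groups, and `expected_complete_captures(170, 10, 17, 40, 4)` rounds to 3 groups.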

For demonstration purposes, we take a simple example where the total number of SNs is N=170, with Z=10 SNs in each group (i.e., Y=17 groups). Figure 6a shows the expected number of compromised groups against the number of ordinary SNs compromised in the entire network. When 20 SNs are captured, 5 groups have 0 SNs captured and only 1 group has 3 SNs compromised. Figure 6b shows the number of completely captured groups, which depends on the value of x. Four cases are shown, for x equal to 4, 5, 6, and 7. When x is 4, 3 groups (16.66% of the entire network) are fully captured against 40 (23.5%) captured ordinary SNs. As the value of x increases (e.g., to 6 or 7), however, the number of fully captured groups becomes much smaller. We observe that robustness can be improved significantly by increasing the value of x.

Fig. 6 Quantitative analysis: network-wide robustness against node capture attacks with N=170, Y=17, Z=10. a Expected number of compromised groups against the entire network’s compromised ordinary SNs (x-axis: number of sensors captured per group; y-axis: average number of groups having q sensor nodes captured; curves for Q = 20, 25, 30, 35). b Number of completely captured groups that depends on the value x (x-axis: number of sensors captured throughout the network; y-axis: average number of completely captured groups; curves for x = 4, 5, 6, 7)

5.2.2 Poisson analysis

Next, we use the properties of the Poisson process [22] to analyze the robustness of our scheme. In probability theory and statistics, the Poisson distribution (or Poisson law of small numbers) is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event [22]. In our proposed scheme, all nodes are uniformly distributed in a space. If we consider that the adversary attacks the network in a distributed fashion, then we may approximate the captured nodes as uniformly distributed across the network. Therefore, we may use the Poisson analysis to show the security strength/vulnerability of the proposed scheme. We consider the case where the number of compromised SNs Q is much less than the total number of SNs N (i.e., Q≪N). We denote the status of any sensor in a particular group as $C_i$, for i = 1, 2, …, Z, and treat it as a Bernoulli random variable. Let $C_i = 1$ if the ith sensor is captured and $C_i = 0$ if it is not. Considering that the Q captured SNs are uniformly distributed across the network, the probability of any SN being captured by an adversary is given by $p[C_i = 1] = Q/N$, i = 1, 2, …, Z. If the condition Q≪N holds, then $C_1, C_2, \ldots, C_Z$ are said to be independent according to the properties of the Poisson process [22], and the number of captured sensors in a particular group follows the Poisson distribution with approximate mean Q/Y. So, the probability that q sensors are captured in a group is $p[q] \approx e^{-Q/Y} \frac{(Q/Y)^{q}}{q!}$. Now, if the total number of groups is Y, we calculate the expected value of the number of groups $Y_q$ having q sensors captured according to Eq. 10:

$$E[Y_q] \approx Y \cdot e^{-Q/Y} \frac{(Q/Y)^{q}}{q!} \qquad (10)$$

Next, we consider the case where an adversary has captured some groups in which x (x≥q) SNs are captured; we call this situation a complete group capture, as stated earlier. Let X be the number of completely captured groups in the entire network. We can compute the expected value E[X] by Eq. 11:

$$E[X] \approx \sum_{q=x}^{Z} E[Y_q] \approx \sum_{q=x}^{Z} Y \cdot e^{-Q/Y} \frac{(Q/Y)^{q}}{q!} \qquad (11)$$

The analytical results of this Poisson process analysis are presented with the simulated results in the performance evaluation section.
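The Poisson approximation of Eqs. 10 and 11 can be evaluated with a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and the sample values Y = 200 and Z = 10 are taken from the simulation settings of Table 1.

```python
from math import exp, factorial


def poisson_expected_groups(Y: int, Q: int, q: int) -> float:
    """Eq. 10: expected number of groups with exactly q captured SNs (valid for Q << N)."""
    lam = Q / Y  # approximate mean number of captured SNs per group
    return Y * exp(-lam) * lam**q / factorial(q)


def poisson_complete_captures(Y: int, Z: int, Q: int, x: int) -> float:
    """Eq. 11: expected number of groups with at least x (and at most Z) captured SNs."""
    return sum(poisson_expected_groups(Y, Q, q) for q in range(x, Z + 1))
```

For instance, with Y = 200 and Q = 100, the calls `poisson_expected_groups(200, 100, 0)` and `poisson_expected_groups(200, 100, 2)` round to 121 and 15 groups, and `poisson_expected_groups(200, 400, 4)` rounds to 18 groups.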

6 Performance evaluation

To evaluate performance, we performed extensive simulations using ns-2 [23] to measure the network connectivity, the secure network formation overhead, and the security strength of the proposed scheme.

6.1 Implementation in ns-2

We modified the AODV implementation of ns-2 to incorporate the group-based network architecture. All SNs in the network are connected to their RNs in one hop, and all RNs form a tree-based routing structure towards the BS. The routing tree is constructed using Warshall’s algorithm so that the sensed data can reach the sink in the smallest number of hops. The simulation parameters used for our scheme are shown in Table 1.

6.2 Network connectivity

Table 2 shows the simulation results of the deployment in a 1,000 m×1,000 m area with various densities and maximum distances d(x, y) between RN and SN (i.e., max(d(x, y))).
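The shortest-hop routing tree of Section 6.1 can be sketched as follows. This is not the ns-2 code used in the simulations: it assumes that "Warshall's algorithm" refers to the Floyd-Warshall all-pairs shortest-path form, that RNs are numbered 0 to n−1 with the sink as node 0, and that ties between equally close neighbours are broken by the smallest node ID. All names are hypothetical.

```python
INF = float("inf")


def hop_counts(num_nodes, links):
    """All-pairs minimum hop counts on an undirected graph (Floyd-Warshall)."""
    dist = [[0 if i == j else INF for j in range(num_nodes)] for i in range(num_nodes)]
    for a, b in links:
        dist[a][b] = dist[b][a] = 1
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def routing_tree(num_nodes, links, sink=0):
    """Map each node to a neighbour one hop closer to the sink (None for sink/unreachable)."""
    dist = hop_counts(num_nodes, links)
    neighbours = {n: set() for n in range(num_nodes)}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    parent = {}
    for n in range(num_nodes):
        if n == sink or dist[n][sink] == INF:
            parent[n] = None
            continue
        # Assumed tie-break: the smallest ID among neighbours that are one hop closer.
        parent[n] = min(m for m in neighbours[n] if dist[m][sink] == dist[n][sink] - 1)
    return parent
```

Following the parent pointers from any RN yields a minimum-hop path to the sink, which is the property the routing tree is meant to provide.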


Table 1 Simulation parameters

Parameter                    Value
Area                         1,000 m×1,000 m
Number of RN                 200/400/500
Number of SNs in a group     10/15
RN transmission range        60 m
SN transmission range        25 m
BS location                  [1,000, 500]
max d(x, y)                  0.9×RSN, 1.0×RSN, 1.1×RSN

It is observed that the Gaussian distribution provides better connectivity if the distance is kept within the sensors’ transmission range. For example, as shown in Table 2, more than 90% network connectivity can be achieved when d(x, y) = 0.9×RSN with 500 RNs and 7,500 SNs (15 SNs in each group). With d(x, y) = 1.0×RSN and d(x, y) = 1.1×RSN, connectivity falls to 89.37% and 82.04%, respectively, for the same number of RNs and SNs. Density also affects the network connectivity: for the same deployment area, dividing the area into more groups and/or increasing the number of sensors provides better connectivity, as shown in Table 2.

Table 2 Connectivity of the group-based deployment model

max(d(x, y))   # of RN   # of SN   In group   Out of group   Out of network   Connectivity
0.9×RSN        200       2,000     1,751      45             204              89.80%
0.9×RSN        200       3,000     2,612      85             303              89.90%
0.9×RSN        400       4,000     3,474      208            318              92.05%
0.9×RSN        400       6,000     5,213      312            475              92.08%
0.9×RSN        500       5,000     4,339      311            350              93.00%
0.9×RSN        500       7,500     6,519      484            497              93.37%
1.0×RSN        200       2,000     1,573      68             359              82.05%
1.0×RSN        200       3,000     2,360      140            500              83.33%
1.0×RSN        400       4,000     3,148      333            519              87.02%
1.0×RSN        400       6,000     4,718      515            767              87.22%
1.0×RSN        500       5,000     3,934      519            547              89.06%
1.0×RSN        500       7,500     5,893      810            797              89.37%
1.1×RSN        200       2,000     1,293      149            558              72.10%
1.1×RSN        200       3,000     1,923      252            828              72.50%
1.1×RSN        400       4,000     2,570      528            902              77.45%
1.1×RSN        400       6,000     3,873      847            1,280            78.67%
1.1×RSN        500       5,000     3,211      851            938              81.24%
1.1×RSN        500       7,500     4,833      1,320          1,347            82.04%

6.3 Secure network formation overhead

Secure network formation involves a sequence of message exchanges as described in Section 4.3. We have quantified the number of messages exchanged to set up a group-based network. The number of messages is calculated hop by hop (i.e., if a message takes four hops to reach the BS, it is counted as four messages). Figure 7 shows the number of messages required to build the group-based network for different numbers of groups and densities, with the different values of d(x, y) listed in Table 2. The overall network formation overhead increases with the distance of the SNs from their corresponding RNs, since the number of out-of-group SNs increases and consequently the number of messages increases (Case II in Section 4.3). In all cases, however, the network formation overhead is linear in the number of nodes in the network.

6.4 Security strength

Robustness against node compromise is plotted in Fig. 8 using the analytical results calculated from the Poisson process as well as the results obtained by simulation. The analytical and simulation results are almost identical, as can be seen in Fig. 8. Figure 8a shows the average number of groups in which q (0≤q≤Z) ordinary SNs are compromised by the attacker in each individual group; 121 groups have 0 and 15 groups have 2 SNs compromised when Q=100 SNs are captured from the entire network. In the worst-case scenario of our approach, when 400 SNs are captured network-wide, only 18 groups out of 200 have 4 SNs captured. Figure 8b shows the average number of groups that are completely captured when more than q (q≤x≤Z) sensor nodes are captured. For the values x equals 4, 5, 6, and 7,

