2015 IEEE Conference on Computer Communications (INFOCOM)

RFID Cardinality Estimation with Blocker Tags

Xiulong Liu∗†, Bin Xiao†, Keqiu Li∗, Jie Wu‡, Alex X. Liu¶§, Heng Qi∗ and Xin Xie∗

∗ School of Computer Science and Technology, Dalian University of Technology, China
† Department of Computing, The Hong Kong Polytechnic University, Hong Kong
‡ Department of Computer and Information Sciences, Temple University, USA
¶ Department of Computer Science and Engineering, Michigan State University, USA
§ National Key Laboratory for Novel Software Technology, Nanjing University, China
Email: {xiulongliudut, likeqiu, qhclement, xiexin0211}@gmail.com [email protected], [email protected], [email protected]

Abstract—The widely used RFID tags raise serious privacy concerns because a tag responds to queries from readers whether or not they are authorized. The common solution is to use a commercially available blocker tag, which behaves as if a set of tags with known blocking IDs were present. The use of blocker tags makes RFID estimation much more challenging because some genuine tag IDs are covered by the blocker tag and some are not. In this paper, we propose REB, the first RFID estimation scheme that works in the presence of blocker tags. REB uses the framed slotted Aloha protocol specified in the C1G2 standard. For each round of the Aloha protocol, REB first executes the protocol on the genuine tags and the blocker tag, and then virtually executes the protocol on the known blocking IDs using the same Aloha protocol parameters. The basic idea of REB is to conduct statistical inference on the two sets of responses and estimate the number of genuine tags. We conduct extensive simulations to evaluate the performance of REB in terms of time-efficiency and estimation reliability. The experimental results reveal that REB runs tens of times faster than the fastest identification protocol under the same accuracy requirement.

Keywords—RFID Estimation, RFID Privacy, Blocker Tags.

I. INTRODUCTION

RFID systems have been widely used in a variety of applications such as supply chain management and inventory control [1]–[7], as the cost of commercial passive RFID tags is negligible compared with the value of the products to which they are attached (e.g., as low as 5 cents per tag [8]). For example, in Hong Kong International Airport, where RFID systems are used to track shipments, the average daily cargo tonnage in May 2010 was 12K tonnes and has been on the rise [9].

An RFID system typically consists of a reader and a population of tags [10]. A reader has a dedicated power source and significant computing capability. It transmits commands to query a set of tags, and the tags respond over a shared wireless medium. A tag is a microchip with an antenna in a compact package that has limited computing capability and a longer communication range than barcodes. There are two types of tags: (1) passive tags, which do not have their own power sources and are powered up by harvesting the radio frequency energy from readers; (2) active tags, which have their own power sources.

The widely used RFID tags raise serious privacy concerns because when a tag is interrogated by an RFID reader, whether or not the reader is authorized, it blindly responds

978-1-4799-8381-0/15/$31.00 ©2015 IEEE

with its ID and other stored information (such as manufacturer, product type, and price) in a broadcast fashion. For example, a woman may not want her dress sizes, and a patient may not want his/her medication, to be publicly known. An effective solution to this privacy issue is to use commercially available blocker tags [11], [12]. A blocker tag is an RFID device that is preconfigured with a set of known RFID tag IDs, which we call blocking IDs. The blocker tag behaves as if all tags with its blocking IDs are present. A blocker tag protects the privacy of the set of genuine tags whose IDs are among the blocking IDs of the blocker tag because any response from such a genuine tag is coupled with the simultaneous response from the blocker tag; thus, the two responses collide and attackers cannot obtain private information.

This paper is concerned with the problem of RFID (population size) estimation in the presence of a blocker tag. Formally, the problem is defined as follows: given (1) a set of unknown genuine tags 𝐺 of unknown size 𝑔, (2) a blocker tag with a set of known blocking IDs 𝐵, (3) a required confidence interval 𝛼 ∈ (0, 1], and (4) a required reliability 𝛽 ∈ [0, 1), we want to use one or more readers to compute the estimated number of genuine tags in 𝐺, denoted as 𝑔ˆ, so that 𝑃{∣𝑔ˆ − 𝑔∣ ≤ 𝛼𝑔} ≥ 𝛽. In other words, we have a set 𝐺 of genuine tags with an unknown number of unknown IDs and a set 𝐵 of tags with a known number of known IDs, and we want to estimate ∣𝐺∣ in the presence of 𝐵. The two sets 𝐺 and 𝐵 may overlap, as shown in Fig. 1. This problem may arise in many applications. For example, a jewelry store may want to use such an RFID estimation scheme to monitor its stock while a blocker tag is being used to protect the privacy of some precious items.

Fig. 1. Three types of IDs in the system containing blocker tags: IDs that correspond only to a blocking ID, IDs that correspond to both a blocking ID and a genuine tag, and IDs that correspond only to a genuine tag. The genuine tag IDs are unknown; the blocking IDs are known.

To the best of our knowledge, this paper is the first to investigate RFID estimation with the presence of a blocker tag. Although some RFID estimation schemes have been proposed [7], [10], [13]–[18], none of them considers the


presence of a blocker tag. Furthermore, none of them can be easily adapted to solve our problem. One might consider turning off the blocker tag and then using prior RFID estimation schemes to estimate the number of genuine tags. However, turning off the blocker tag gives attackers a time window to breach privacy, especially in scenarios where RFID estimation is performed continuously for monitoring purposes. Existing tree-walking based [19] and framed slotted Aloha based [20] RFID identification schemes can be used to exactly identify the genuine tags and thus obtain the genuine tag cardinality. However, they are too slow for our estimation purpose.

In this paper, we propose an RFID Estimation scheme with Blocker tags (REB). The communication protocol used by REB is the standard framed slotted Aloha protocol, in which a reader first broadcasts a value 𝑓 and a random number 𝑅 to the tags, where 𝑓 represents the number of time slots in the forthcoming frame. Then, each tag computes a hash using the random number 𝑅 and its ID, where the resulting hash value ℎ is within [0, 𝑓 − 1], and the tag replies during slot ℎ. For each slot, if no tag replies, we represent it as 0; if only one tag replies, we represent it as 1; if more than one tag replies, the tag responses collide, and we represent this slot as 𝑐. Note that a reader can detect whether there is a collision according to the C1G2 standard. Executing this protocol for the blocking IDs (simulated by the blocker tag) and genuine tags, we get a ternary array 𝔹𝔾[0..𝑓 − 1] where each element is 0, 1, or 𝑐. As we know the blocking IDs, we can virtually execute the framed slotted Aloha protocol using the same frame size 𝑓 and random number 𝑅 for the blocking IDs; thus, we get a ternary array 𝔹[0..𝑓 − 1] where each element is 0, 1, or 𝑐.

From the two arrays 𝔹𝔾[0..𝑓 − 1] and 𝔹[0..𝑓 − 1], we calculate two numbers: 𝑁00, which is the number of slots 𝑖 such that both 𝔹𝔾[𝑖] = 0 and 𝔹[𝑖] = 0, and 𝑁11, which is the number of slots 𝑖 such that both 𝔹𝔾[𝑖] = 1 and 𝔹[𝑖] = 1. REB is based on the key insight that, in general, the smaller 𝑁00 is, the larger ∣𝐵 ∪ 𝐺∣ is, and the larger 𝑁11 is, the larger ∣𝐵 − 𝐺∣ is. In this paper, we show that the expected 𝑁00 monotonically decreases as ∣𝐵 ∪ 𝐺∣ increases, and that the expected 𝑁11 monotonically increases as ∣𝐵 − 𝐺∣ increases. Thus, from the observed 𝑁00 and 𝑁11, we can estimate ∣𝐵 ∪ 𝐺∣ and ∣𝐵 − 𝐺∣. Then, we can calculate the size of 𝐺 because ∣𝐺∣ = ∣𝐵 ∪ 𝐺∣ − ∣𝐵 − 𝐺∣.

We make the following three key contributions in this paper. First, we make the first effort towards RFID estimation in the presence of a blocker tag. We propose the REB scheme, which jointly uses 𝑁00 and 𝑁11 to achieve an unbiased estimator for the genuine tag cardinality. The key technical development of this paper is on quantitatively and statistically correlating 𝑁00 with ∣𝐵 ∪ 𝐺∣ and 𝑁11 with ∣𝐵 − 𝐺∣. Second, we conduct thorough analysis to optimize system parameters, thereby achieving the required confidence interval and reliability at the fastest speed. Third, we implement REB in Matlab and evaluate its performance through extensive simulations. The experimental results reveal that our REB scheme runs tens of times faster than the fastest identification protocol under the same accuracy requirement.

The rest of this paper is organized as follows. In Section II, we describe REB and our theoretical analysis. In Section III, we conduct extensive simulations to evaluate the performance of REB. We discuss related work in Section IV. Finally, we conclude the paper in Section V.

II. REB PROTOCOL

In this section, we first describe the system model used in this paper. Then, we propose an efficient RFID Estimation scheme with Blocker tags (REB) that estimates the number of genuine tags by jointly using the 𝑁00 and 𝑁11 observed in a time frame. We explicitly give the functional estimator and point out that an estimate from a single time frame is unlikely to be accurate due to probabilistic variance. Hence, we propose to use multiple independent time frames to refine the estimation. This section further presents rigorous theoretical analysis to investigate how many frames are needed to guarantee the desired estimation accuracy and how to avoid premature protocol termination. We also investigate the parameter settings (i.e., 𝑓 and 𝑝) to optimize the performance of REB.

A. System Model

For clarity of presentation, we first consider an RFID system containing a single reader, a single blocker tag, and a population of genuine tags. Then, we will discuss how to extend REB to the scenario that deploys multiple readers and blocker tags. We represent the set of blocking IDs as 𝐵, whose cardinality is 𝑏. The set of genuine tags is denoted as 𝐺, whose cardinality is 𝑔. We use 𝑈 to denote the union tag set, i.e., 𝐵 ∪ 𝐺, and ∣𝑈∣ = 𝑢. The IDs in 𝐵 − 𝐺 do not correspond to any genuine tags; the cardinality of 𝐵 − 𝐺 is denoted as 𝑏′, i.e., 𝑏′ = ∣𝐵 − 𝐺∣. The reader communicates with tags (including both genuine tags and virtual ones simulated by the blocker tag) under the control of the backend server. The communication between the reader and the tags is time-slotted. Any two consecutive transmissions (from a tag to a reader or vice versa) are separated by a waiting time 𝜏𝑤 = 302 μs [10]. According to the specification of the Philips I-Code system [21], the wireless transmission rate from a tag to a reader is 53 Kb/s, that is, it takes a tag 𝜏𝑡 = 18.9 μs to transmit 1 bit.
The rate from a reader to a tag is 26.5 Kb/s, that is, transmitting 1 bit to the tags takes 𝜏𝑟 = 37.7 μs. Then, the time of a slot for transmitting 𝑚-bit information from a tag to the reader is 𝜏𝑤 + 𝑚 × 𝜏𝑡, and the time of a slot for transmitting 𝑚-bit information from a reader to the tags is 𝜏𝑤 + 𝑚 × 𝜏𝑟. The notations used throughout the paper are summarized in Table I.

TABLE I. NOTATIONS USED IN THE PAPER

Notation    Description
𝐺           set of genuine tags.
𝑔           cardinality of 𝐺, i.e., 𝑔 = ∣𝐺∣.
𝐵           set of blocking IDs.
𝑏′          cardinality of 𝐵 − 𝐺, i.e., 𝑏′ = ∣𝐵 − 𝐺∣.
𝑈           union set, 𝑈 = 𝐵 ∪ 𝐺.
𝑢           cardinality of 𝑈, i.e., 𝑢 = ∣𝐵 ∪ 𝐺∣.
𝛼           required confidence interval.
𝛽           required reliability.
𝑔ˆ          estimate of 𝑔.
𝑓           frame size.
𝑝           persistence probability.
𝐸(⋅)        expectation.
𝑉𝑎𝑟(⋅)      variance.
𝑍𝛽          the percentile of 𝛽, e.g., 𝑍𝛽 = 1.96 when 𝛽 = 95%.
𝑝00         probability that a slot pair is ⟨0, 0⟩.
𝑝11         probability that a slot pair is ⟨1, 1⟩.
𝑁00         number of persistent empty slots in a frame.
𝑁11         number of persistent singleton slots in a frame.

B. Protocol Description

REB uses the standard framed slotted Aloha protocol specified in EPC C1G2 [22] as the MAC layer communication mechanism. The reader initializes a slotted time frame by broadcasting a binary request ⟨𝑅, 𝑓⟩, where 𝑅 is a random number and 𝑓 is the frame size (i.e., the number of slots in the forthcoming frame). Using the received parameters ⟨𝑅, 𝑓⟩, each tag initializes its slot counter 𝑠𝑐 by calculating 𝑠𝑐 = 𝐻(𝐼𝐷, 𝑅) mod 𝑓, and the hashing result follows a uniform distribution within [0, 𝑓 − 1]. The reader broadcasts a QueryRep command at the end of each slot. Upon receiving QueryRep, a tag decrements its slot counter 𝑠𝑐 by 1. In a slot, a tag responds to the reader if its slot counter 𝑠𝑐 becomes 0. According to the occupation status, slots are classified into three types: empty slots, in which no tag responds; singleton slots, in which only one tag responds; and collision slots, in which two or more tags respond.

In the following, we present how REB estimates the number of genuine tags by observing the slots in a frame. Since the backend server has full knowledge of the simulated blocking IDs, it is able to predict which slots the blocking IDs are “mapped” to. Thus, it is able to construct a virtual ternary array 𝔹[0..𝑓 − 1]. An element in 𝔹[0..𝑓 − 1] is set to 0 when no blocking ID is mapped to this slot; 1 when only one blocking ID is mapped to this slot; 𝑐 when two or more blocking IDs are mapped to this slot (a hashing collision). On the other hand, by observing the frame, the reader gets another array 𝔹𝔾[0..𝑓 − 1], also consisting of 𝑓 elements. An element in 𝔹𝔾[0..𝑓 − 1] is set to 0 when no tag responds in this slot; 1 when only one tag responds in this slot; 𝑐 when two or more tags cause a collision in this slot. To distinguish a singleton slot from a collision slot, a tag does not need to respond with its whole 96-bit ID. To save time, each tag responds with the much shorter 16-bit RN16 [22]. Two slots with the same index in 𝔹[0..𝑓 − 1] and 𝔹𝔾[0..𝑓 − 1] are called a slot pair. In our scheme, the reader needs to record the numbers of the following two types of slot pairs.

𝑁00 is the number of persistent-empty slot pairs ⟨0, 0⟩ (i.e., 𝔹[𝑖] = 0 AND 𝔹𝔾[𝑖] = 0, 𝑖 ∈ [0, 𝑓 − 1]).

𝑁11 is the number of persistent-singleton slot pairs ⟨1, 1⟩ (i.e., 𝔹[𝑖] = 1 AND 𝔹𝔾[𝑖] = 1, 𝑖 ∈ [0, 𝑓 − 1]).

REB estimates the cardinality of genuine tags by jointly using the number of persistent-empty slots and that of persistent-singleton slots. A persistent-empty slot happens only when no ID in 𝑈 = 𝐵 ∪ 𝐺 is mapped to this index. Thus, 𝑁00 reflects the cardinality 𝑢 of 𝑈. Later, we will show that a monotone functional relationship can be established between 𝑢 and 𝑁00. REB uses this function to estimate 𝑢 from 𝑁00. Similarly, a persistent-singleton slot happens only when exactly one ID in 𝑈 is mapped to this index and that ID belongs to 𝐵 − 𝐺 (an ID in 𝐵 ∩ 𝐺 would make the genuine tag and the blocker tag collide). Therefore, 𝑁11 reflects the cardinality ∣𝐵 − 𝐺∣ (denoted as 𝑏′). Clearly, if we know 𝑢 and 𝑏′, we can get the cardinality 𝑔 of genuine tags by calculating 𝑔 = 𝑢 − 𝑏′. Counting 𝑁00 and 𝑁11 in a single frame may not be sufficient to satisfy the required estimation accuracy. To improve the accuracy, REB requires the reader to execute 𝑘 independent frames with different random numbers 𝑅.

Note that the frame size should be set to no more than 512 in practice [10], [19], [23] (the detailed reasons can be found in [19]). If a large number of tags contend for such a short frame, most slots will become collision slots. To scale to a large tag population, we exploit the method stated in [10]. Specifically, the reader uses a persistence probability 𝑝 ∈ (0, 1] to virtually extend the frame size 𝑓 to 𝑓/𝑝, but actually terminates the frame after the first 𝑓 slots. In effect, each tag participates in the actual frame of 𝑓 slots with probability 𝑝.

C. Functional Estimator

In this section, we derive the functional estimator 𝑔ˆ from 𝑁00 and 𝑁11 for the REB protocol in one frame. For an arbitrary slot pair, the probability that it is ⟨0, 0⟩, denoted as 𝑝00, is given as follows.

$$p_{00} = \left(1 - \frac{p}{f}\right)^{u} \approx e^{-\frac{up}{f}} \tag{1}$$

The approximation in Eq. (1) holds when 𝑓/𝑝 is relatively large [5], [10], [13]. The number of slot pairs ⟨0, 0⟩, i.e., 𝑁00, follows a binomial distribution with parameters (𝑓, 𝑝00). The expectation and variance of 𝑁00 are as follows.

$$E(N_{00}) = f \times p_{00} = f e^{-\frac{up}{f}} \tag{2}$$

$$Var(N_{00}) = f \times p_{00} \times (1 - p_{00}) = f e^{-\frac{up}{f}}\left(1 - e^{-\frac{up}{f}}\right) \tag{3}$$
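The slot-pair bookkeeping described in the protocol can be sketched in Python as follows. This is a minimal simulation under stated assumptions, not the authors' Matlab implementation: SHA-256 stands in for the tag hash 𝐻(𝐼𝐷, 𝑅), the persistence probability is modeled by hashing into a virtual frame of round(𝑓/𝑝) slots and running only the first 𝑓, and the names `slot_of` and `count_slot_pairs` are hypothetical. A genuine tag whose ID is also a blocking ID answers together with the blocker tag, so its slot is heard as a collision.

```python
import hashlib
import math


def slot_of(tag_id, seed, f, p):
    """Return the slot in [0, f-1] chosen by tag_id, or None if the tag sits out.

    Assumption: SHA-256 replaces H(ID, R); the frame is virtually extended to
    round(f / p) slots, so each ID lands in the real frame with probability ~p.
    """
    virtual_size = round(f / p)
    digest = hashlib.sha256(f"{tag_id}:{seed}".encode()).digest()
    h = int.from_bytes(digest[:8], "big") % virtual_size
    return h if h < f else None


def count_slot_pairs(blocking_ids, genuine_ids, seed, f, p):
    """Simulate one frame and return (N00, N11)."""
    b_count = [0] * f    # responders predicted from the known blocking IDs (array B)
    bg_count = [0] * f   # responders actually heard by the reader (array BG)
    for tag_id in blocking_ids:      # the blocker tag answers for every blocking ID
        s = slot_of(tag_id, seed, f, p)
        if s is not None:
            b_count[s] += 1
            bg_count[s] += 1
    for tag_id in genuine_ids:       # genuine tags answer for themselves
        s = slot_of(tag_id, seed, f, p)
        if s is not None:
            bg_count[s] += 1
    n00 = sum(1 for i in range(f) if b_count[i] == 0 and bg_count[i] == 0)
    n11 = sum(1 for i in range(f) if b_count[i] == 1 and bg_count[i] == 1)
    return n00, n11


if __name__ == "__main__":
    blocking = range(0, 2000)      # B: known blocking IDs
    genuine = range(1000, 3000)    # G: shares 1000 IDs with B, so u = 3000
    f, p, frames = 512, 0.2, 20
    pairs = [count_slot_pairs(blocking, genuine, seed, f, p) for seed in range(frames)]
    mean_n00 = sum(n00 for n00, _ in pairs) / frames
    print("mean N00:", mean_n00, "Eq. (2) predicts:", f * math.exp(-3000 * p / f))
```

Averaged over a few frames, the simulated 𝑁00 should sit close to the value predicted by Eq. (2).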

Similarly, we use 𝑝11 to denote the probability that a slot pair is ⟨1, 1⟩, which is given as follows.

$$p_{11} = \binom{b'}{1}\frac{p}{f}\left(1 - \frac{p}{f}\right)^{u-1} \approx \frac{b'p}{f}e^{-\frac{up}{f}} \tag{4}$$

The number of ⟨1, 1⟩ slot pairs, i.e., 𝑁11, also follows a binomial distribution with parameters (𝑓, 𝑝11). The expectation and variance of 𝑁11 are as follows.

$$E(N_{11}) = f \times p_{11} = b'p\,e^{-\frac{up}{f}} \tag{5}$$

$$Var(N_{11}) = f \times p_{11} \times (1 - p_{11}) = b'p\,e^{-\frac{up}{f}}\left(1 - \frac{b'p}{f}e^{-\frac{up}{f}}\right) \tag{6}$$

According to Eq. (2), 𝑢 can be expressed as follows.

$$u = -\frac{f}{p}\ln\left[\frac{E(N_{00})}{f}\right] \tag{7}$$

Dividing Eq. (5) by Eq. (2), we have:

$$\frac{E(N_{11})}{E(N_{00})} = \frac{b'p}{f} \;\Rightarrow\; b' = \frac{f\,E(N_{11})}{p\,E(N_{00})} \tag{8}$$

According to Eqs. (7)(8), 𝑔 is expressed as follows.

$$g = u - b' = -\frac{f}{p}\ln\left[\frac{E(N_{00})}{f}\right] - \frac{f\,E(N_{11})}{p\,E(N_{00})} \tag{9}$$

By substituting 𝑁00 for 𝐸(𝑁00) and 𝑁11 for 𝐸(𝑁11) in Eq. (9), we get the estimator of 𝑔 as follows.

$$\hat{g} = -\frac{f}{p}\ln\left(\frac{N_{00}}{f}\right) - \frac{f N_{11}}{p N_{00}} \tag{10}$$
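As a quick illustration of Eq. (10), the following Python sketch evaluates the single-frame estimator. The function name `estimate_genuine` and the example parameter values are assumptions made for illustration only. Feeding it the expected counts from Eqs. (2) and (5) should return the true 𝑢, 𝑏′ and 𝑔 up to floating-point error, which is a convenient sanity check on the algebra in Eqs. (7) to (9).

```python
import math


def estimate_genuine(n00, n11, f, p):
    """Apply Eq. (10) to one frame; returns (u_hat, b_prime_hat, g_hat)."""
    if n00 <= 0:
        raise ValueError("Eq. (10) is undefined when no persistent-empty slot is observed")
    u_hat = -(f / p) * math.log(n00 / f)   # Eq. (7) with N00 in place of E(N00)
    b_prime_hat = f * n11 / (p * n00)      # Eq. (8) with the observed counts
    return u_hat, b_prime_hat, u_hat - b_prime_hat


if __name__ == "__main__":
    # Hypothetical setting: u = 15000 IDs in B ∪ G, b' = 5000 IDs in B − G.
    f, p, u, b_prime = 512, 0.05, 15000, 5000
    n00 = f * math.exp(-u * p / f)             # E(N00), Eq. (2)
    n11 = b_prime * p * math.exp(-u * p / f)   # E(N11), Eq. (5)
    print(estimate_genuine(n00, n11, f, p))    # close to (15000, 5000, 10000)
```

With real observations 𝑁00 and 𝑁11 are integers that fluctuate around these expectations, which is why a single frame is rarely enough.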

That is, Eq. (10) specifies exactly how to use the observed 𝑁00 and 𝑁11 to estimate the cardinality 𝑔 of genuine tags. Theorem 1 presents the expectation and the variance of the estimate 𝑔ˆ, which are needed to investigate the probabilistic accuracy of the estimator.

Theorem 1. 𝑔ˆ in Eq. (10) is an unbiased estimator of 𝑔, that is, 𝐸(𝑔ˆ) = 𝑔. The variance of the estimator is

$$Var(\hat{g}) = \frac{1}{fp^2}e^{\frac{up}{f}}\left(b'^2p^2 + f^2 - b'fp\right) - \frac{f}{p^2}.$$

Proof: To get the expectation and variance of 𝑔ˆ, we use a method similar to that in [10]. Since the variance expression is different from those in [10], [24], we present the detailed proof for completeness. According to Eq. (10), 𝑔ˆ is a function of 𝑁00 and 𝑁11. Hence, we denote 𝑔ˆ as 𝜑(𝑁00, 𝑁11), that is, 𝑔ˆ = 𝜑(𝑁00, 𝑁11). We take the first-order Taylor series expansion [25] of 𝜑(𝑁00, 𝑁11) around (𝜂0, 𝜂1), where 𝜂0 = 𝐸(𝑁00) and 𝜂1 = 𝐸(𝑁11).

$$\varphi(N_{00}, N_{11}) \approx \varphi(\eta_0, \eta_1) + \left[(N_{00} - \eta_0)\frac{\partial\varphi}{\partial N_{00}} + (N_{11} - \eta_1)\frac{\partial\varphi}{\partial N_{11}}\right] \tag{11}$$

Taking the expectation of both sides of Eq. (11), and noting that 𝐸(𝑁00 − 𝜂0) = 𝐸(𝑁11 − 𝜂1) = 0 and that 𝜑(𝜂0, 𝜂1) = 𝑔 by Eq. (9), we have the following equation.

$$E[\varphi(N_{00}, N_{11})] = \varphi(\eta_0, \eta_1) + \frac{\partial\varphi}{\partial N_{00}}E(N_{00} - \eta_0) + \frac{\partial\varphi}{\partial N_{11}}E(N_{11} - \eta_1) = g \tag{12}$$

So far, 𝐸(𝑔ˆ) = 𝑔 is proved, that is, 𝑔ˆ is an unbiased estimator of 𝑔. In what follows, we investigate the variance of 𝑔ˆ.

$$\begin{aligned}
Var(\hat{g}) &= E[\hat{g} - E(\hat{g})]^2 \\
&= E\left[(N_{00} - \eta_0)\frac{\partial\varphi}{\partial N_{00}} + (N_{11} - \eta_1)\frac{\partial\varphi}{\partial N_{11}}\right]^2 \\
&= Var(N_{00})\left(\frac{\partial\varphi}{\partial N_{00}}\right)^2 + Var(N_{11})\left(\frac{\partial\varphi}{\partial N_{11}}\right)^2 + 2\,Cov(N_{00}, N_{11})\frac{\partial\varphi}{\partial N_{00}}\frac{\partial\varphi}{\partial N_{11}}
\end{aligned} \tag{13}$$

In the following, we present how to get the covariance 𝐶𝑜𝑣(𝑁00, 𝑁11). Since 𝐶𝑜𝑣(𝑁00, 𝑁11) = 𝐸(𝑁00𝑁11) − 𝐸(𝑁00)𝐸(𝑁11), we calculate 𝐸(𝑁00𝑁11) below.

$$\begin{aligned}
E(N_{00}N_{11}) &= \sum_{x=0}^{f}\sum_{y=0}^{f-x} xy\,P[N_{00} = x \wedge N_{11} = y] \\
&= \sum_{x=0}^{f}\sum_{y=0}^{f-x} xy\binom{f}{x}(p_{00})^{x}\binom{f-x}{y}(p_{11})^{y}(1 - p_{00} - p_{11})^{f-x-y} \\
&= p_{11}f\sum_{x=1}^{f}(f - x)\binom{f-1}{x-1}(p_{00})^{x}(1 - p_{00})^{f-x-1} \\
&= \frac{p_{00}p_{11}f^2}{1 - p_{00}}\sum_{x=1}^{f}\binom{f-1}{x-1}(p_{00})^{x-1}(1 - p_{00})^{f-x} \\
&\quad - \frac{f(f-1)(p_{00})^2p_{11}}{1 - p_{00}}\sum_{x=2}^{f}\binom{f-2}{x-2}(p_{00})^{x-2}(1 - p_{00})^{f-x} \\
&\quad - \frac{fp_{00}p_{11}}{1 - p_{00}}\sum_{x=1}^{f}\binom{f-1}{x-1}(p_{00})^{x-1}(1 - p_{00})^{f-x} \\
&= \frac{p_{00}p_{11}f^2}{1 - p_{00}} - \frac{f(f-1)(p_{00})^2p_{11}}{1 - p_{00}} - \frac{fp_{00}p_{11}}{1 - p_{00}} = f(f-1)p_{00}p_{11}
\end{aligned} \tag{14}$$

As required by Eq. (13), we also calculate the first-order partial derivatives of 𝜑(𝑁00, 𝑁11), evaluated at 𝑁00 = 𝜂0 and 𝑁11 = 𝜂1, as follows.

$$\left.\frac{\partial\varphi}{\partial N_{00}}\right|_{N_{00}=\eta_0,\,N_{11}=\eta_1} = e^{\frac{up}{f}}\left(\frac{b'}{f} - \frac{1}{p}\right), \qquad \left.\frac{\partial\varphi}{\partial N_{11}}\right|_{N_{00}=\eta_0,\,N_{11}=\eta_1} = -\frac{1}{p}e^{\frac{up}{f}} \tag{15}$$

We have obtained 𝐸(𝑁00𝑁11) in Eq. (14), 𝐸(𝑁00) in Eq. (2), and 𝐸(𝑁11) in Eq. (5). Thus, we can calculate 𝐶𝑜𝑣(𝑁00, 𝑁11) as follows.

$$Cov(N_{00}, N_{11}) = E(N_{00}N_{11}) - E(N_{00})E(N_{11}) = -fp_{00}p_{11} = -b'p\,e^{-\frac{2up}{f}} \tag{16}$$

By combining Eqs. (3) (6) (15) (16) into Eq. (13), we get the variance of 𝑔ˆ as follows.

$$Var(\hat{g}) = \frac{1}{fp^2}e^{\frac{up}{f}}\left(b'^2p^2 + f^2 - b'fp\right) - \frac{f}{p^2}, \tag{17}$$

where 𝑓 and 𝑝 are the frame size and the persistence probability used, respectively.

D. Refined Estimation with 𝑘 Frames

Because of probabilistic variance, the estimate 𝑔ˆ obtained from a single frame can hardly meet the predefined accuracy. By the law of large numbers [26], we issue 𝑘 independent frames and use the average estimation result $\hat{g}_k = \frac{1}{k}\sum_{j=1}^{k}\hat{g}_j$ to achieve a more accurate estimate in REB, where $\hat{g}_j$ is the estimate of 𝑔 derived from the 𝑗-th frame. Theorem 2 states how many independent frames are necessary to guarantee that the average estimate $\hat{g}_k$ satisfies the predefined (𝛼, 𝛽) accuracy.

Theorem 2. Suppose the reader performs 𝑘 independent frames. The average estimate $\hat{g}_k = \frac{1}{k}\sum_{j=1}^{k}\hat{g}_j$ guarantees the required (𝛼, 𝛽) accuracy if the frame number 𝑘 satisfies

$$k \ge \frac{Z_\beta}{g\alpha}\sqrt{\sum_{j=1}^{k}\left[\frac{1}{f_jp_j^2}e^{\frac{up_j}{f_j}}\left(b'^2p_j^2 + f_j^2 - b'f_jp_j\right) - \frac{f_j}{p_j^2}\right]},$$

where 𝑓𝑗 and 𝑝𝑗 are the frame size and persistence probability of the 𝑗-th frame, respectively.

Proof: We define $\hat{g}_k = \frac{1}{k}\sum_{j=1}^{k}\hat{g}_j$ as the average estimate of 𝑘 successive frames, where $\hat{g}_j$ is the estimate of the 𝑗-th frame, 𝑗 ∈ [1, 𝑘]. The reader initializes each frame with a different random seed. Hence, the estimates $\hat{g}_j$ are independent of each other. Thus, we have $E(\hat{g}_k) = \frac{1}{k}\sum_{j=1}^{k}E(\hat{g}_j) = g$ and $Var(\hat{g}_k) = \frac{1}{k^2}\sum_{j=1}^{k}Var(\hat{g}_j)$. Clearly, the average estimate $\hat{g}_k$ still converges to the actual cardinality 𝑔. Given a required reliability 𝛽, the actual confidence interval is $[g - Z_\beta\sqrt{Var(\hat{g}_k)},\; g + Z_\beta\sqrt{Var(\hat{g}_k)}]$, where 𝑍𝛽 is a percentile of 𝛽, e.g., if 𝛽 = 95%, 𝑍𝛽 is 1.96. To guarantee the required confidence interval 𝛼, we should guarantee:

$$\begin{cases} g + Z_\beta\sqrt{Var(\hat{g}_k)} \le g + g\alpha \\ g - Z_\beta\sqrt{Var(\hat{g}_k)} \ge g - g\alpha \end{cases}$$

Substituting $\frac{1}{k^2}\sum_{j=1}^{k}Var(\hat{g}_j)$ for $Var(\hat{g}_k)$ and solving the above inequalities, we have:

$$k \ge \frac{Z_\beta}{g\alpha}\sqrt{\sum_{j=1}^{k}Var(\hat{g}_j)} \tag{18}$$

According to Eq. (17), we have

$$Var(\hat{g}_j) = \frac{1}{f_jp_j^2}e^{\frac{up_j}{f_j}}\left(b'^2p_j^2 + f_j^2 - b'f_jp_j\right) - \frac{f_j}{p_j^2}.$$

Substituting it into Eq. (18), we have:

$$k \ge \frac{Z_\beta}{g\alpha}\sqrt{\sum_{j=1}^{k}\left[\frac{1}{f_jp_j^2}e^{\frac{up_j}{f_j}}\left(b'^2p_j^2 + f_j^2 - b'f_jp_j\right) - \frac{f_j}{p_j^2}\right]} \tag{19}$$
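The termination test in Eq. (19) can be sketched in Python as below. This is only an illustration under simplifying assumptions: the true 𝑢, 𝑏′ and 𝑔 are passed in directly (in REB they are unknown and must be estimated), every frame reuses the same (𝑓, 𝑝), and the helper names `var_g_hat`, `rhs_eq19` and `frames_needed` are hypothetical.

```python
import math


def var_g_hat(u, b_prime, f, p):
    """Single-frame variance of the estimator, Eq. (17)."""
    growth = math.exp(u * p / f)
    return growth * (b_prime**2 * p**2 + f**2 - b_prime * f * p) / (f * p**2) - f / p**2


def rhs_eq19(frames, u, b_prime, g, alpha, z_beta):
    """Right-hand side of Eq. (19) for a list of (f_j, p_j) pairs."""
    total = sum(var_g_hat(u, b_prime, f, p) for f, p in frames)
    return z_beta / (g * alpha) * math.sqrt(total)


def frames_needed(u, b_prime, g, f, p, alpha, z_beta, max_frames=10000):
    """Smallest k satisfying Eq. (19) when every frame uses the same (f, p)."""
    frames = []
    for k in range(1, max_frames + 1):
        frames.append((f, p))
        if k >= rhs_eq19(frames, u, b_prime, g, alpha, z_beta):
            return k
    raise RuntimeError("accuracy not reached within max_frames")


if __name__ == "__main__":
    # Hypothetical setting: u = 15000, b' = 5000, g = 10000, alpha = 10%,
    # z_beta = 1.645 (roughly the percentile for beta = 90%).
    print(frames_needed(15000, 5000, 10000, f=128, p=0.01175, alpha=0.1, z_beta=1.645))
```

Because the right-hand side grows like the square root of 𝑘 while the left-hand side grows linearly, the loop always terminates for finite variances.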

When the above inequality holds, the predefined (𝛼, 𝛽) accuracy is satisfied. After 𝑘 frames, the backend server calculates the right-hand side (R.H.S.) of Eq. (19). If the result is less than or equal to 𝑘, the estimation process terminates; otherwise, the next frame is issued. Note that we do not know the actual 𝑢, 𝑏′ and 𝑔. We propose to use the 𝑘 frames executed so far to estimate them, as shown in Eq. (20).

$$\begin{aligned}
\hat{u}_k &= \frac{1}{k}\sum_{j=1}^{k}\hat{u}_j = \frac{1}{k}\sum_{j=1}^{k}\left[-\frac{f_j}{p_j}\ln\left(\frac{N_{00j}}{f_j}\right)\right] \\
\hat{b'}_k &= \frac{1}{k}\sum_{j=1}^{k}\hat{b'}_j = \frac{1}{k}\sum_{j=1}^{k}\left[\frac{f_jN_{11j}}{p_jN_{00j}}\right] \\
\hat{g}_k &= \frac{1}{k}\sum_{j=1}^{k}\hat{g}_j = \frac{1}{k}\sum_{j=1}^{k}\left[-\frac{f_j}{p_j}\ln\left(\frac{N_{00j}}{f_j}\right) - \frac{f_jN_{11j}}{p_jN_{00j}}\right],
\end{aligned} \tag{20}$$

where $N_{00j}$ and $N_{11j}$ are the numbers of persistent empty slots and persistent singleton slots observed in the 𝑗-th frame.

E. Avoiding Premature Termination

In the execution of REB, we can get $\hat{u}_k$, $\hat{b'}_k$ and $\hat{g}_k$ after 𝑘 frames. However, these estimates are inaccurate due to probabilistic variance. If we directly use them to calculate the R.H.S. of Eq. (19), the computed value may happen to be no larger than 𝑘 even though the true R.H.S. is still larger than 𝑘, in which case REB terminates prematurely (i.e., the currently achieved accuracy has not met the required one yet). In the following, we address the issue of premature termination. First, we calculate the variances of $\hat{u}_k$, $\hat{b'}_k$ and $\hat{g}_k$ as follows. Note that we can obtain the variances of $\hat{u}_k$ and $\hat{b'}_k$ using a method similar to that used for $Var(\hat{g}_k)$.

$$\begin{aligned}
Var(\hat{u}_k) &= \frac{1}{k^2}\sum_{j=1}^{k}\frac{f_j}{p_j^2}\left(e^{\frac{\hat{u}_kp_j}{f_j}} - 1\right) \\
Var(\hat{b'}_k) &= \frac{1}{k^2}\sum_{j=1}^{k}e^{\frac{\hat{u}_kp_j}{f_j}}\left(\frac{\hat{b'}_k^{\,2}}{f_j} + \frac{\hat{b'}_k}{p_j}\right) \\
Var(\hat{g}_k) &= \frac{1}{k^2}\sum_{j=1}^{k}\left[\frac{1}{f_jp_j^2}e^{\frac{\hat{u}_kp_j}{f_j}}\left(\hat{b'}_k^{\,2}p_j^2 + f_j^2 - \hat{b'}_kf_jp_j\right) - \frac{f_j}{p_j^2}\right]
\end{aligned} \tag{21}$$

When calculating the R.H.S. of Eq. (19), we use $\hat{u}_{k\uparrow} = \hat{u}_k + \delta\sqrt{Var(\hat{u}_k)}$ to substitute 𝑢, $\hat{b'}_{k\uparrow} = \hat{b'}_k + \delta\sqrt{Var(\hat{b'}_k)}$ to substitute the first 𝑏′, $\hat{b'}_{k\downarrow} = \hat{b'}_k - \delta\sqrt{Var(\hat{b'}_k)}$ to substitute the second 𝑏′, and $\hat{g}_{k\uparrow} = \hat{g}_k + \delta\sqrt{Var(\hat{g}_k)}$ to substitute 𝑔. In Section III, simulation results demonstrate that this tactic can effectively avoid premature termination. The three-sigma rule [27] indicates that 𝛿 = 3 is large enough.

F. Dynamically Optimizing 𝑝 and 𝑓

In REB, the parameters 𝑝 and 𝑓 can significantly affect the protocol performance and thus need to be optimized. We use the information observed from the 𝑥 leading frames to facilitate the optimization of 𝑝 and 𝑓 in the (𝑥 + 1)-th frame. The optimization goal is to minimize the execution time while guaranteeing the required (𝛼, 𝛽) accuracy. In the first frame, we set 𝑓1 = 512. To coarsely set 𝑝1, we modify the scheme used in [10], [18], [19]. Specifically, the reader keeps issuing one-slot frames. The persistence probability decreases geometrically, 1/2, 1/4, 1/8, ⋅ ⋅ ⋅, i.e., the persistence probability in the 𝛾-th single-slot frame is $1/2^{\gamma}$. This process does not terminate until an empty slot appears. Assuming the ℓ-th slot is the first empty slot, we have a coarse estimate of 𝑢 of $2^{\ell}$ [18]. The persistence probability 𝑝1 of the first frame is simply set to $f/2^{\ell}$. In what follows, we describe how to optimize 𝑝 and 𝑓 for the (𝑥 + 1)-th frame (𝑥 ≥ 1). Since 𝑝 and 𝑓 jointly determine the total execution time, we first fix the 𝑓 value to get an optimized 𝑝. The range of 𝑓 is from 1 to 512 and its value should be 2, 4, 8, ⋅ ⋅ ⋅, 512, as suggested in the C1G2 standard. Thus, we can get an optimized 𝑓 by comparing all possible pairs of 𝑝 and 𝑓 (with only 9 possible 𝑓 values). Note that the proposed REB is not sensitive to the coarse settings of 𝑝1 and 𝑓1, because REB quickly converges to a near-optimal setting of 𝑝 and 𝑓 after a few frames, which will be demonstrated in the simulations.

1) Optimizing 𝑝: We optimize 𝑝 using a binary search method for a given 𝑓 value. The smaller the estimation variance of 𝑔ˆ is, the fewer frames are required, and hence the shorter the execution time (𝑓 × frame number) is. We therefore investigate how to optimize 𝑝 to minimize the estimation variance $Var(\hat{g})$ in Eq. (17). We prove that $Var(\hat{g})$ is a convex function of 𝑝 ∈ (0, 1] in Theorem 3. By virtue of the convex property, we have two claims: (i) There is an optimal $p_{op}$ ∈ (0, 1] minimizing the variance $Var(\hat{g})$; see Fig. 2 (a)(b) for an example. (ii) The first-order partial derivative $\frac{\partial Var(\hat{g})}{\partial p}$ presented in Eq. (22) satisfies $\frac{\partial Var(\hat{g})}{\partial p} \le 0$ for all $p \in (0, p_{op})$ and $\frac{\partial Var(\hat{g})}{\partial p} \ge 0$ for all $p \in (p_{op}, 1]$; see Fig. 2 (c)(d) for an example.

$$\frac{\partial Var(\hat{g})}{\partial p} = e^{\frac{up}{f}}\left(\frac{b'^2u}{f^2} + \frac{b' + u}{p^2} - \frac{b'u}{fp} - \frac{2f}{p^3}\right) + \frac{2f}{p^3} \tag{22}$$

Fig. 2. 𝑏 = 10000, 𝑏′ = 5000, 𝑔 = 10000, 𝑓 = 512. (a) The variance $Var(\hat{g})$ in Eq. (17) against 𝑝 ∈ [0.001, 1]. (b) Magnified plot of (a), 𝑝 ∈ [0.001, 0.2]. (c) The first-order partial derivative of $Var(\hat{g})$ in Eq. (22) against 𝑝 ∈ [0.001, 1]. (d) Magnified plot of (c), 𝑝 ∈ [0.005, 0.2]. The horizontal axis of each panel is the persistence probability 𝑝, and the optimal 𝑝 is marked in each panel.

Based on the above two claims, we propose a binary-search algorithm to get the optimal $p_{op}$ for the 𝑗-th frame, 𝑗 ≥ 2. In step 1 of Algorithm 1, 𝛿 specifies the maximum deviation between the output 𝑝 and its actual optimal value. Initially, $p_{high} = 1$ and $p_{low}$ is set to a small enough value, $1/\hat{u}_x$. By the while loop in steps 4∼12, $p_{high}$ and $p_{low}$ progressively approach the optimal $p_{op}$. When the gap between $p_{high}$ and $p_{low}$ is no more than 𝛿, the average of $p_{high}$ and $p_{low}$ is returned as the optimal 𝑝. The complexity of Algorithm 1 is $\Theta(\lg\frac{1}{\delta})$.

Algorithm 1: Optimizing $p_{x+1}$ for the (𝑥 + 1)-th frame.
Input: $\hat{u}_x$, $\hat{b'}_x$, $\hat{g}_x$, and 𝑓.
Output: The optimized $p_{x+1}$ for the (𝑥 + 1)-th frame.
1: 𝛿 = 0.0001;
2: $p_{low} = 1/\hat{u}_x$;
3: $p_{high} = 1$;
4: while $p_{high} - p_{low} > \delta$ do
5:   $p = (p_{low} + p_{high})/2$;
6:   Calculate $\frac{\partial Var(\hat{g})}{\partial p}$ in Eq. (22);
7:   if ($\frac{\partial Var(\hat{g})}{\partial p} > 0$) then
8:     $p_{high} = p$;
9:   else
10:    $p_{low} = p$;
11:  end if
12: end while
13: $p_{x+1} = (p_{low} + p_{high})/2$;
14: return $p_{x+1}$;

Theorem 3. $Var(\hat{g})$ in Eq. (17) is a convex function of 𝑝.

Proof: $Var(\hat{g})$ is a convex function of 𝑝 ∈ (0, 1] if its second-order partial derivative satisfies $\frac{\partial^2 Var(\hat{g})}{\partial p^2} > 0$ for all 𝑝 ∈ (0, 1]. We get its second-order partial derivative as follows.

$$\frac{\partial^2 Var(\hat{g})}{\partial p^2} = e^{\frac{up}{f}}\underbrace{\left(\frac{b'^2u^2}{f^3} + \frac{2b'u + u^2}{fp^2} - \frac{b'u^2}{f^2p} - \frac{2b' + 4u}{p^3} + \frac{6f}{p^4}\right)}_{\text{denoted as } \Re} - \frac{6f}{p^4} \tag{23}$$

As shown in Eq. (23), we denote $\left(\frac{b'^2u^2}{f^3} + \frac{2b'u + u^2}{fp^2} - \frac{b'u^2}{f^2p} - \frac{2b' + 4u}{p^3} + \frac{6f}{p^4}\right)$ as ℜ. In what follows, we first prove that ℜ is always larger than 0.

$$\begin{aligned}
\Re &= \left(b'^2u^2p^4 + 2b'uf^2p^2 + u^2f^2p^2 - b'u^2fp^3 - 2b'f^3p - 4uf^3p + 6f^4\right)/(f^3p^4) \\
&= \Big[\left(b'up^2 - \tfrac{1}{2}ufp\right)^2 + \left(\sqrt{2b'u}\,fp - \tfrac{\sqrt{2}}{2}f^2\right)^2 + \tfrac{1}{6}f^4 \\
&\qquad + \left(\tfrac{\sqrt{3}}{2}ufp - \tfrac{4\sqrt{3}}{3}f^2\right)^2 + 2\left(\sqrt{b'u} - b'\right)f^3p\Big]/(f^3p^4)
\end{aligned} \tag{24}$$

Since 𝑢 ≥ 𝑏′, we have $\sqrt{b'u} \ge b'$, so every term in the bracket of Eq. (24) is nonnegative and the term $\frac{1}{6}f^4$ is strictly positive; hence ℜ > 0. Using the fourth-order Taylor series expansion, we have $e^{\frac{up}{f}} > 1 + \frac{up}{f} + \frac{u^2p^2}{2f^2} + \frac{u^3p^3}{6f^3} + \frac{u^4p^4}{24f^4}$. According to Eqs. (23)(24), we have:

$$\begin{aligned}
\frac{\partial^2 Var(\hat{g})}{\partial p^2} &= e^{\frac{up}{f}}\Re - \frac{6f}{p^4} \\
&> \left(1 + \frac{up}{f} + \frac{u^2p^2}{2f^2} + \frac{u^3p^3}{6f^3} + \frac{u^4p^4}{24f^4}\right)\Re - \frac{6f}{p^4} \\
&= \frac{1}{24}\left(\frac{u^3p}{2f^2\sqrt{f}} - \frac{u^3b'p^2}{f^3\sqrt{f}}\right)^2 + \frac{5u^6p^2}{1152f^5} + \frac{p^2}{2f^5}\left(u^2b' - \frac{u^3}{12}\right)^2 \\
&\quad + \frac{1}{f^3}\left(\sqrt{\tfrac{2}{3}}\,ub' - \sqrt{\tfrac{3}{128}}\,\frac{u^3p}{f}\right)^2 + \frac{u^2}{3f^3}\left(b' - \frac{u}{2}\right)^2 \\
&\quad + \frac{u^3b'^2p}{f^4} + \frac{u^5b'^2p^3}{6f^6} + \frac{2u - 2b'}{p^3} > 0
\end{aligned} \tag{25}$$

Eq. (25) indicates that $\frac{\partial^2 Var(\hat{g})}{\partial p^2} > 0$ for all 𝑝 ∈ (0, 1], which is sufficient to prove that $Var(\hat{g})$ is a convex function of 𝑝 ∈ (0, 1].

2) Optimizing 𝑓: We optimize 𝑓 for the (𝑥 + 1)-th frame. Considering the C1G2 standard and practical constraints [10], 𝑓 should take a value from 2, 4, 8, ⋅ ⋅ ⋅, 512. To improve the time-efficiency, it is reasonable to minimize the expected remaining execution time. We denote the minimum number of frames that still need to be executed as 𝑦; our goal is:

$$\text{Minimize } (f + 1) \times y \tag{26}$$

In Eq. (26), 𝑓 + 1 means that a slot for transmitting protocol parameters is followed by an 𝑓-slot frame. According to the termination condition in Eq. (18), the value of 𝑦 should satisfy the following inequality:

$$x + y \ge \frac{Z_\beta}{g\alpha}\sqrt{\sum_{j=1}^{x}Var(\hat{g}_j) + y\,Var(\hat{g})} \tag{27}$$

By solving the above inequality, we know that the minimum frame number 𝑦 that needs to be further executed is:

$$y = \left\lceil \frac{Z_\beta^2\,Var(\hat{g})}{2g^2\alpha^2} - x + \sqrt{\left[\frac{Z_\beta^2\,Var(\hat{g})}{2g^2\alpha^2} - x\right]^2 + \frac{Z_\beta^2}{g^2\alpha^2}\sum_{j=1}^{x}Var(\hat{g}_j) - x^2}\,\right\rceil \tag{28}$$

Here, $Var(\hat{g}_j) = \frac{1}{f_jp_j^2}e^{\frac{up_j}{f_j}}\left(b'^2p_j^2 + f_j^2 - b'f_jp_j\right) - \frac{f_j}{p_j^2}$, 𝑗 ∈ [1, 𝑥], and $Var(\hat{g}) = \frac{1}{fp^2}e^{\frac{up}{f}}\left(b'^2p^2 + f^2 - b'fp\right) - \frac{f}{p^2}$. The values 𝑔, 𝑢, 𝑏′ can be approximated by $\hat{g}_x$, $\hat{u}_x$, $\hat{b'}_x$, respectively.

For any 𝑓 ∈ {2, 4, 8, ⋅ ⋅ ⋅, 512}, we can use Algorithm 1 to get the corresponding optimal 𝑝. Given each pair (𝑓, 𝑝), the smallest 𝑦 can be obtained from Eq. (28). Thus, we can get the optimal 𝑓 from Eq. (26). The computational complexity of optimizing 𝑝 and 𝑓 is bounded by $\Theta(9\lg\frac{1}{\delta})$.
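The two-step parameter search described above (Algorithm 1 for 𝑝, then Eqs. (26) and (28) for 𝑓) can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the inputs 𝑢, 𝑏′ and 𝑔 stand for whatever estimates are available, and the guard that treats exponents above 700 as a positive derivative is an added assumption to avoid floating-point overflow for large 𝑝 and small 𝑓.

```python
import math

FRAME_SIZES = [2 ** i for i in range(1, 10)]   # 2, 4, 8, ..., 512


def var_g_hat(u, b_prime, f, p):
    """Single-frame variance, Eq. (17)."""
    growth = math.exp(u * p / f)
    return growth * (b_prime**2 * p**2 + f**2 - b_prime * f * p) / (f * p**2) - f / p**2


def dvar_dp(u, b_prime, f, p):
    """First-order derivative of the variance with respect to p, Eq. (22)."""
    a = u * p / f
    if a > 700:   # assumption: exp() would overflow; the derivative is positive here
        return math.inf
    inner = b_prime**2 * u / f**2 + (b_prime + u) / p**2 - b_prime * u / (f * p) - 2 * f / p**3
    return math.exp(a) * inner + 2 * f / p**3


def optimize_p(u, b_prime, f, tol=1e-4):
    """Binary search of Algorithm 1: shrink [p_low, p_high] around the sign change."""
    p_low, p_high = 1.0 / u, 1.0
    while p_high - p_low > tol:
        p = (p_low + p_high) / 2
        if dvar_dp(u, b_prime, f, p) > 0:
            p_high = p
        else:
            p_low = p
    return (p_low + p_high) / 2


def frames_remaining(var_next, past_vars, g, alpha, z_beta):
    """Minimum number of further frames y, Eq. (28)."""
    x = len(past_vars)
    c = (z_beta / (g * alpha)) ** 2
    half = c * var_next / 2 - x
    disc = half**2 + c * sum(past_vars) - x**2
    if disc < 0:   # Eq. (27) already holds for every y >= 0
        return 0
    return max(0, math.ceil(half + math.sqrt(disc)))


def choose_frame_parameters(u, b_prime, g, past_vars, alpha, z_beta):
    """Pick the (f, p) pair minimizing (f + 1) * y, Eq. (26)."""
    best = None
    for f in FRAME_SIZES:
        p = optimize_p(u, b_prime, f)
        y = frames_remaining(var_g_hat(u, b_prime, f, p), past_vars, g, alpha, z_beta)
        cost = (f + 1) * y
        if best is None or cost < best[0]:
            best = (cost, f, p)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical inputs: u = 15000, b' = 5000, g = 10000, no frames executed yet.
    print(choose_frame_parameters(15000, 5000, 10000, [], alpha=0.1, z_beta=1.645))
```

Since only nine frame sizes are tried and each binary search halves the interval until it is narrower than the tolerance, the whole search is cheap enough to rerun before every frame.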

G. REB with Multiple Readers and Blocker Tags

In practice, a single reader cannot probe all the tags due to the limited communication ranges [10]. Similarly, a single blocker tag cannot “protect” all privacy-sensitive tags that may be distributed across a large area. A solution is to deploy multiple readers and blocker tags with overlapping regions to cover the whole monitoring area. We assume all the readers and blocker tags are well synchronized by existing scheduling schemes [28]–[30]. All parameters 𝑓, 𝑝, 𝑅 involved in REB are the same across all readers. In what follows, we present how to distributively construct the global 𝔹𝔾[0..𝑓 − 1]. For an arbitrary slot 𝑠 in a frame, if all the readers observe an empty slot, the backend server sets 𝔹𝔾[𝑠] = 0; if no reader senses a collision and all the received RN16s are the same, the backend server sets 𝔹𝔾[𝑠] = 1; if at least one reader senses a collision or different RN16s are observed by different readers, the backend server sets 𝔹𝔾[𝑠] = 𝑐. Based on these rules, the backend server is able to generate a global actual array 𝔹𝔾[0..𝑓 − 1]. Logically, all the readers co-work like a ‘super reader’ that is able to cover the whole area. The rest of REB in multi-reader scenarios is the same as what former sections have described.

III. PERFORMANCE EVALUATION

In this section, we conduct simulations to evaluate the performance of REB in a large-scale RFID system that contains thousands of tags. The simulators were implemented using MATLAB. Since REB in the multi-reader scenario is logically the same as that in the single-reader scenario, this paper only simulates a single reader, following prior literature [7], [10], [16], [31]. The setting of slot length is based on what we specified in Section II-A. In the following, we first verify the effectiveness of our optimization methods on 𝑓 and 𝑝. Then, we conduct simulations to evaluate the actual estimation reliability of REB and its time-efficiency. We run each simulation 1000 times and report the average results.

A. Verifying the Optimized 𝑓 and 𝑝

The setting of parameters 𝑓 and 𝑝 is important to the performance of REB. To achieve the overall optimal 𝑓 and 𝑝, it is necessary to know the values of 𝑢, 𝑏′ and 𝑔 before the execution of REB, which, however, are what we want to estimate. Using the simulation conditions shown in Fig. 3, the overall optimal 𝑓 is 128, and the overall optimal 𝑝 is 0.01175, which are calculated from the actual values of 𝑢, 𝑏′ and 𝑔. In the simulations corresponding to Fig. 3, we aim to verify the convergence of 𝑓 and 𝑝 to their overall optimal values. Results in Fig. 3 (a) demonstrate that about 28.5% of the independent simulations correctly take the overall optimal 𝑓 = 128 in the 3rd frame. This ratio reaches 50.5% in the 4th frame; that is, REB has a good chance to take the overall optimal 𝑓 just after the 3rd frame. The simulation results in Fig. 3 (b) demonstrate that the persistence probability 𝑝 approaches its overall optimal value frame by frame. The value of 𝑝 taken in the 4th frame is very close to the optimal value 0.01175. All in all, 𝑓 and 𝑝 approach their overall optimal values frame by frame. The underlying reason is that more frames increase the estimation accuracy of 𝑢, 𝑏′ and 𝑔, which eventually facilitates the optimization of 𝑓 and 𝑝.

[Figure 3. Panel (a): ratio of the taken frame size (𝑓 = 32, 64, 128, 256, 512) in the 𝑗th frame, 𝑗 = 1 to 4. Panel (b): actual and optimal persistence probability in the 𝑗th frame, 𝑗 = 1 to 4.]

Fig. 3. Verifying the optimized settings of 𝑓 and 𝑝. ∣𝐵 − 𝐺∣ = 5000, ∣𝐵 ∩ 𝐺∣ = 5000, ∣𝐺 − 𝐵∣ = 5000. 𝛼 = 10%, 𝛽 = 90%. (a) Verifying the optimized 𝑓. (b) Verifying the optimized 𝑝.

B. Estimation Reliability

One of the most important performance metrics for estimation protocols is the actual reliability. In an arbitrary simulation, if the estimate $\hat{g}$ is within [𝑔(1 − 𝛼), 𝑔(1 + 𝛼)], we refer to it as a successful estimation. We count the successful estimations among 1000 independent simulations. The ratio, i.e., number of successes/1000, is treated as the actual reliability. Simulation results in Fig. 4 reveal that REB (𝛿 = 0) does not always meet the required reliability (i.e., 𝛽 = 95%). The reason is that $\hat{u}_k$, $\hat{b}'_k$ and $\hat{g}_k$ have variances of their own, which are ignored when these estimates are used directly to evaluate the termination condition. By taking their variances into consideration, the proposed 𝛿-sigma-based termination tactic effectively avoids premature termination. Simulation results in Fig. 4 reveal that the actual reliability of REB (𝛿 = 1) and REB (𝛿 = 2) is always higher than the required one.

[Figure 4. Panel (a): estimation reliability versus cardinality of tag set 𝑈. Panel (b): estimation reliability versus tag ratio ∣𝐵 − 𝐺∣:∣𝐵 ∩ 𝐺∣:∣𝐺 − 𝐵∣. Curves are shown for several values of 𝛿, with the premature-termination cases marked.]

Fig. 4. Evaluating the reliability of REB. 𝛼 = 5%, 𝛽 = 95%. (a) Tag ratio ∣𝐵 − 𝐺∣:∣𝐵 ∩ 𝐺∣:∣𝐺 − 𝐵∣ is fixed to 1 : 1 : 1, and 𝑢 varies from 3000 to 21000. (b) 𝑢 is fixed to 9000, and tag ratio varies.
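The actual-reliability metric used in this subsection (the fraction of runs whose estimate falls within [𝑔(1 − 𝛼), 𝑔(1 + 𝛼)]) can be computed as in the following minimal Python sketch. The function names are hypothetical, and the list of estimates is assumed to come from independent simulation runs that are not modelled here.

```python
def is_success(g_hat, g, alpha):
    """True if the estimate lies within the confidence interval [g(1-alpha), g(1+alpha)]."""
    return g * (1 - alpha) <= g_hat <= g * (1 + alpha)


def actual_reliability(estimates, g, alpha):
    """Fraction of independent runs whose estimate is a successful estimation."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    successes = sum(1 for g_hat in estimates if is_success(g_hat, g, alpha))
    return successes / len(estimates)


def meets_requirement(estimates, g, alpha, beta):
    """True if the measured reliability reaches the required reliability beta."""
    return actual_reliability(estimates, g, alpha) >= beta
```

With 1000 runs per setting, as in the simulations above, `actual_reliability` returns the number of successes divided by 1000, and `meets_requirement` compares that value against 𝛽.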

C. Time Efficiency

Besides the estimation reliability, another important metric is time-efficiency. In this subsection, we evaluate the time-efficiency of the protocols given the same estimation accuracy. No existing estimation protocol can correctly approximate the cardinality of genuine tags in an RFID system with the presence of blocker tags. The only possible solution, to the best of our knowledge, is to run a complete identification protocol to identify the tags in the system. Hence, we compare REB with two representative identification protocols: one is the Tree Hopping (TH) protocol [19]; the other one is the Enhanced Dynamic Framed Slotted ALOHA (EDFSA) protocol [20]. The TH protocol terminates after it traverses the whole tree. TH can identify not only the IDs in (𝐵 − 𝐺) ∪ (𝐺 − 𝐵), when a queried prefix is followed by a successful read, but also the IDs in 𝐵 ∩ 𝐺, when a prefix whose length equals that of a tag ID is still followed by a collision. Then, we can get the set 𝐺 by calculating [(𝐵 − 𝐺) ∪ (𝐺 − 𝐵) − 𝐵] ∪ (𝐵 ∩ 𝐺). The cardinality 𝑔 is obtained once 𝐺 is known. As for the EDFSA protocol, it executes frames round by round. In a round, only the IDs in (𝐵 − 𝐺) ∪ (𝐺 − 𝐵) have a chance to be identified. We denote the set of identified IDs as $S_{ident}$. Since the reader does not know whether all IDs in (𝐵 − 𝐺) ∪ (𝐺 − 𝐵) are completely identified, or even what percentage of them are identified, EDFSA cannot terminate by itself. For the sake of EDFSA, we assume it can “intelligently” terminate once [∣(𝐵 − 𝐺) ∪ (𝐺 − 𝐵)∣ − ∣$S_{ident}$∣] < ∣𝐺∣ × 𝛼.

1) Impact of Tag Cardinality: To investigate the impact of tag cardinality on the protocols’ execution time, we fix the tag ratio ∣𝐵 − 𝐺∣:∣𝐵 ∩ 𝐺∣:∣𝐺 − 𝐵∣ to 1:1:1, and vary 𝑢 (indicating the system scale) from 20000 to 50000. The simulation results in Fig. 5 demonstrate that our REB significantly outperforms TH and EDFSA. For example, when 𝑢 = 50000, REB (𝛿 = 0) runs about 44 times faster than EDFSA, and nearly 920 times faster than TH; while REB (𝛿 = 1) runs 33 times faster than EDFSA, and 682 times faster than TH. Moreover, the execution time of TH and EDFSA grows linearly as 𝑢 increases. In contrast, our REB has a stable execution time, which reveals its good scalability against tag cardinality 𝑢.

[Figure 5. Execution time (s), logarithmic scale, versus cardinality of tag set 𝑈.]

Fig. 5. Evaluating the time-efficiency of protocols with varying 𝑢. Tag ratio of ∣𝐵 − 𝐺∣:∣𝐵 ∩ 𝐺∣:∣𝐺 − 𝐵∣ is fixed to 1 : 1 : 1 and 𝛼 = 5%, 𝛽 = 95%.

2) Impact of Tag Ratio: The tag ratio ∣𝐵 − 𝐺∣ : ∣𝐵 ∩ 𝐺∣ : ∣𝐺 − 𝐵∣ may have a significant impact on the execution time of protocols. Here, we fix 𝑢 = 30000, and evaluate the execution time of protocols with varying tag ratio. The simulation results in Fig. 6 demonstrate that our REB still outperforms the existing protocols by significantly reducing the execution time. Moreover, the results in Fig. 6 clearly show the performance trend of the protocols with varying tag ratio, which is elaborated below. The results in Fig. 6 (a) reveal that the larger the ratio of tags in 𝐵 − 𝐺 is, the larger the execution time of our REB scheme is. The underlying reason is that more tags in the set 𝐵 − 𝐺 incur more interference to the process of estimating genuine tags. We make two further observations from Fig. 6 (b), which shows the execution time of the protocols with varying ratio of tags in 𝐵 ∩ 𝐺. First, the performance of identification protocols deteriorates as the ratio of tags in 𝐵 ∩ 𝐺 increases. The reason is that more tags in 𝐵 ∩ 𝐺 cause more blocking collisions, which seriously interfere with the tag identification process. Second, the larger the ratio of tags in 𝐵 ∩ 𝐺 is, the smaller the execution time of our REB scheme is. The underlying intuitive reason is that a larger ratio of tags in 𝐵 ∩ 𝐺 leads to a smaller ratio of tags in 𝐵 − 𝐺, which decreases the interference of tags in 𝐵 − 𝐺 to the process of estimating 𝑔. For a similar reason, the results in Fig. 6 (c) reveal that the execution time of REB also decreases as the ratio of tags in 𝐺 − 𝐵 increases.

[Figure 6. Execution time (s) in three panels: (a) the ratio of tags in 𝐵 − 𝐺 varies; (b) the ratio of tags in 𝐵 ∩ 𝐺 varies; (c) the ratio of tags in 𝐺 − 𝐵 varies.]

Fig. 6. Evaluating the time-efficiency of protocols with varying tag ratio. 𝑢 is fixed to 30000, and 𝛼 = 5%, 𝛽 = 95%.
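The two baseline rules described in this subsection, namely recovering 𝐺 from a completed TH run via [(𝐵 − 𝐺) ∪ (𝐺 − 𝐵) − 𝐵] ∪ (𝐵 ∩ 𝐺) and the assumed “intelligent” stopping rule granted to EDFSA, can be written out as plain set operations. The Python sketch below is illustrative only: the function and argument names are hypothetical, and the inputs are assumed to be what an identification run would have collected.

```python
def genuine_ids_from_th(single_read_ids, full_length_collision_ids, blocking_ids):
    """Recover G as [((B - G) | (G - B)) - B] | (B & G).

    single_read_ids: IDs read successfully, i.e. (B - G) | (G - B).
    full_length_collision_ids: IDs whose full-length prefix still collides, i.e. B & G.
    blocking_ids: the known blocking ID set B.
    """
    only_genuine = set(single_read_ids) - set(blocking_ids)  # G - B
    return only_genuine | set(full_length_collision_ids)     # (G - B) | (B & G)


def edfsa_may_stop(num_identifiable, num_identified, g, alpha):
    """Stopping rule assumed for EDFSA: remaining unidentified IDs < |G| * alpha."""
    return (num_identifiable - num_identified) < g * alpha
```

For instance, with 𝐵 = {1, 2, 3, 4} and 𝐺 = {3, 4, 5, 6}, the singly read IDs are {1, 2, 5, 6} and the colliding full-length IDs are {3, 4}; removing 𝐵 from the former leaves {5, 6}, and the union with {3, 4} gives back 𝐺.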

IV. RELATED WORK

In the infant stage of RFID study, a great deal of attention was paid to the problem of tag identification, which aims to identify the exact tag IDs. Generally, there are two types of identification protocols: Aloha-based protocols [32] and tree-based protocols [19]. Fundamentally, the Aloha-based protocol is a kind of Time Division Multiple Access (TDMA) mechanism. A tag ID can be successfully identified in a slot when only one tag responds in this slot. As for tree-based protocols, the reader broadcasts a 0/1 string to query the tags. A tag responds with its ID once it finds that the querying string is the prefix of its ID. A reader can successfully identify a tag ID when only one tag responds. Clearly, the execution time of identification protocols is proportional to the tag population size. What is worse, in an RFID system with the presence of blocker tags, the performance of identification protocols further deteriorates because of the blocking collisions caused by IDs in 𝐵 ∩ 𝐺.

To quickly report the tag cardinality for various purposes such as timely stock monitoring, a great effort has been made to study the problem of tag estimation [5], [7], [10], [13]–[18], [23], [33]–[35]. These estimation protocols leverage the observations from Aloha/tree protocols to statistically estimate the tag cardinality. For example, M. Shahzad et al. proposed the Average Run based Tag estimation (ART), which observes the average length of sequences of consecutive non-empty slots [10]. To the best of our knowledge, none of these estimation protocols can address the problem of genuine tag estimation, because they cannot exclude the interference from blocker tags.

RFID privacy is of great importance but is threatened by malicious scanning. Ari Juels et al. proposed the blocker tags to protect the privacy-sensitive tags from malicious scanning [11]. Every coin has two sides. Ehsan Vahedi et al. indicated that the blocking technique causes a new threat to the RFID system. Specifically, a malicious blocker tag can prevent the valid reader from reading the tags. An efficient scheme was proposed to detect the existence of an attacker in the RFID system [36]. Following the original purpose of proposing blocker tags, this paper still leverages the blocker tags to protect the privacy of genuine RFID tags.

V. CONCLUSION

This paper formally defines a new problem of genuine tag cardinality estimation with the presence of blocker tags. To efficiently address this practically important problem, we propose the RFID Estimation scheme with Blocker tags (REB), which is compliant with the commodity EPC C1G2 standard and does not require any modifications to off-the-shelf RFID tags. REB provides an unbiased functional estimator which can guarantee any degree of estimation accuracy specified by the users. Using REB, a retailer can timely monitor the product stock while blocker tags are being used to protect the privacy of some important items. Extensive simulation results reveal that REB is tens of times faster than the fastest identification protocol with the same accuracy requirement.

ACKNOWLEDGMENT

This work is supported by the National Science Foundation for Distinguished Young Scholars of China (Grant No. 61225010); the State Key Program of National Natural Science of China (Grant No. 61432002); NSFC under Grant nos. of 61173161, 61173162, 61272417, 61300187, 61300189, 61370198, 61370199, 61472184, 61321491 and 61272546; HK RGC PolyU G-YM08; NSF grants ECCS 1231461, ECCS 1128209, CNS 1138963, CNS 1065444, and CCF 1028167; the Jiangsu Future Internet Program under Grant No. BY2013095-4-08, and the Jiangsu High-level Innovation and Entrepreneurship (Shuangchuang) Program.

REFERENCES

[1] K. Finkenzeller, “RFID Handbook: Fundamentals and Applications in Contactless Smart Cards, Radio Frequency Identification and Near-Field Communication,” Wiley, 2010.
[2] L. Yang, Y. Chen, X.-Y. Li, C. Xiao, M. Li, and Y. Liu, “Tagoram: Real-Time Tracking of Mobile RFID Tags to High Precision Using COTS Devices,” Proc. of ACM MobiCom, 2014.
[3] S. Qi, Y. Zheng, M. Li, L. Lu, and Y. Liu, “COLLECTOR: A Secure RFID-Enabled Batch Recall Protocol,” Proc. of IEEE INFOCOM, 2014.
[4] T. Liu, L. Yang, Q. Lin, and Y. Liu, “Anchor-free Backscatter Positioning for RFID Tags with High Accuracy,” Proc. of IEEE INFOCOM, 2014.
[5] L. Xie, H. Han, Q. Li, J. Wu, and S. Lu, “Efficiently Collecting Histograms Over RFID Tags,” Proc. of IEEE INFOCOM, 2014.
[6] J. Liu, B. Xiao, K. Bu, and L. Chen, “Efficient Distributed Query Processing in Large RFID-enabled Supply Chains,” Proc. of IEEE INFOCOM, 2014.
[7] Y. Zheng and M. Li, “PET: Probabilistic Estimating Tree for Large-scale RFID Estimation,” IEEE Transactions on Mobile Computing, vol. 11, no. 11, pp. 1763–1774, 2012.
[8] M. Roberti, “A 5-cent Breakthrough,” RFID Journal, vol. 5, no. 6, 2006.
[9] “http://www.centreforaviation.com/news/sharemarket/2010/06/17/hong-kong-airport-sets-new-cargo-traffic-recordfedex-sees-surging-asian-exports/page1.”
[10] M. Shahzad and A. X. Liu, “Every Bit Counts: Fast and Scalable RFID Estimation,” Proc. of ACM MobiCom, 2012.
[11] A. Juels, R. L. Rivest, and M. Szydlo, “The Blocker Tag: Selective Blocking of RFID Tags for Consumer Privacy,” Proc. of ACM CCS, 2003.
[12] “http://www.informationweek.com/rsa-unveils-rfid-tag-blocker/d/did/1023433?”
[13] M. Kodialam and T. Nandagopal, “Fast and Reliable Estimation Schemes in RFID Systems,” Proc. of ACM MobiCom, 2006.
[14] M. Kodialam, T. Nandagopal, and W. C. Lau, “Anonymous Tracking using RFID tags,” Proc. of IEEE INFOCOM, 2007.
[15] C. Qian, H. Ngan, Y. Liu, and L. M. Ni, “Cardinality Estimation for Large-scale RFID Systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 9, pp. 1441–1454, 2011.
[16] B. Chen, Z. Zhou, and H. Yu, “Understanding RFID Counting Protocols,” Proc. of ACM MobiCom, 2013.
[17] Y. Zheng and M. Li, “ZOE: Fast Cardinality Estimation for Large-Scale RFID Systems,” Proc. of IEEE INFOCOM, 2013.
[18] T. Li, S. Wu, S. Chen, and M. Yang, “Energy Efficient Algorithms for the RFID Estimation Problem,” Proc. of IEEE INFOCOM, 2010.
[19] M. Shahzad and A. X. Liu, “Probabilistic Optimal Tree Hopping for RFID Identification,” Proc. of ACM SIGMETRICS, 2013.
[20] S. Lee, S. Joo, and C. Lee, “An Enhanced Dynamic Framed Slotted ALOHA Algorithm for RFID Tag Identification,” Proc. of IEEE MobiQuitous, 2005.
[21] P. Semiconductors, “I-CODE Smart Label RFID Tags,” http://www.nxp.com/acrobat_download/other/identification/SL092030.pdf, 2004.
[22] E. Inc, “Radio-frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications at 860 MHz-960 MHz,” EPCGlobal, Inc, 1.2.0 ed., 2008.
[23] W. Gong, K. Liu, X. Miao, Q. Ma, Z. Yang, and Y. Liu, “Informative Counting: Fine-grained Batch Authentication for Large-Scale RFID Systems,” Proc. of ACM MobiHoc, 2013.
[24] X. Liu, K. Li, H. Qi, B. Xiao, and X. Xie, “Fast Counting the Key Tags in Anonymous RFID Systems,” Proc. of IEEE ICNP, 2014.
[25] D. E. Smith, “A source book in mathematics,” Courier Dover Publications, 2012.
[26] M. Schilling, “Understanding Probability: Chance Rules in Everyday Life,” The American Statistician, vol. 60, no. 1, pp. 97–98, 2006.
[27] S. N. V and D.-B. IV, “Mathematische Statistik in der Technik,” Deutscher Verl. der Wissenschaften, 1963.
[28] L. Yang, J. Han, Y. Qi, C. Wang, T. Gu, and Y. Liu, “Season: Shelving Interference and Joint Identification in Large-Scale RFID Systems,” Proc. of IEEE INFOCOM, 2011.
[29] S. Tang, J. Yuan, M. Li, G. Chen, Y. Liu, and J. Zhao, “Raspberry: A Stable Reader Activation Scheduling Protocol in Multi-reader RFID Systems,” Proc. of IEEE ICNP, 2009.
[30] J. Waldrop, D. W. Engels, and S. E. Sarma, “Colorwave: An Anticollision Algorithm for the Reader Collision Problem,” 2003.
[31] T. Li, S. Chen, and Y. Ling, “Identifying the Missing Tags in a Large RFID System,” Proc. of ACM MobiHoc, 2010.
[32] F. C. Schoute, “Dynamic Frame Length ALOHA,” IEEE Transactions on Communications, vol. 31, no. 4, pp. 565–568, 1983.
[33] W. Gong, K. Liu, X. Miao, and H. Liu, “Arbitrarily Accurate Approximation Scheme for Large-Scale RFID Cardinality Estimation,” Proc. of IEEE INFOCOM, 2014.
[34] H. Liu, W. Gong, L. Chen, W. He, K. Liu, and Y. Liu, “Generic Composite Counting in RFID Systems,” Proc. of IEEE ICDCS, 2014.
[35] Q. Xiao, B. Xiao, and S. Chen, “Differential Estimation in Dynamic RFID Systems,” Proc. of IEEE INFOCOM, 2013.
[36] E. Vahedi, V. Shah-Mansouri, V. W. S. Wong, I. F. Blake, and R. K. Ward, “Probabilistic Analysis of Blocking Attack in RFID Systems,” IEEE Transactions on Information Forensics and Security, vol. 6, no. 3, pp. 803–817, 2011.
