Security, Privacy, and Access Control in Information-Centric Networking

This work has been submitted to IEEE Communications Surveys & Tutorials journal and is supported ... with a primary objective of efficient content delivery. One.
Security, Privacy, and Access Control in Information-Centric Networking: A Survey

Reza Tourani, Travis Mick, Satyajayant Misra and Gaurav Panwar
Dept. of Computer Science, New Mexico State University
{rtourani, tmick, misra, gpanwar}@cs.nmsu.edu

Abstract: Information-Centric Networking (ICN) is a new networking paradigm, which replaces the widely used host-centric networking paradigm in communication networks (e.g., Internet, mobile ad hoc networks) with an information-centric paradigm, which prioritizes the delivery of named content, oblivious of the content’s origin. Content and client security are more intrinsic in the ICN paradigm than in the current host-centric paradigm, where they have been added as an afterthought. By design, the ICN paradigm inherently supports several security and privacy features, such as provenance and identity privacy, which are still not effectively available in the host-centric paradigm. However, given its nascency, the ICN paradigm has several open security and privacy concerns, some inherited from the old paradigm and some new and unique. In this article, we survey the existing literature on security and privacy research in ICN. More specifically, we explore three broad areas: security threats, privacy risks, and access control enforcement mechanisms. We present the underlying principles of the existing works, discuss the drawbacks of the proposed approaches, and explore potential future research directions. In the broad area of security, we review attack scenarios, such as denial of service, cache pollution, and content poisoning. In the broad area of privacy, we discuss user privacy and anonymity, name and signature privacy, and content privacy. ICN’s feature of ubiquitous caching introduces a major challenge for access control enforcement that requires special attention. In this broad area, we review existing access control mechanisms including encryption-based, attribute-based, session-based, and proxy re-encryption-based access control schemes. We conclude the survey with lessons learned and scope for future work.

Keywords: Information-centric networking, security, privacy, access control, architecture, DoS, content poisoning.

This work has been submitted to the IEEE Communications Surveys & Tutorials journal and is supported in part by U.S. NSF grants 1345232 and 1248109 and U.S. DoD/ARO grant W911NF-07-2-0027.

1. INTRODUCTION

According to the Cisco Visual Networking Index forecast, video traffic (including VoD, P2P, Internet, and TV) will comprise 90% of all Internet traffic by 2019. The majority of this traffic is currently served to end users with the help of content delivery networks (CDNs), with servers that reside close to the network edge. This has helped reduce core network traffic and improve delivery latency. Despite the scalability that CDNs have so far provided, the current host-centric paradigm will not continue to scale with the proliferation of mobile devices and the Internet of Things (IoT) coupled with the rapidly increasing volume of video traffic. Not only have these trends been putting pressure on Internet Service Providers (ISPs) and content providers, but they have also motivated the research community to explore designs for a more scalable Internet, with a primary objective of efficient content delivery. One of the products of this endeavor is the Information-Centric Networking (ICN) paradigm. ICN shifts the networking paradigm from the current host-centric paradigm, where all requests for content are made to a host identified by its IP address(es), to a content-centric paradigm, which decouples named content objects from the hosts where they are located. As a result, named content can be stored anywhere in the network; each content object can be uniquely addressed and requested. Several ICN architectures have been proposed, such as Named-Data Networking/Content-Centric Networking (NDN/CCN), Publish-Subscribe Internet Routing Paradigm (PSIRP), Data Oriented Network Architecture (DONA), and Network of Information (NetInf). Though they differ in their details, they share several fundamental properties: unique names for content, name-based routing, pervasive caching, and assurance of content integrity. ICN enhances several facets of user experience as well as security, privacy, and access control; however, it also gives rise to many new security challenges. Various concepts and solutions have been proposed in the literature to address these challenges.

In this article, we explore ICN security, privacy, and access control concerns in depth, and present a comprehensive study of the proposed mechanisms in the state of the art. We categorize this survey into three major domains, namely security, privacy, and access control. In the security section, we address attacks applicable to both IP-based networks and ICNs, such as denial of service (DoS) and distributed DoS (DDoS), and vulnerabilities unique to ICN, including cache pollution, content poisoning, and naming attacks. Despite many similarities between a classical DoS attack and the DoS attack in ICN, the latter is novel in that it abuses ICN’s stateful routing plane; the attack aims to overload a router’s state tables, such as the pending interest table (PIT) and forwarding information base (FIB). The cache pollution attack targets a router’s content locality with the intention of altering its set of cached content; this results in an increase in the frequency of content retransmission, and consequently reduces network goodput.

In the privacy section, we study the privacy risks in ICN under four classes: client privacy, content privacy, cache privacy, and name and signature privacy [32]. We explore the

implications of each of these risk classes and elaborate on relevant proposed solutions. Due to ICN’s support for pervasive caching, content objects can be replicated throughout the network. Though this moves content close to the edge and hence reduces network load and content retrieval latency, it comes at a cost: publishers lose control over these cached copies and cannot arbitrate access. Thus, there is a need for efficient access control, which allows reuse of cached content and also prevents unauthorized access. Several mechanisms have been proposed in which access control is achieved using content encryption, clients’ identities, content attributes, or authorized sessions. We review these proposed mechanisms and highlight their benefits and drawbacks in detail in the access control section.

At the end of each section, we present a summary of the state of the art and also discuss open research challenges and potential directions to explore. We conclude the survey with a summary of lessons learned. Before we dive into the discussion, we briefly review some representative ICN architectures in Subsection 1.A. Following that, we identify previous surveys in ICN covering different ICN architectures, naming and routing, DoS attacks, mobility, and potential research directions in Subsection 1.B.

A. Overview of the Proposed Information-Centric Networking Architectures

Based on the nature of communication in the proposed ICN architectures, we categorize them into two main models, as shown in Fig. 1: consumer-driven and publish-subscribe. In the consumer-driven architectures, communication is initiated when a client requests a content object from the network; in response, the requested data is sent into the network by a publisher. The content routers locate and deliver the requested content without the use of any request-to-content matching service. CCN [67], [2] and NDN [7] are two popular consumer-driven ICN architectures.
In contrast, in the publish-subscribe architectures a publisher first advertises its content to the network and interested subscribers establish subscriptions to the content. The content is then delivered from the publisher to the subscriber with the help of a matching service provided by a name resolution service [16], resolution handlers [73], or rendezvous nodes [105], [8], [9]. DONA [73], PURSUIT [9], PSIRP [105], [8], NetInf [16], and MobilityFirst [100], [6] fall into this category. Although this is not an exhaustive list of ICN architectures, it is largely representative; thus, we will review these architectures in what follows. We refer interested readers to two surveys [17], [121] for more details on other ICN architectures, such as SAIL [10], 4WARD [1], COMET [99], [3], CONVERGENCE [4], and CONET [43].

The Data Oriented Network Architecture (DONA) [73] was proposed by Koponen et al. at UC Berkeley in 2007. DONA uses a flat self-certifying naming scheme. Each name consists of two parts; the first is the cryptographic hash of the publisher’s public key, and the second is an object identifier, which is assigned by the publisher and is unique in the publisher’s

domain. To achieve self-certification, the authors suggested that publishers use a cryptographic hash of the object as the object identifier. A subscriber can then easily verify the integrity of an object simply by hashing it and comparing the result to the object’s name. DONA’s resolution service is composed of a hierarchically interconnected network of resolution handler (RH) entities, which are tasked with publication and retrieval of objects. To publish an object, the owner sends a REGISTER message including the object name to its local RH. The local RH keeps a pointer to the publisher and propagates this message to its parent and peer RHs, which then store a mapping between the local RH’s address and the object name. A subscriber interested in the object sends a FIND message with the object name to its own local RH. The local RH propagates this request to its parent RH, and propagation continues until a match is found somewhere in the hierarchy. After a match is found, the request is forwarded towards the identified publisher. The authors proposed two methods of object delivery from publisher to requester. In the first method, the publisher sends the object using the underlying IP network. The second method takes advantage of path symmetry: the request message records the path it takes through the network, and after the request reaches the publisher, the object traverses the reverse path from the publisher to the requester. Exploiting this routing model, RHs on the path can aggregate the request messages for an object and form a multicast tree for more efficient object dissemination/delivery.

Content-Centric Networking (CCN) [67], [2] was proposed by researchers at Palo Alto Research Center in 2009. In 2010, Named Data Networking (NDN) [7], which follows the same design principles, was selected by the US National Science Foundation (NSF) as one of four projects to be funded under NSF’s Future Internet Architecture program.
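The self-certifying naming that DONA uses, as described earlier in this subsection, can be illustrated with a minimal sketch. The hash function (SHA-256), the `:` separator between the two name parts, and the helper names `make_name` and `verify_object` are assumptions made for illustration; the survey does not specify them.

```python
import hashlib


def make_name(publisher_public_key: bytes, content: bytes) -> str:
    """Build a flat two-part name (illustrative encoding, not DONA's wire format).

    Part 1: hash of the publisher's public key.
    Part 2: object identifier, here the hash of the object itself,
            which is what makes the name self-certifying.
    """
    principal = hashlib.sha256(publisher_public_key).hexdigest()
    label = hashlib.sha256(content).hexdigest()
    return f"{principal}:{label}"


def verify_object(name: str, content: bytes) -> bool:
    """Check integrity by hashing the received object and comparing to the name."""
    _principal, label = name.split(":")
    return hashlib.sha256(content).hexdigest() == label
```

A subscriber holding only the name can call `verify_object(name, received_bytes)` on whatever copy it receives, regardless of which node served it.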
Both CCN and NDN share the same fundamentals, such as a hierarchical naming scheme, content caching, and named content routing. The hierarchical naming allows the provider’s domain name to be used in making routing decisions. In the client-driven CCN/NDN, a client sends an interest packet into the network to request a content by its name. Content routers, equipped with a content store (CS), a pending interest table (PIT), and a forwarding information base (FIB), receive the interest and perform a CS lookup on the content name. If the content is not available in the CS, the router performs a PIT lookup to check whether there is an existing entry for the requested content. If the PIT lookup is successful, the content router adds the incoming interest’s interface to the PIT entry (interest aggregation) and drops the interest. If no PIT match is found, the router creates a new PIT entry for the interest and forwards the interest using information from the FIB. An interest can be satisfied either by an intermediate forwarding router which has cached the corresponding content chunk, or in the worst case, by the content provider. In both cases, the content takes the interest’s reverse-path back to the requester. Upon a router’s receipt of a content chunk, the PIT lookup

[Figure 1 diagram: ICN architectures divided into Consumer-Driven (CCN, NDN) and Publish-Subscribe (DONA, PSIRP/PURSUIT, NetInf, MobilityFirst).]
Fig. 1: Categorization of Information-Centric Networking architectures.

identifies the interfaces over which it should be forwarded. The content router may cache a copy of the content in its CS in addition to forwarding it through the designated faces.

The Publish Subscribe Internet Technology (PURSUIT) [9] project and its predecessor, the Publish Subscribe Internet Routing Paradigm (PSIRP) [105], [8], were funded by FP7 (the European Union’s research and innovation program) to produce a publish-subscribe protocol stack. A PURSUIT network is composed of three core entities, namely Rendezvous Nodes (RNs) which form the REndezvous NEtwork (RENE), the topology manager, and forwarders. Similar to DONA, PURSUIT uses a flat naming scheme composed of a scope ID, which groups related information objects, and a rendezvous ID, which ensures that each object’s identifier is unique in its group. A publisher advertises its content by sending a PUBLISH message to its local RN, which routes the message to the RN designated to store the content defined by the scope. The local RN makes this decision using a distributed hash table (DHT). A subscriber interested in the content object sends a SUBSCRIBE message to its local RN, which will also be routed to the designated RN using the DHT. Upon receipt of a SUBSCRIBE message by the designated RN, the topology manager is instructed to generate a delivery path between the publisher and the subscriber. The topology manager then provides the publisher with a path through the forwarders. In PURSUIT, network links are each assigned a unique string identifier, which the topology manager uses to create a routing Bloom filter for each flow. The generated Bloom filter is then added to each packet’s header, and is used by the intermediate forwarders for content delivery.

Network of Information (NetInf) [16] was initially conceived in the FP7 project 4WARD [1]. NetInf employs a flat naming scheme with a binding between names and their locators, which point to the content’s location.
As several nodes can cache copies of the data, an object may be bound to more than one locator. Two models of content retrieval are offered by NetInf: name resolution and name-based routing. In the name resolution approach, a publisher publishes its data objects to the network by registering its name/locator binding with the name resolution service (NRS). An interested client resolves the named data object into a set of locators and subsequently submits a request for the object, which will be delivered by the routing forwarders to the best available cache. The routing forwarders, after obtaining the data, deliver it

back to the requester. In the name-based routing model, a client directly sends out a GET message with the name of the data object. This message is forwarded to an available storage node using name-based routing, and the data object, once found, is forwarded back to the client.

MobilityFirst [100], [6] was funded by the NSF’s Future Internet Architecture program in 2010. The main focus of this architecture is to scale in the face of device mobility; hence it includes detailed mechanisms for handling mobility, wireless links, multicast, multi-homing, security, and in-network caching. Each network entity, including devices, information objects, and services, is assigned a globally unique identifier (GUID), which can be translated into one or more network addresses. To advertise content, a publisher requests a GUID from the naming service and registers this name with a global name resolution service (GNRS). The registered GUID is mapped, by a hash function, to a set of GNRS servers, which are connected through regular routing. A subscriber can then obtain the content name from a Name Certification Service (NCS) or use a search engine to resolve a human-readable name into the corresponding GUID. A subscriber submits a GET message, containing both the GUID of the desired object and its own GUID, to its local content router. Since content routers require the network address, the request will be forwarded to the GNRS to map the GUID into actual addresses. The result of this query is a set of partial or complete routes, or a set of addresses. Upon receiving this information, the requesting content router attaches the destination network address to the GET message and forwards it into the network. Any content router on the forwarding path may contact the GNRS for an updated destination address or route, which may have changed due to the provider’s mobility.
The publisher, upon receiving the GET message, sends the requested object back to the source GUID following the same procedure. MobilityFirst provides a combination of IP routing and name-based routing by name resolution and data routing processes. On-path caching is employed to satisfy subsequent requests for previously served GUIDs. This is in contrast to off-path caching, which causes an update in the GNRS service, where the new caching node’s network address is added to the GUID’s record.
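Of the architectures reviewed in this subsection, the CCN/NDN forwarding pipeline (CS lookup, then PIT lookup with interest aggregation, then FIB forwarding, with data following the reverse path) is the one most relevant to the attacks discussed later. The following is a minimal sketch of that pipeline. The class and method names are hypothetical, and two details are assumptions not stated in the text: the FIB is matched by longest name prefix, and data arriving without a matching PIT entry is dropped.

```python
from dataclasses import dataclass, field


@dataclass
class ContentRouter:
    cs: dict = field(default_factory=dict)   # content store: name -> data
    pit: dict = field(default_factory=dict)  # pending interests: name -> set of incoming faces
    fib: dict = field(default_factory=dict)  # forwarding table: name prefix -> outgoing face

    def longest_prefix_face(self, name):
        """Assumed FIB policy: longest matching name prefix wins."""
        parts = name.strip("/").split("/")
        for i in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:i])
            if prefix in self.fib:
                return self.fib[prefix]
        return None

    def on_interest(self, name, in_face):
        if name in self.cs:                   # 1. CS hit: answer from the cache
            return ("data", in_face, self.cs[name])
        if name in self.pit:                  # 2. PIT hit: aggregate, do not forward again
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        out_face = self.longest_prefix_face(name)
        if out_face is None:
            return ("no_route", None, None)
        self.pit[name] = {in_face}            # 3. New PIT entry, forward using the FIB
        return ("forward", out_face, None)

    def on_data(self, name, data):
        """Consume the PIT entry and return the faces the data should be sent on."""
        faces = self.pit.pop(name, set())
        if faces:
            self.cs[name] = data              # the router may cache a copy
        return sorted(faces)
```

Note that every interest that misses both the CS and the PIT creates router state; this is the property that interest flooding exploits.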

[Figure 2 diagram: Survey Organization, with three branches.
Security: Denial of Service, Content Poisoning, Cache Pollution, Secure Naming & Routing, Application Security, Generic Contributions.
Privacy: Timing Attack, Monitoring Attack, Anonymity, Protocol Attack, Name & Signature.
Access Control: Encryption-Based, Attribute-Based, Session-Based, Proxy Re-encryption, Miscellaneous.]
Fig. 2: The organization of the survey.

B. Review of Existing ICN Surveys

Ahlgren et al. [17] reviewed the different proposed information-centric architectures. In addition to describing the architectures in detail, the authors also presented their open challenges. Following this survey, Xylomenos et al. [121] surveyed the proposed ICN architectures, comparing their similarities and differences and discussing their weaknesses. Tyson et al. [107] focused on mobility in information-centric networks. The authors discussed several benefits of node mobility, as well as mobility-related challenges such as provider mobility and cached content discovery. Zhang, Li and Lin [125] and Zhang et al. [126] explored proposed caching approaches in information-centric networking. In [27], Bari et al. reviewed the state of the art in naming and routing for information-centric networks and explored the requirements for ideal content naming and routing. Future research directions in information-centric networking were discussed by Pan et al. [92]. Aamir and Zaidi [11] surveyed denial-of-service attacks in information-centric networks and identified interest flooding, request piling, content poisoning, signature key retrieval, and cache pollution as DDoS vectors. AbdAllah et al. [12] recently discussed security attacks in ICN. The authors classified attacks into four categories: routing, naming, caching, and miscellaneous. The paper focused on discussing the ways an attacker can orchestrate these attacks as well as the applicability of current IP-based solutions to information-centric networks.

Novel Contributions of this Survey: The existing surveys have either not dealt with security, privacy, and access control or have looked at them only to a very limited extent. The work of AbdAllah et al. [12] is the first survey dealing with security in ICNs, but it is not comprehensive. For instance, access control in ICNs, despite its importance, has not been considered in any survey. To the best of our knowledge, we are the first to present a comprehensive survey of the state of the art in security, privacy, and access control in the context of ICN. In this survey, we present each of these three aspects independently, surveying the state of the art, lessons learned, and the shortcomings of proposed approaches. We also discuss existing challenges and propose potential directions and solutions to explore. We believe that a comprehensive review of the state of the art in ICN security, privacy, and access control is essential for a reader/researcher to gain a deeper understanding of the open challenges and existing solutions in this domain, which is quickly becoming popular.

The rest of the paper is organized as depicted in Fig. 2. In Section 2, we review the security issues of different ICN architectures, their proposed solutions, and existing open problems. Different privacy issues, proposed solutions, and open challenges are presented in Section 3. Access control enforcement mechanisms, their drawbacks, and existing open challenges are presented in Section 4. In Section 5, we summarize the existing ICN security research and present a comprehensive discussion of future research directions in ICN security.

2. SECURITY IN ICN

In this section, we review vulnerabilities in ICN and discuss the state-of-the-art solutions, then conclude this section with open problems and potential solutions to be explored. This section is divided into subsections based upon the particular types of attacks. First, we discuss the proposed countermeasures against DoS attacks. Content poisoning and cache pollution attacks and their countermeasures are discussed in the subsequent subsections. Then, we discuss attacks inherent to content naming and describe proposed mechanisms for secure naming. We will also explore proposed application-level security

mechanisms. We conclude this section with an overview of general contributions to ICN security, including work that cannot be grouped into any of the categories described above. Fig. 3 illustrates the classification of existing literature on ICN security.

A. Denial of Service (DoS) Attack

DoS attacks in ICN may target either intermediate routers or content providers. The most basic type of attack, interest flooding, involves an attacker sending interests for a variety of content objects that are not likely to be present in the targeted routers’ caches. This is mainly a concern in consumer-driven architectures such as CCN and NDN, where the storage of a PIT entry for each received interest may result in the exhaustion of the router’s PIT memory and prevent it from serving benign clients’ requests. This scenario is depicted in Fig. 4, which shows clients and an attacker connected to an edge router that is also a content router (i.e., it can cache content). The network is composed of a content provider at one end (on the right) and a routing core consisting of routers without a content cache and content routers with a content cache. In this scenario, the edge router, connected to the attacker as well as legitimate clients, has its PIT filled up by the attacker’s interests. The interest name /attack/C* refers to some undefined content name that may not exist, is inaccurate, or is a request for dynamic content to be created on-the-fly. This attack is more severe when the attacker requests fake content objects (i.e., names with a valid prefix and an invalid suffix) or dynamic objects, which need to be generated by the provider. Requests for fake objects will result in the provider dropping the interest, and subsequently the PIT entries on the targeted router(s) (e.g., routers on the path) will remain active until their expiration. On the other hand, dynamic content requests will be served by the provider.
However, these requests/replies burden the forwarding routers, and may also cause DoS at the provider.

Wang et al. [68] investigated the effect of content caching on DoS attacks, focusing on CCN in particular. They compared the DoS attacks that target content providers in IP-based and content-centric networks, and proposed a queuing-theory-based model of DoS attacks. This model considers the caching period of content objects as well as queuing delay at repositories. The authors concluded that DoS attacks in CCN (the same applies to NDN) have limited effectiveness in comparison to DoS attacks on IP networks due to a reduced request arrival rate at the content provider. Because intermediate routers can satisfy interests, interest flooding can be localized significantly by increasing the cache size at routers and the time period for which content objects are cached. Despite the correctness of the authors’ models, several unrealistic assumptions weaken the relevance of the work. The authors assumed that an attacker only requests content objects that are available at the content provider(s) and may be cached. However, this is not a realistic attack scenario; a real attacker targeting a content provider would request either non-existent

content or dynamically-generated content (which may be unpopular and hence useless when cached). Also, the analysis provided does not account for cache replacement policies, which would affect the content caching period. Furthermore, intermediate routers would be more vulnerable targets to DoS than content providers; however, the impact of DoS on routers was not discussed.

Afanasyev et al. [14] proposed three approaches to coping with interest flooding attacks in named-data networking (NDN). Their vanilla approach is a slight modification of the well-known Token Bucket algorithm, in which each router limits the number of pending interests for each interface in proportion to its link capacity (bandwidth-delay product). This technique is not very effective, as a router may utilize the entire link capacity to satisfy an attacker’s interests, hence reducing the satisfaction rate of legitimate clients’ interests. The authors augmented this vanilla approach by introducing a concept of per-interface fairness. In this mechanism, a router ensures that the outgoing link capacity is shared fairly among traffic from all incoming interfaces, thus preventing traffic from a minority of incoming interfaces from consuming an entire link’s capacity. For this purpose, the PIT is extended with a new column to denote each interest as either forwarded or in-queue. The router also maintains a queue for each incoming interface. An interface with a high interest arrival rate is subject to queuing in favor of service to other interfaces. This improvement partially solves the problem, as an attacker on one interface will be unable to consume all of the router’s resources. However, even with this approach there is no distinction between an attacker and a legitimate client. Both the attackers’ and the legitimate clients’ interests are rate-limited if they are incident on a high-rate interface.
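The first two approaches described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the limit is computed as the number of data packets that fit in one bandwidth-delay product, "fair" is taken to mean an equal share per incoming interface, and the names `pending_interest_limit`, `FairInterestLimiter`, and `on_release` are hypothetical.

```python
from collections import defaultdict, deque


def pending_interest_limit(bandwidth_bps, delay_s, data_size_bytes):
    """Pending interests allowed on a link: bandwidth-delay product in data packets."""
    return max(1, int(bandwidth_bps * delay_s / (8 * data_size_bytes)))


class FairInterestLimiter:
    """Shares one outgoing link's pending-interest budget among incoming faces."""

    def __init__(self, out_limit, in_faces):
        # Assumption: equal split of the outgoing budget across incoming faces.
        self.share = max(1, out_limit // len(in_faces))
        self.pending = defaultdict(int)   # in_face -> interests currently forwarded
        self.queues = defaultdict(deque)  # in_face -> interests waiting (in-queue)

    def on_interest(self, in_face, name):
        if self.pending[in_face] < self.share:
            self.pending[in_face] += 1
            return "forwarded"
        self.queues[in_face].append(name)  # over its share: queue instead of forwarding
        return "queued"

    def on_release(self, in_face):
        """Call when a forwarded interest from in_face is satisfied or times out.

        Returns the queued interest name that is forwarded next, if any.
        """
        if self.pending[in_face] > 0:
            self.pending[in_face] -= 1
        if self.queues[in_face]:
            self.pending[in_face] += 1
            return self.queues[in_face].popleft()
        return None
```

The sketch makes the limitation noted above visible: the limiter keys only on `in_face`, so a legitimate client sharing a face with an attacker is queued along with it.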
The last proposed algorithm differentiates interest timeout events from interest satisfaction events. Each router keeps statistics pertaining to the satisfaction history of its interfaces. This allows incoming interfaces with higher satisfaction rates to be given a greater share of the outgoing link capacity. The drawback of this approach is that the probability of satisfaction for an interest decreases dramatically as path length increases. A longer path may be subject to more congestion and packet loss at the routers, and with more routers the probability of rate limiting increases. To address this drawback, the authors suggested that routers explicitly announce their interest satisfaction ratio limits to their downstream neighbors, which can then adjust their own acceptance thresholds accordingly. This algorithm, despite being more effective, still applies penalties at the granularity of an interface, not a flow. Legitimate users’ flows will still suffer.

Gasti et al. [54] also explored DDoS attack scenarios in named-data networking, focusing primarily on interest flooding. The authors divided interest flooding scenarios into classes depending on whether the attackers request (1) existing or static, (2) dynamically generated, or (3) non-existent content objects. The attack target for Types (1) and (3) is the network-core infrastructure, while the Type (2) attack targets both the content providers and the network infrastructure. The authors noted that malicious requests for existing or static content have

[Figure 3 diagram: Security sub-categories with the surveyed works.
DoS/DDoS: [14] [38] [41] [54] [68] [79] [91] [109] [111] [112] [113]
Content Poisoning: [54] [57] [58] [59] [72]
Cache Pollution: [15] [19] [20] [21] [22] [42] [63] [95] [116] [122] [127]
Secure Naming and Routing: [23] [26] [30] [31] [49] [60] [61] [70] [98] [108] [117] [123]
Application Security: [39] [69] [86] [93] [119]
Other General Contributions: [50] [51] [55] [81] [83] [110]]
Fig. 3: ICN Security sub-categories and the state-of-the-art.

[Figure 4 diagram: clients and an attacker attached to an edge router; routers and content routers in the core; a content provider serving /NMSU.edu/Network. The attacker issues interests for /attack/C*, and the PIT shown holds /Youtube.com/movie together with /attack/C1 through /attack/C5.]
Fig. 4: Denial of Service (DoS) attack scenario.

limited effect due to content caching at intermediate routers. In contrast, requesting dynamically generated content not only consumes intermediate routers’ resources (such as PIT space and bandwidth), but also keeps the providers busy with generation of content chunks corresponding to each incoming interest. It was noted that non-existent content is the type most likely to be used in attacks against infrastructure. The authors suggested that routers keep track of the number of pending interests per outgoing face, as well as the number of unsatisfied interests per incoming face and/or per-name prefix. Hence, rate limiting could be applied when these counters exceed a predefined threshold. The per-name prefix based rate limiting is a better approach than per-interface rate limiting. Compagno et al. [38] designed Poseidon, a collaborative mechanism for interest flooding mitigation. Poseidon involves two phases: detection and reaction. Detection is performed individually, with each router monitoring two values: ratio of incoming interests to outgoing content, and the amount of PIT state consumed by each interface. These statistics are collected over a time window, such that old data does not affect detection of future attacks. When a pre-set threshold is reached the router enters reaction mode, wherein collaborative mitigation takes place. A router rate limits its interfaces with abnormal interest arrival rates, then sends notification to its downstream routers about the attack. A downstream router receiving such notification can then detect the attack at an earlier stage. The authors noted that rate-limiting was more effective at

reducing the attacked router’s PIT size than the notification mechanism; however, notification improved the satisfaction rate of requests. Unfortunately, the authors did not evaluate the impact of their mechanism on legitimate clients in detail; particularly concerning is the potential effect on clients that are co-located on the same interface as an attacker.

Dai et al. [41] proposed an IP-inspired approach for mitigating interest flooding in NDN. The scheme is inspired by IP traceback, which allows an attack to be “traced back” to the attacker. The interest traceback procedure is triggered when the size of the PIT at a router exceeds a predefined threshold. At this moment, the router generates a spoofed data packet for the longest-unsatisfied interest in the PIT. The spoofed data will be forwarded to the attacker, causing its edge router to be notified of the malicious behavior; in response, the router can rate-limit the attacker’s interface. Similar to other rate-limiting approaches, this mechanism may also have a negative impact on legitimate clients. This scheme in particular can cause a legitimate client that has mistakenly requested non-existent (or yet-to-be-created) content to be unfairly penalized. Additionally, since rate limiting only occurs at the edge router, this scheme may be ineffective if an edge router is compromised or is non-cooperative with its peers. Furthermore, the authors do not discuss the impact of the router’s decisions on legitimate long-unsatisfied interests; e.g., it is not mentioned whether all unsatisfied interests with long wait times or only a subset of them may be treated as malicious.

Virgilio et al. [109] analyzed the security of the existing PIT architectures under DDoS attack.
The authors compared three proposed PIT architectures: (1) SimplePIT, which stores the entire URL, (2) HashPIT, where only a hash of the URL is stored, and (3) DiPIT (distributed PIT), where each interface uses a Bloom filter to determine which content objects should be forwarded. The authors concluded that all three proposed PIT architectures are vulnerable to DDoS attack, and they all perform the same under normal traffic conditions. While SimplePIT and HashPIT suffer from memory growth in the face of DoS, DiPIT does not consume extra memory. DiPIT

will always use a fixed amount of memory, as each face is assigned a counting Bloom filter to identify which data should be forwarded on that face. Unfortunately, the Bloom filter’s inherent false positive rate has the potential to cause data to be forwarded unnecessarily, and therefore waste bandwidth. Although this paper showed the effects of DDoS on different PIT architectures through simulation, the authors did not propose any viable solution.

Wang et al. [112] proposed a mechanism which copes with interest flooding by decoupling malicious interests from the PIT. The mechanism requires that each router monitor the number of expired interests for each name prefix, then add a prefix to the malicious list (m-list) if this count exceeds a chosen threshold. To prevent legitimate name prefixes from staying in the m-list, each m-list entry is assigned an expiry time, after which the prefix is removed from the m-list. However, an m-list entry’s expiry timer is reset if a new interest arrives for the same prefix. Routers avoid storing PIT state for prefixes recorded in the m-list by modifying an interest in order to make the corresponding content object self-routing. Before forwarding the interest, the router appends the interface on which it arrived as the last component of the content name. When the response arrives, it can then be routed without a PIT lookup. This procedure can be applied by several routers on the path, in which case a list of interfaces will be present at the end of the name. Each downstream router removes its own interface (which would be the last in the list) before forwarding the content object to the next hop. Although this helps routers keep the sizes of their PITs manageable, they will still be responsible for forwarding the malicious interests; thus network congestion and starvation of legitimate clients are still possible. This mechanism also puts additional processing burden on the routers and increases packet overhead.
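As a minimal sketch of the m-list bookkeeping and the self-routing names of [112] described above, the following Python fragment may be helpful. It is illustrative only: the class and function names, the representation of a name as a list of components, the `face=` component encoding, and the injectable clock are assumptions made here and are not taken from [112].

```python
import time


class MaliciousList:
    """Per-router m-list of prefixes with too many expired interests (sketch)."""

    def __init__(self, threshold, entry_lifetime, clock=time.monotonic):
        self.threshold = threshold            # expired-interest count that triggers listing
        self.entry_lifetime = entry_lifetime  # seconds an entry stays listed
        self.clock = clock                    # injectable clock (assumption, eases testing)
        self.expired_counts = {}              # prefix -> expired interests observed
        self.expires_at = {}                  # prefix -> time at which the entry lapses

    def record_expired_interest(self, prefix):
        """Count an expired interest; list the prefix once the count exceeds the threshold."""
        count = self.expired_counts.get(prefix, 0) + 1
        self.expired_counts[prefix] = count
        if count > self.threshold:
            self.expires_at[prefix] = self.clock() + self.entry_lifetime

    def contains(self, prefix):
        """Return True while the prefix is listed; drop the entry once it has lapsed."""
        deadline = self.expires_at.get(prefix)
        if deadline is None:
            return False
        if self.clock() >= deadline:
            del self.expires_at[prefix]
            self.expired_counts.pop(prefix, None)
            return False
        return True

    def refresh(self, prefix):
        """Reset the expiry timer when a new interest arrives for a listed prefix."""
        if self.contains(prefix):
            self.expires_at[prefix] = self.clock() + self.entry_lifetime


def mark_interest(name, in_face):
    """Append the arrival face as the last name component instead of storing PIT state."""
    return name + [f"face={in_face}"]


def route_data(name):
    """Strip this router's face (the last component); return (face, remaining name)."""
    *rest, last = name
    return int(last.split("=", 1)[1]), rest
```

Two routers on the path would each call `mark_interest` on the way upstream, and each would call `route_data` on the way back, so the face components are removed in reverse order of insertion.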
To remedy the shortcomings of this mechanism, the authors [113] later proposed an interest flooding detection and mitigation mechanism based on fuzzy logic and router cooperation. In the detection phase, core routers monitor their PIT Occupancy Rate (POR) and PIT Expiration Rate (PER), which represent the rate at which the PIT collects new entries and the rate of PIT entry expiration, respectively. The real-time values of these rates are evaluated with fuzzy inference rules to decide whether they are normal or abnormal. If either value is abnormal, the mitigation mechanism is triggered; mitigation can be triggered by the router itself or by another router. Once the targeted prefix is determined, the router identifies the interface on which the most interests for that prefix have arrived, applies rate limiting to that interface, and notifies its neighbor on that same interface of the targeted prefix. Simulation results show the effectiveness of this mechanism in both reducing PIT memory consumption and increasing interest satisfaction for legitimate clients. However, the authors assumed that the attackers only target a specific name prefix; thus the mitigation is effective in dismantling attacks against

specific publishers, but not those against the network infrastructure itself. Moreover, a distributed attack could reach the network core over several paths; as this approach allows each router to identify only one malicious interface, its effectiveness against DDoS is unknown. Wang et al. [111] modeled the interest flooding attack in NDN using a symmetric binary tree topology. The model considers factors, such as routers’ PIT sizes, round trip times, PIT entries’ TTLs, content popularity distribution, and both malicious and legitimate interest rates. To analyze the impact of a DoS attack, the authors derived a DoS probability distribution, which evaluates the probability that a legitimate interest will be dropped due to starvation. The authors modeled the events of PIT entry insertion and removal with a continuous-time homogeneous Markov chain, where the number of PIT entries at any time is given by the states of the Markov process. A simulation result confirms the validity of the theoretical model. The authors suggested that the effectiveness of DoS could be reduced by using bigger PITs, bigger content stores, and shorter TTLs for PIT entries. Unfortunately, these suggestions do not actually address the problem: an attacker could easily increase its request rate proportionally. Li and Bi [79] proposed a countermeasure against DoS attacks targeting dynamic content. As opposed to static content, which is signed once when it is generated, dynamic content is generated and signed upon interest arrival; a high rate of requests can thus overload the content provider due to the computational overhead of the signature generation. As a mitigation, the authors proposed a proof-of-work mechanism, which requires clients to perform some extra computational task before requesting a dynamic content object. Before requesting the content, the client requests a metapuzzle from the content provider. 
Upon receiving the metapuzzle, the client generates the actual puzzle and solves it (similarly to how blocks are mined in Bitcoin). The puzzle solution and the current timestamp form a part of the interest. Upon receiving the interest, the provider checks both the validity and freshness of the solution. If the solution is valid and fresh, the provider generates and signs the desired content; otherwise, the interest is dropped. The meta-puzzle is updated either after a predefined lifetime or after a large number of solutions are received. This proposal does increase the barrier for DoS or DDoS attacks, but also puts a computational burden on legitimate clients. But more importantly, the authors stated that a DDoS attack with as few as 300 attackers can significantly degrade the effectiveness of this scheme. Nguyen et al. proposed an interest flooding detector based on statistical hypothesis testing theory [91]. The scheme is based upon the fact that when under attack, the interest rate on an interface is greater than that during normal conditions. Meanwhile, the data rate under both hypotheses remains the same; therefore, the data hit-ratio in attack scenarios is lower than that in normal conditions. Unlike other solutions, this scheme takes the desired probability of false alarm as a parameter and calculates the detection threshold accordingly. Hence, the

TABLE I: Summary of DoS/DDoS Mitigation Approaches

Mechanism | Target | Content Type | Mitigation Approach | Router’s Functionality | Scope
Afanasayev et al. [14] | Router | Non-Existent | Rate Limiting & Per-face Fairness | PIT Extension | Individual Routers
Afanasayev et al. [14] | Router | Non-Existent | Per-face Statistic & Priority | Storing Statistics | Router Collaboration
Compagno et al. [38] | Router | Non-Existent | Rate Limiting & Per-face Statistics | Storing Statistics | Router Collaboration
Dai et al. [41] | Router | Non-Existent | Rate Limiting & PIT Size Monitoring | Not Applicable | Router Collaboration
Gasti et al. [54] | Provider | Dynamic | Rate Limiting & Per-face Statistics | Storing Statistics | Individual Routers
Gasti et al. [54] | Router | Existing & Non-Existent | Rate Limiting & Per-face Statistics | Storing Statistics | Individual Routers
Wang et al. [68] | Provider | Existing | Caching Period Increase | Not Applicable | Individual Routers
Li et al. [79] | Provider | Dynamic | Client’s Proof-of-Work per Interest | Not Applicable | Not Applicable
Nguyen et al. [91] | Router | Non-Existent | Statistical Hypotheses Testing Theory | Storing Statistics | Individual Routers
Wang et al. [112] | Router | Non-Existent | Decoupling Malicious Interest from PIT | Additional Queue | Individual Routers
Wang et al. [113] | Router | Non-Existent | Fuzzy Logic-based Detection | Storing Statistics | Router Collaboration

threshold only depends on the chosen false positive rate and the inherent trade-off between detection delay and threshold accuracy. Unfortunately, the ndnSIM evaluation provided by the authors uses only a simple binary tree topology with eight clients and one attacker; thus, the effectiveness of the scheme is currently unknown for large networks or distributed attacks.

In Table I, we summarize all the proposed DoS mitigation mechanisms in terms of the entity implementing the mechanism, whether the attack model involves existent, dynamic, or non-existent content requests, the nature of the mitigation approach, the extra functionality needed in the routers, and the level of collaboration required between routers. We discuss open problems and scope for future work in the last subsection to give a comprehensive view to the reader.

B. Content Poisoning Attack

The objective of the content poisoning attack is to fill routers’ caches with invalid content. To mount this attack, an attacker must control one or more content providers or intermediate routers, so that it may inject its own content into the network. The injected content should have a valid name corresponding to an interest, but a fake payload or an invalid signature. The poisoning attack is illustrated in Fig. 5, with the attacker (one of the content routers on the path between the client and provider) returning an invalid content (oval C1) instead of the genuine content (rectangular C1) corresponding to the requested name. The content poisoning attack is typically infeasible on the IP Internet (without mounting a man-in-the-middle attack), as a client connects directly to the provider to establish a flow. The ICN paradigm, however, allows the content to be served by any intermediate router or from any one of several content providers.
This attack can have potentially devastating consequences: the network can be filled with poisoned content objects that are useless to the clients, while useful content finds no place in the caches. In the following, we review the countermeasures against content poisoning. Gasti et al. [54] were the first to discuss the content/cache poisoning attacks. As their first countermeasure, the authors suggested the use of a “self-certifying interest/data packet” (SCID), which helps forwarding routers validate received content chunks. Prior to sending an interest, a client is required to obtain the desired chunk’s hash, name, and signature from the content provider; this information is then attached to the

interest. On retrieving a content chunk, a router can easily check its validity by comparing its hash to the hash from the interest. This method is less computationally intensive than traditional RSA signature verification; however, it requires that the client obtain the hash of each data chunk/packet beforehand by querying the provider directly, which dramatically increases content retrieval latency and limits scalability. As an alternative solution, the authors proposed cached content signature verification by routers. In their basic model, each router randomly selects content chunks to be subject to verification; the router verifies the signatures of the selected chunks, and drops any that are corrupted. To prevent redundant verification, routers collaboratively select a range of content chunks to verify; the scope of this collaboration varies, ranging from a neighborhood to an organization. To reduce collaboration overhead, the authors also suggested client feedback decision-making, in which a client may inform its edge router about each content chunk’s validity. However, this type of feedback can also be used by malicious clients to mislead routers by reporting legitimate content objects as fake, or vice versa.

Ghali et al. [58] proposed a content poisoning mitigation mechanism while introducing an updated definition of a fake content. The authors defined a fake content as one with a valid signature using the wrong key, or with a malformed signature field. The authors discussed the applicability of existing solutions such as signature verification by intermediate routers, which is infeasible at line speed. Although self-certifying names can mitigate the effect of content poisoning, there are problems which need to be addressed such as how content hashes

Fig. 5: Content poisoning attack scenario.

should be obtained and how dynamic content objects should be addressed. Hence, the authors proposed a ranking mechanism for cached content using exclusion-based feedback. Exclusion is a selector feature in the CCN and NDN architectures, which allows a client to exclude certain data (either by hash or name suffix) from matching its interest, effectively overriding a match on the requested name’s prefix. Clients can use this feature to avoid receiving data objects that are known to be unwanted, corrupted, or forged. Therefore, it is a useful feedback mechanism for detection of poisoned content. The detector’s ranking function takes three factors into account, namely the number of exclusions, the exclusion time, and the exclusion-interface ratio. The exclusion time captures the recency of a particular data name exclusion (freshness of the exclusion). A content with more exclusions, or with a more recent exclusion, has a lower rank. A content’s rank is also reduced if the router receives exclusion feedback for it from multiple clients on different interfaces. Whenever there are multiple cached contents with names that match the name of an interest, the router returns the content with the highest rank. The drawbacks of this approach are that it is highly dependent on client feedback, that non-cooperative and/or malicious clients can undermine its effectiveness, and that the exclusion feature is not present in all ICN architectures.

Ghali et al. [57], [59] also noted that content poisoning mitigation is contingent on network-layer trust management. According to them, the cache poisoning attack depends on two properties of ICN: interest ambiguity and lack of a trust model. The former arises from the interest packet structure, which considers the content name as the only compulsory field, while neglecting two other fields, the content digest and the publisher public key digest (PPKD). The latter refers to the lack of a unified trust model in the network layer.
To efficiently solve the content poisoning problem, the authors suggested a mechanism that clarifies this ambiguity in the interests. The proposed approach is built upon adding a binding between content name and the provider’s public key, an Interest-Key Binding (IKB), to the interest packet. The only modification at the content provider is the addition of the provider’s public key to the content’s KeyLocator field. An intermediate router, upon receiving a content, should match the hash of the public key present in the KeyLocator field with the interest’s PPKD (available in the PIT). The content will be forwarded if these match, and will be discarded otherwise. The client-side complexity of this approach is in obtaining the provider’s public key in advance. In order to bootstrap a trust model, the authors proposed three approaches: a preinstalled public key in the client’s software application, a global key name service similar to DNS, and a global search-based service such as Google. To reduce core routers’ workload, the authors proposed that edge routers perform the IKB check for all content packets, while core routers randomly verify a subset of content packets. Unfortunately, this mechanism does not scale. Signature verification, which is a public key infrastructure (PKI) based verification, is slow and cannot be performed at line speed, hence even if random routers or edge routers

perform the verification, it will result in congestion and potentially undesirable interest timeouts. Other weaknesses of the mechanisms proposed in [54], [58], [57], [59] include the assumption that the verifying router is trusted; a malicious router could simply accept an incorrect IKB as correct. Further, the schemes lack a detailed analysis of scalability and overhead.

Kim et al. [72] proposed a mechanism to reduce signature verification cost. The mechanism was inspired by check before storing (CBS) [29], which probabilistically verifies content items and stores only validated items in the cache. Through simulation analyses, the authors noticed that in general only a small number of cached contents (10% of the content set) are requested before expiration of their lifetimes. Hence, they divided the cached content into serving content, which will be requested while cached, and by-passing content, which will be dropped from the cache before subsequent interests. The authors used a segmented LRU policy for cache replacement: a content is initially put in the unprotected segment of the cache (i.e., it is initially assumed to be by-passing content), and upon successful verification it is moved to the protected segment. The proposed countermeasure, taking advantage of this separation, only verifies the signature of a serving content. A content is defined as serving when it has a cache hit, at which point its signature is verified and it is moved to the protected cache segment. This prevents resources from being wasted on verifying by-passing content. To avoid multiple verifications of a single content chunk, the verified chunk is tagged when it is stored in the protected segment. Although simulation results demonstrate that the percentage of the cache filled with poisonous content goes down, the scheme has some drawbacks.
It still suffers from latency due to the verification process that occurs for every chunk that is requested twice. Hence, an attacker can force verification of every fake content by requesting it twice; at scale, this could lead to a DoS/DDoS attack. The authors show that the overall hit rate goes down as the proportion of the cache devoted to the protected segment increases, but they do not mention whether this reduction applies to fake content or to usable serving content, which has significant bearing on the efficiency of the mechanism. Table II summarizes the basic techniques used in the proposed countermeasures and their overheads.

C. Cache Pollution Attack

Caching in ICN is effective due to the premise that the popularity of content on the Internet follows a skewed distribution (e.g., the Zipf distribution), in which a small number of popular contents are requested frequently, while the rest are requested sparingly. The popular (frequently requested) content objects or chunks can be stored in caches at the network edge, thus reducing request latency and network load. However, an attacker can undermine this popularity-based caching by skewing the content popularity distribution by requesting less popular content more frequently.

TABLE II: Content Poisoning Countermeasures

Mechanism | Mitigation Approach
Gasti et al. [54] | Self-Certifying Interest & Collaborative Signature Verification
Ghali et al. [58] | Client Feedback, Content Ranking
Ghali et al. [57], [59] | Interest-Key Binding & Adding the Provider’s Public Key to the Content
Kim et al. [72] | Collaborative Signature Verification of Serving Content

This will pollute the cache and make caching less effective. This is the cache pollution attack. In this subsection, we explore two classes of cache pollution attacks: locality disruption and false locality. In the locality disruption attack, an attacker continuously requests new, unpopular contents to disrupt the locality of the cache by churning the cache regularly with new content. In the false locality attack, on the other hand, the attacker’s aim is to change the popularity distribution of the local cache by repeatedly requesting a set of unpopular contents from within the universe of contents. That is, this attack creates a different popularity order for the contents.

Xie et al. [119] proposed CacheShield, a mechanism providing robustness against the locality disruption attack. It is composed of two main components: a probabilistic shielding function, and a vector of content names and their corresponding request frequencies. When a router receives a request for a content chunk, if the chunk is in its CS, it replies with the content. Otherwise, the router forwards the interest towards the provider. When a chunk arrives at the router, the probability of placing it in the CS is calculated using the shielding function $1/(1 + e^{(p-t)/q})$, where $p$ and $q$ are predefined system-wide constants and $t$ denotes the $t$-th request for the given chunk. If the chunk is not placed in the CS, the router either adds the chunk’s name to the vector of content names with a frequency of one, if the name is not yet present, or increments the request count for the chunk by one, if it is. A router caches a chunk in the CS when the request frequency of the chunk’s name in the vector exceeds a pre-defined threshold. This approach suffers from the fact that the shield function’s parameters $p$ and $q$ are constants and can be easily deduced (if not known), and hence an attacker can easily calculate the value of $t$ at which a chunk is likely to be cached.
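The admission logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the CacheShield implementation: the class name `ShieldedCache`, the injectable random source, and the choice to drop a name from the vector once its chunk is cached are all hypothetical.

```python
import math
import random


def shield_probability(t, p, q):
    """Logistic shielding function 1 / (1 + e^((p - t) / q)) for the t-th request."""
    return 1.0 / (1.0 + math.exp((p - t) / q))


class ShieldedCache:
    """Content store guarded by a shielding function and a name-frequency vector."""

    def __init__(self, p, q, threshold, rng=None):
        self.p = p
        self.q = q
        self.threshold = threshold         # request count above which a chunk is cached
        self.rng = rng or random.Random()  # injectable random source (assumption)
        self.content_store = {}            # name -> cached chunk
        self.request_counts = {}           # name -> requests seen while not cached

    def on_chunk_arrival(self, name, chunk):
        """Decide whether an arriving chunk is placed in the CS; return True if cached."""
        seen = self.request_counts.get(name, 0)
        t = seen + 1  # this arrival answers the t-th request for the chunk
        admit = (seen > self.threshold
                 or self.rng.random() < shield_probability(t, self.p, self.q))
        if admit:
            self.content_store[name] = chunk
            # Assumption: the counter is released once the chunk is cached.
            self.request_counts.pop(name, None)
        else:
            self.request_counts[name] = t
        return admit
```

Because the function is monotonically increasing in t and equals 0.5 at t = p, anyone who knows p and q can read off how many repeated requests are enough to make caching likely, which is the weakness noted above.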
The attacker then only has to ensure that it requests the unpopular contents more than t times. Additionally, the portion of the CS that is used to store the name vector is essentially an overhead.

To overcome the shortcomings of CacheShield, Conti et al. [39] proposed a machine-learning based algorithm. They evaluated the effect of cache pollution attacks on different cache replacement policies and network topologies. They proposed a detection algorithm, which operates as a sub-routine of the caching policy. The algorithm is composed of a learning step and an attack-testing step. It starts by checking the membership of an arrived content in a sample set chosen from the universe of contents. If the content belongs to the sample set, the learning step is triggered with the goal of identifying an attack threshold (denoted τ) for evaluating the contents. The value of τ is used by the attack test sub-routine in the testing step. The attack test sub-routine

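TABLE II (continued)

Mechanism | Overhead
Gasti et al. [54] | Hash Value Comparison & Random Signature Verification
Ghali et al. [58] | Content Ranking Calculation
Ghali et al. [57], [59] | PPKD Comparison & Signature Verification
Kim et al. [72] | Signature Verification on Cache Hit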

simply compares the calculated $\tau$ with another value $\delta_m$, which is computed over all contents in the sample set from parameters such as the content request frequency and the size of the measurement interval. If $\delta_m$ is greater than $\tau$, the mechanism detects an attack. The drawback of this approach is that it only detects the attack, but does not identify the attack interests or content chunks. Further, the assumption that the adversary’s content requests can only follow a uniform distribution, while legitimate users follow the Zipf distribution, is not realistic: the adversary can always craft its requests such that the distribution of its requested content follows a Zipf distribution, in both the locality disruption and false locality attacks.

Park et al. [93] proposed a cache pollution attack detection scheme based on randomness check. They proposed an iterative scheme that takes advantage of matrix ranking and sequential analysis for detecting a low-rate cache pollution attack, in which an attacker requests content chunks at a low rate to bypass any rate filters. The detection scheme starts with the routers mapping their cached content onto an $n \times n$ binary matrix $M$, where $n \simeq [\sqrt{S_c}]$ and $S_c$ is the average number of cached contents. The authors employ two cryptographic hash functions for mapping a content name to the row and column indices. The rank of matrix $M$ is evaluated using the Gaussian elimination method. The ranking process is iterated $k$ times, and the attack alarm is triggered if the matrix rank reaches a pre-defined threshold. To increase detection accuracy, the authors used a cumulative-sum algorithm over the iterations. As they were interested in low-rate attacks, the scheme does not consider popular contents. The popular contents are removed from the matrix over the $k$ iterations by AND and XOR operations performed on $M$. The authors showed the effectiveness of their scheme in detecting low-rate locality-disruption attacks.
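To make the matrix-ranking step concrete, the following Python fragment gives a rough sketch of a single iteration. Several points are assumptions made here and are not taken from [93]: the rank is computed over GF(2), SHA-256 and BLAKE2b stand in for the two unspecified hash functions, rows are stored as integers, and the cumulative-sum step and the removal of popular contents are omitted.

```python
import hashlib


def build_matrix(names, n):
    """Map each cached content name to one cell of an n x n binary matrix.

    Each row is stored as an n-bit integer; bit c of rows[r] is cell (r, c).
    """
    rows = [0] * n
    for name in names:
        data = name.encode("utf-8")
        r = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
        c = int.from_bytes(hashlib.blake2b(data).digest(), "big") % n
        rows[r] |= 1 << c
    return rows


def rank_gf2(rows):
    """Rank of a binary matrix via Gaussian elimination over GF(2) (assumed field)."""
    rows = list(rows)
    rank = 0
    for i in range(len(rows)):
        pivot = rows[i]
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit serves as the pivot column
        for j in range(i + 1, len(rows)):
            if rows[j] & low:
                rows[j] ^= pivot  # clear the pivot column in later rows
    return rank


def alarm(names, n, rank_threshold):
    """Raise the alarm when the matrix rank reaches the pre-defined threshold."""
    return rank_gf2(build_matrix(names, n)) >= rank_threshold
```

The intuition is that many distinct, rarely repeated names spread over the matrix and push the rank up, whereas a cache dominated by a few names leaves most cells empty.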
However, this scheme is not applicable to the harder-to-detect false locality attack. Furthermore, the caching routers have to perform computationally intensive operations such as matrix generation, popular content elimination, cryptographic hashing, and iterative rank calculations, which may undermine not only scalability, but also adoptability.

Karami et al. [69] proposed an Adaptive Neuro-Fuzzy Inference System (ANFIS) based cache replacement policy resilient to cache pollution. The proposed replacement policy involves three stages: input-output data pattern extraction, accuracy verification of the constructed ANFIS structure, and integration of the constructed model as a cache replacement policy. In the first stage, an ANFIS structure is constructed according to the properties of the cached content. Longevity (the period that the

TABLE III: Cache Pollution Countermeasures

Mechanism | Detection & Mitigation Approaches | Attack Type
Conti et al. [39] | Random Content Sampling for Attack Threshold Detection | Locality Disruption
Karami et al. [69] | Adaptive Neuro-Fuzzy Inference System Replacement Policy | Locality Disruption & False Locality
Mauri et al. [86] | Honeypot Installation & Hidden Monitoring | False Locality (by Content Provider)
Park et al. [93] | Cached Content Matrix Ranking | Low-rate Locality Disruption
Xie et al. [119] | Probabilistically Caching Popular Content | Locality Disruption

content has been cached), content request frequency, standard deviation of the request frequency, the last content retrieval time, content hit-ratio, and the variance of the content request rate over all interfaces are fed, for each cached content, to a nonlinear system. The system returns a goodness value (between 0 and 1) per content, where 0 indicates false-locality, 0.5 indicates locality-disruption, and 1 indicates a valid content. The system iteratively evaluates the goodness of the cached contents with longevity higher than a pre-defined threshold. Eventually, the system selects the contents with goodness values less than the threshold, ranks the remaining contents, and applies cache replacement over the content with lower goodness values. The authors showed the advantages of their proposed mechanism over CacheShield in terms of hit damage-ratio (proportion of hits that cannot occur due to the attack), percentage of honest consumers receiving valid contents, and communication overhead. However, this mechanism needs to store historical and statistical information for each cached content, which could be a significant memory overhead, especially for core routers. Additionally, the routers are required to iteratively compute statistics and update the cache state, additional computations that can undermine scalability.

Mauri et al. [86] discussed a cache pollution scenario in an NDN network in which the provider is the attacker and its intent is to maliciously utilize the router’s cache to preferentially store its own content objects and reduce their delivery latency. The authors assume that the malicious provider has access to terminal nodes (bots or zombies), which request the content items available at the attacker. This allows a greater proportion of the attacker’s content to move down to the edge of the network, thus improving its latency of delivery to legitimate clients, under the assumption that requests are routed to the nearest replica.
At the same time, other contents experience relatively higher delay, even if they are truly popular. The authors proposed a mitigation mechanism for this attack that used a honeypot installed close to potential zombies, which monitors and reports the malicious interests to the upstream routers. A router gathers these interests into a blacklist; the interests in this blacklist are routed using the standard NDN routing protocol, which routes based on the FIB entry not the CS or nearest replica. In Table III, we summarize the proposed cache pollution solutions based on their detection and mitigation approaches, and the nature of the attack. We also present the nature of the storage and computation overheads for each solution at the routers.

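TABLE III (continued): Router’s Overhead

Mechanism | Storage | Computation
Conti et al. [39] | Low | Moderate
Karami et al. [69] | Moderate | High
Mauri et al. [86] | Moderate | Low
Park et al. [93] | Low | High
Xie et al. [119] | Moderate | High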

D. Secure Naming and Routing

1) Secure Naming: Wong et al. [116] proposed a secure naming scheme with the objective of establishing trust between content providers and clients. The proposed scheme is based on three identifiers: authority identifier (ID), which is generated from the provider’s public key; content identifier, which is the cryptographic hash of the content; and algorithmic identifier, which binds the content identifier with a set of the content fragment/chunk identifiers. Based on the URI naming convention, the authority field is mapped to the provider’s public key and the resource path field holds the content identifier. This is similar to the naming scheme in DONA. In this scheme, the generated metadata, which includes information such as the content ID, provider ID, algorithmic ID, digital signature, and additional content-specific information, is disseminated to a set of network nodes that function as part of a domain name system and also store metadata in a DHT. For content retrieval, a client queries the DNS to resolve the content name into a digital certificate. By extracting the authority identifier from the certificate, the client obtains the metadata that has to be resolved by the DHT. The query to the DHT returns the content and algorithmic ID, which the client uses to request the content. The authors have not evaluated the scalability of the approach in terms of the header overheads and the latency due to DNS and DHT queries.

Similar to the previous scheme, Dannewitz et al. [42] proposed a naming scheme for NetInf. The authors defined a tuple composed of the content ID, the content, and a piece of metadata called the information object (IO). The content ID follows a self-certifying flat structure containing type, authentication, and label fields. The type field specifies the hashing function used for ID generation.
The authentication field is the hash value of the provider’s public key, and the label field contains a number of identifier attributes and is unique in the provider’s domain. The IO includes fields such as the provider’s complete public key and its certificate, a signature over the self-certified data, the hash function used for the signature, and any additional information for the owner’s authentication and identification. One of the drawbacks of this naming scheme is the significant overhead of the headers, which include the public key and the certificate. The authors do not discuss what portion of the metadata is fetched in the beginning, or whether signature verification happens per chunk or only after the whole content is downloaded. Verification after the whole content is downloaded is undesirable as it enables cache poisoning and pollution attacks.
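The self-certifying property of such an ID can be illustrated with a short Python sketch: a receiver recomputes the hash of the public key carried in the IO and compares it with the authentication field. The names `ContentID`, `make_content_id`, and `key_matches_id`, as well as the byte encoding of the key, are hypothetical; the sketch covers only the key-to-ID binding and leaves out verification of the signature over the data.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentID:
    """Flat self-certifying ID with type, authentication, and label fields."""
    hash_type: str         # hash function used for ID generation, e.g. "sha256"
    authentication: bytes  # hash of the provider's public key
    label: str             # identifier unique within the provider's domain


def make_content_id(hash_type, public_key, label):
    """Build an ID whose authentication field is the hash of the public key."""
    digest = hashlib.new(hash_type, public_key).digest()
    return ContentID(hash_type, digest, label)


def key_matches_id(content_id, public_key):
    """Check that a public key taken from the IO hashes to the authentication field."""
    digest = hashlib.new(content_id.hash_type, public_key).digest()
    return hmac.compare_digest(digest, content_id.authentication)
```

A key that passes this check can then be used to verify the signature in the IO; a key that fails it cannot belong to the provider named by the ID.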

Zhang et al. [127] proposed a name-based mechanism for efficient trust management in content-centric networks. This mechanism takes advantage of identity-based cryptography (IBC), in which either the provider’s identity or the content name prefix is used as the public key. In this mechanism, a trusted private key generator (PKG) entity is responsible for generating private keys given an identity (public key). To do so, the PKG generates a set of public system parameters and a master key, which it keeps secret and uses to generate private keys. To sign a content, a provider first securely acquires its private key (corresponding to its identity) from the PKG. After that, it signs the content and publishes it to the network. A client, using the public system parameters and the identity of the provider, can easily verify the content signature. This procedure can also be performed using the content name prefix as the public key; in this case, a name resolution service is required to register the name prefix. For confidentiality, a provider may encrypt the content with the client’s public key, which can be derived from the client’s identity using the system parameters. Meanwhile, the client has to securely communicate with the PKG to receive its private key. For group-based communication, the provider encrypts the content with a symmetric key and encrypts the symmetric key using the group members’ public keys. Despite the advantages of IBC, PKI is still necessary to secure communication between the PKG and other network entities. Additionally, the use of the content name prefix as the public key needs to be investigated more thoroughly. Another significant drawback is that the scheme requires the client to receive its private key from a trusted third party, which seriously undermines the usability of this scheme in the real world.

Hamdane et al. [63] proposed a hierarchical identity-based cryptographic (HIBC) naming scheme for NDN.
This scheme ensures a binding between a content name and its publisher’s public key. Their identity-based content encryption, decryption, and signature mechanisms follow [127]. Unlike the previous work, the authors proposed a hierarchical model in which a root PKG is responsible only for generating private keys for the domain-level PKGs. In this hierarchical model, the domain-level PKGs generate the clients’ private keys. The identity of an entity is represented as a tuple, composed of its ancestor PKGs’ identities and its own identity. Therefore, a PKG at level t derives the private key of its child entity, whose ID is a (t+1)-tuple of the form (ID_1, ID_2, ..., ID_t, ID_{t+1}), where ID_{t+1} refers to the child entity’s own ID. In this scheme, the ID tuple is used as the public key and the cryptographic operations are similar to classical IBC. This scheme has the same scalability concerns as the previous scheme on account of the encryption/decryption costs. In fact, the overhead is higher because the public key is longer and grows with the depth of the hierarchy. Table IV summarizes the existing secure naming schemes and presents the type of cryptography used, the mechanism for ensuring provenance, and the nature of the encryption infrastructure. We note that most of the proposed naming schemes incur significant overhead, and there needs to be

more effort in reducing these overheads, or at least amortizing their cost across the complete set of interests/responses.

2) Secure Routing: Afanasyev et al. [15] proposed a secure namespace mapping scheme, which allows interest forwarding for name prefixes that are not in the FIB; this is important in the event of node mobility. The proposed mechanism is built upon two main concepts: link object and link discovery. The link object is an association between a name prefix and a set of globally routable prefixes. By creating and signing a link object, the content owner maps its own name prefix to those globally routable prefixes. The authors designed an NDN-based DNS service (NDNS), which stores the mapping between the name prefix and the globally routable prefixes and provides this mapping (delegations) to a requesting entity. For link discovery, a client queries the NDNS iteratively for each component of the requested name prefix. If a client sends an interest that a router cannot satisfy using its FIB, that router returns a NACK. After the NACK reaches the client, its local forwarder discovers and validates the link object corresponding to the name prefix. After that, the client embeds the link object in its original interest and forwards it to the network. Although this scheme is a good initial solution to provider mobility, it still has overheads. When a provider moves, the current routable prefix, which is in the FIB of the routers, will result in interests being routed to the provider’s former location until the FIB entries time out, which may waste bandwidth in high-traffic scenarios.

Rembarz et al. [95] proposed two approaches to secure the communication between public and private domains in NetInf. The first, the gateway-centric approach, places a gateway between the public and private networks. All communication between these two networks is routed through this gateway.
A publisher in the private domain publishes a content to a private name resolver (PNR), which resides in the private domain. The PNR informs a public name resolver (NR) in the public domain of the published content’s identifier along with the gateway’s location, instead of the actual publisher’s location. A subscriber in the public domain resolves the content identifier at the public NR and obtains the gateway address. Upon successful authentication of the subscriber at the gateway, the gateway resolves the content identifier at the PNR and delivers the content from the publisher to the subscriber. In the second approach, the publisher in the private domain publishes its private data identifier to a PNR. The PNR creates a mapping between the content identifier ID and a generated alternative identifier ID’ that is sent to the NR. A subscriber in the public domain contacts the NR to resolve ID’ to its location. The NR redirects the subscriber to the PNR for authentication and authorization. Upon successful authentication, the PNR provides a token to the subscriber, which the subscriber uses for content retrieval from the publisher. By eliminating the gateway, this mechanism removes the drawbacks of the first approach, in which the gateway is a single point of failure and a network bottleneck. However, the PNR’s computation and communication overhead for subscriber authentication and authorization (especially

TABLE IV: Secure Naming Approaches

Mechanism | Crypto | Provenance
----------------------------------------------------
Dannewitz et al. [42] | RSA | Pub. Key Digest
Hamdane et al. [63] | HIBC | IBC Signature
Wong et al. [116] | RSA | Pub. Key Digest
Zhang et al. [127] | IBC | IBC Signature

when the private network serves large amounts of requests) undermines the scalability of this approach.

Alzahrani et al. [19], [20] proposed a DoS-resistant self-routing mechanism using Bloom filters for publish/subscribe networks. In publish/subscribe networks, each network link is assigned a unique identifier (LID), which is represented in the form of a Bloom filter. When a network entity requests a path from the client to a location where the content exists (either the publisher or a cache), an entity called the topology manager (TM), which resides in one or more routers, generates a filter (z-filter) that specifies the delivery path from a publisher to its subscriber by OR-ing the Bloom filters (LIDs) of the links on the delivery path. At the intermediate routers, an AND operation between the z-filter (in the packet header) and the routers’ LIDs on the path indicates the delivery links. This mechanism is vulnerable to a DoS attack in which an attacker collects enough z-filters and reuses them to overload the delivery path with bogus traffic. The authors suggested the use of dynamic link identifiers to remedy this vulnerability. In the proposed mechanism, the TM creates a new z-filter (for each time interval) considering the incoming/outgoing interfaces of the routers on the delivery path, a time-based secret (shared between the TM and intermediate edge routers), and the flow ID (the information item ID). These per-flow, time-sensitive z-filters restrict the duration for which the attacker can use them (after that they become stale). This updated mechanism introduces two drawbacks. First, the number of z-filter updates increases as the time interval decreases; thus, better attack mitigation requires higher computational overhead for the TM. Second, the size of the packet header (which includes the z-filter) increases with the number of links in the delivery path. For this reason, the authors investigated factors that affect the z-filter’s size in [21].
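The basic OR/AND forwarding rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the filter width, the number of bits set per link, the hash-based derivation of LIDs, and all function names are assumptions, and the dynamic (per-flow, time-based) identifiers are not modeled.

```python
import hashlib

FILTER_BITS = 256   # assumed filter width
HASHES = 5          # assumed number of bits set per link identifier

def link_id(link_name: str) -> int:
    """Derive a Bloom-filter-style LID with at most HASHES bits set."""
    lid = 0
    for i in range(HASHES):
        digest = hashlib.sha256(f"{link_name}|{i}".encode()).digest()
        lid |= 1 << (int.from_bytes(digest[:4], "big") % FILTER_BITS)
    return lid

def build_zfilter(path_links):
    """Topology manager: OR together the LIDs of the links on the delivery path."""
    z = 0
    for name in path_links:
        z |= link_id(name)
    return z

def outgoing_links(zfilter, router_links):
    """Router: forward on every local link whose LID is fully contained in the z-filter."""
    return [n for n in router_links if (zfilter & link_id(n)) == link_id(n)]
```

Every link on the path always matches; a link that is not on the path matches only if all of its bits happen to be set already, which is the false-positive case.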
One of the factors affecting the Bloom filter size is its inherent false positive probability: the bigger the filter, the smaller the false positive probability. False positives may cause more network traffic by selecting additional links that are not in the delivery path. An attack scenario arising from false positives involves an attacker who maliciously turns some 0 bits in the z-filter into 1s to add more links to the delivery path. In another attack scenario, an attacker can launch a replay attack using a valid filter in order to send bogus content that was not requested.

Alzahrani et al. [22] proposed a key management protocol for publish/subscribe networks that utilizes dynamic link identifiers. Building on the mechanism proposed in [19], [20], the authors proposed an enhancement that prevents a malicious publisher from generating fake z-filters by creating a mechanism for the publisher’s edge router to verify the TM-generated

TABLE IV, Drawbacks column: Lack of Evaluation & Scalability Issue; Signature Verification Overhead; PKG Requirement for Private Key Generation; Scalability Issue & Public Key Length

z-filter. For the content of a particular publisher, the TM uses the symmetric key that it shares with the publisher’s edge router to cryptographically hash the corresponding z-filter and the z-filter generation timestamp, and forwards the hash to the publisher along with the z-filter. The publisher sends the packet, including the received information, to its edge router, which checks the validity of the z-filter by comparing the received hash (generated by the TM) with the one it generates. Upon successful validation, the edge router stores the z-filter in a table for its TTL period, and forwards all subsequent packets with that z-filter without further validity checks. Because the Diffie-Hellman protocol is vulnerable to man-in-the-middle attacks, the authors used Diffie-Hellman-DSA, as proposed in [80], to achieve mutual authentication between the TM and the routers. However, the proposed mechanism is vulnerable to a malicious publisher colluding with its edge router. In addition, this mechanism requires stateful routers, which are vulnerable to flooding-based DoS attacks (similar to the CCN/NDN DoS-flooding attack).

Yi et al. [122] augmented the NDN forwarding plane to thwart security problems, such as prefix hijacking and PIT overload. In prefix hijacking, an attacker announces the victim’s prefix and drops the packets it attracts. The authors suggested the use of interest NACKs whenever requests are not satisfied for reasons such as network congestion, non-existent content, and duplicate content. The interest NACK reduces the chance of PIT overload by allowing the reduction of the PIT timeout to a value close to the network RTT, where previously the PIT timeout was much greater than the RTT. Additionally, it mitigates the prefix hijacking vulnerability by providing extra time for the router to query other faces.
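The effect of NACKs and RTT-scaled PIT timeouts on face retry can be pictured with the toy model below. It is only a sketch under simplifying assumptions: the `Face` record, the per-face RTT estimate, the `slack` factor, and the three canned behaviours are all hypothetical and are not part of the forwarding plane described in [122].

```python
from dataclasses import dataclass

@dataclass
class Face:
    name: str
    rtt_ms: float     # RTT estimate the router keeps for this face
    behaviour: str    # "data", "nack", or "drop" (e.g. a hijacker that discards interests)

def pit_timeout_ms(face: Face, slack: float = 1.5) -> float:
    """PIT timeout set close to the measured RTT instead of a long fixed value."""
    return slack * face.rtt_ms

def forward(faces):
    """Try faces in order of increasing RTT; return (answering face or None, time spent in ms)."""
    waited = 0.0
    for face in sorted(faces, key=lambda f: f.rtt_ms):
        if face.behaviour == "data":
            return face.name, waited + face.rtt_ms
        if face.behaviour == "nack":
            waited += face.rtt_ms             # explicit NACK arrives after about one RTT
        else:
            waited += pit_timeout_ms(face)    # silent drop: wait out the short timeout
    return None, waited
```

In this model a face that silently drops interests costs the router only a short timeout before the next face is tried, at the price of keeping an RTT estimate as extra state.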
However, it increases the amount of state that routers must store: each router has to store information about the RTT for each interest, and for a router that receives requests at line speed, this can be a large amount of state. Although a router’s storage can possibly handle this volume, it opens a new avenue for attackers to pollute this information base. Additionally, with the NACK consuming an interest in the PIT, there is no scope for bogus interest aggregation; thus, an attacker can keep sending the same bogus interest several times without any adverse action.

E. Application-level Security

Work on ICN application security can be classified into three major subtopics: filtering, anomaly detection, and application security suites. Filtering concerns the identification and removal of unwanted content, such as spam, forged content, and content from untrusted publishers. Anomaly detection involves the detection of other types of undesired activity, such as

flooding, misbehavior of network elements, and malicious traffic. We have designated application-specific security measures as application security suites; these suites combine different cryptographic techniques to achieve some specific goal(s).

Fotiou et al. [49] proposed an anti-spam mechanism for publish/subscribe networks. It is based on an information-ranking process, with content ranked based on votes from publishers and subscribers. Each publisher serving a content implicitly votes for that content, and the publishers’ votes are weighted based on the publishers’ own ranks and publication counts. After the content is published, it can be voted on by subscribers. Each subscriber’s vote is weighted inversely to the total number of votes it has cast. After all votes are collected, the information can be used to rank the content objects and identify content objects that are likely to be spam. The simulations demonstrated that this mechanism is better at filtering spam than existing schemes, which only consider the publisher’s vote when ranking content. However, this scheme’s reliance on user feedback may degrade its effectiveness in a real deployment; not only are typical users unlikely to vote on the content, but malicious users can easily hijack the process by making up the majority of voters. Moreover, the voting process itself incurs non-negligible communication overhead.

Karami et al. [70] proposed a fuzzy anomaly detection algorithm for content-centric networks. It employs the Particle Swarm Optimization (PSO) meta-heuristic algorithm, k-means clustering, and a fuzzy detection algorithm to classify behaviors as either normal or abnormal. The detector must first be trained, and thereafter can be used to identify potentially malicious traffic. The fuzzy approach is notable for its low false-positive rate. However, this comes at the cost of an increased false-negative rate.
Therefore, it may be possible for an attacker with sufficient resources to produce enough traffic such that some of its malicious packets are not detected. Additionally, a false positive results in a legitimate user’s quality of service being degraded: the user may be wrongly punished.

Wong et al. [117] proposed a separate security plane for publish/subscribe networks, which would be responsible for assuring content integrity. The security plane takes over the distribution of authentication materials and associated content metadata, which would otherwise be the responsibility of the data plane. The materials distributed by the security plane would therefore include the content name, the content ID, the Merkle tree root, the publisher’s public key, and the publisher’s signature. To prevent the insertion of malicious metadata, publishers are obligated to identify themselves to the security plane and submit to challenge-response authentication. We believe that while it is convenient for data to be separated from its authentication materials, abstraction of this functionality into a separate control plane is ultimately unnecessary. The integrity assurances of the proposed control plane are no more flexible and no stronger than those provided by simpler content-signing schemes, such as the manifest-based content authentication supported by CCN or NDN.

Goergen et al. [61] designed a semantic firewall for content-centric networks. Unlike IP firewalls, which filter at flow-level granularity, the CCN firewall can filter content based on provider and/or name. For provider-based filtering, the firewall must obtain the provider’s public key, which is then used to identify disallowed providers and also filter content objects with invalid signatures. Content name filtering is a more convenient filtering paradigm, in which requests with blacklisted keywords in the name are filtered. Both types of filtering can be performed on either interests or their corresponding content objects. Additionally, the firewall can monitor the behavior on each of its interfaces and filter peers that show abnormal behavior, such as high request volume or high drop rate. A minimalistic evaluation of the proposed firewall shows that latency does not increase dramatically with the number of filtering rules: retrieval time increased by only 1.75% for a 500 MB content. However, these delays may become significant for bigger content objects and as the number of content objects increases, which has not been addressed. Thus, the scalability of the approach is unknown.

Goergen et al. [60] proposed a security monitoring mechanism for CCN with the objective of detecting attack patterns based on the activities of the FIB, PIT, and CS. To detect abnormal behavior, each node periodically evaluates statistical per-second information such as bytes sent, bytes received, content items received, and interests received. In addition, statistics are stored on the total number of accepted, dropped, and sent interests. In order to classify a particular time period as either anomalous or benign, the authors employ support vector machine (SVM) classification. The experimental results show the efficacy of this method for attack detection; however, its ability to detect certain low-rate attacks is questionable. Furthermore, the added responsibility of SVM classification, which is relatively compute-intensive, creates unnecessary load on core network elements.

Ambrosin et al.
[23] identified two different ways of creating an ephemeral covert channel in named-data networking. For both types of covert channel, sender and receiver must have tight time synchronization and agree on a set of unpopular content to use for the exploit. To send a “1” bit covertly, the sender requests an unpopular object during the prescribed time slot; to send a “0”, no request is sent. In the first variation, the object is assumed to be cached at the edge router if it was requested. The receiver then requests the same content and measures the retrieval time to differentiate a cache hit from a cache miss, and consequently infers the bit that was sent. This mechanism is accurate only when the sender and receiver are co-located behind the same edge router; therefore, its applicability is limited. Furthermore, the covert transmission is very limited in bandwidth, and imposes a large load on the network.

Burke et al. [30] presented a security framework for a CCN-based lighting control system. In the first variation of the protocol, control commands required a three-way handshake and were transmitted in a signed content payload; in the second, the commands were immediately sent as a signed interest. The framework uses an authentication manager to manage the network’s PKI, and employs shared symmetric keys for communication. To reduce the burden of key storage on the

TABLE V: Application Security Summary

Mechanism | Application | Approach
----------------------------------------------------
Ambrosin et al. [23] | Ephemeral covert channel | Time difference analysis between cache hit and cache miss
Asami et al. [26] | Moderator-controlled information sharing | Publisher signature followed by moderator signature for message publications
Burke et al. [30] | Lighting control system | Submitting commands as signed content or signed interest
Burke et al. [31] | Secure sensing in IoT | Assigning a sensor an ACL for content publishing
Fotiou et al. [49] | Anti-spam mechanism | Information ranking based on publishers’ and subscribers’ votes
Goergen et al. [60] | Traffic anomaly detection at routers | Statistical data analyses and SVM classification
Goergen et al. [61] | Semantic firewall | Filtering by content name, provider’s public key, and anomaly detection
Karami et al. [70] | Anomaly detection mechanism | Fuzzy detection algorithm and traffic clustering
Saleem et al. [98] | Secure email service | Asymmetric crypto with emails as independent objects
Vieira et al. [108] | Security suite for Smart Grid | Content-based cryptography and access level distribution via security server
Wong et al. [117] | Content integrity by security plane | Content signature and publisher authentication to security plane by challenge-response
Yu et al. [123] | Trusted data publication/consumption | Schematized chain-of-trust

embedded devices, these symmetric keys can be generated on demand by a pseudorandom function. These shared symmetric keys can then be used to enforce encryption-based access control. The authors in [31] employ a similar architecture to perform secure sensing in the Internet of Things (IoT). The system requires an authorization manager (AM), a trusted third party that generates root keys, which are then used to sign any other keys produced. The AM associates a producer with a namespace, which is listed in the producer’s certificate. Each sensor is also assigned an access control list, which specifies the permissions of each application with respect to that node. While this scheme is flexible in providing confidentiality, integrity, authentication, and access control for IoT networks, it suffers from a significant overhead problem: power-constrained devices such as sensing nodes are required to perform asymmetric-key cryptographic operations, which are expensive.

Yu et al. [123] presented a schematized trust model for named-data networks to automate data authentication, signing, and access procedures for clients and providers. The proposed model is composed of two components: a set of trust rules, and trust anchors. Trust rules define associations between data names and the corresponding keys that are used to sign them. The authors define a chain of trust, which is discovered by recursively evaluating trust rules, starting from the KeyLocator field in the content and ending at a trusted anchor. Anchors are envisioned to serve as trusted entities that help bootstrap the key discovery process. For data authentication, the client uses the public key in the KeyLocator of the packet and, according to the trust schema, recursively retrieves public keys to reach a trust anchor. It authenticates the data packet by verifying the signatures from the trust anchor to the received packet (in reverse).
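The chain discovery step described above can be sketched as a walk over KeyLocator references that is checked against trust rules at every hop. The rule syntax (plain regular expressions), the `/blog/...` names, and the helper names are all hypothetical choices for this illustration and do not reproduce the schema language of [123]; cryptographic signature verification is deliberately left out.

```python
import re

# Hypothetical trust rules: (data-name pattern, pattern the signing key name must match).
TRUST_RULES = [
    (r"^/blog/article/[^/]+$", r"^/blog/author/[^/]+/KEY$"),
    (r"^/blog/author/[^/]+/KEY$", r"^/blog/admin/KEY$"),
    (r"^/blog/admin/KEY$", r"^/blog/KEY$"),
]
TRUST_ANCHOR = "/blog/KEY"

def rule_allows(data_name, key_name):
    """True if some trust rule permits key_name to sign data_name."""
    return any(re.match(d, data_name) and re.match(k, key_name) for d, k in TRUST_RULES)

def find_chain(packet_name, key_locator, max_depth=8):
    """Follow KeyLocator references from a packet up to the trust anchor.

    key_locator maps a packet or key name to the name of the key that signed it.
    Returns the names from the packet to the anchor, or None if a rule is
    violated, a key is missing, or the chain is longer than max_depth.
    """
    chain = [packet_name]
    current = packet_name
    for _ in range(max_depth):
        signer = key_locator.get(current)
        if signer is None or not rule_allows(current, signer):
            return None
        chain.append(signer)
        if signer == TRUST_ANCHOR:
            return chain
        current = signer
    return None
```

A verifier would then check the signatures along the returned chain in reverse order, from the anchor down to the packet.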
To sign content, a client identifies the applicable trust rule and, if it has a corresponding key, signs the content. If the key does not exist, the client generates the corresponding key according to the name and the cryptographic requirements. A generated key needs to be signed according to the trust rule. The chain of trust allows one entity to publish verifiable content and another to verify the content by following the chain. The iterative discovery and key verification step is inherently inefficient, especially for mobile devices that are power

constrained. Further, the trust rules may become complex very quickly within a few levels, which may result in inaccurate configuration during usage. This limits the applicability of the approach.

Vieira and Poll [108] proposed a security suite for C-DAX, an information-centric Smart Grid communication architecture. The proposed security suite employs content-based cryptography, in which content topics are used as public keys, and the corresponding secret keys are generated by a security server. For each topic, write-access secrets and read-access secrets must be distributed to each authorized publisher and subscriber, respectively. While the scheme provides sufficient security and flexibility for typical applications, its reliance on a central security server constitutes a single point of failure. In a high-impact application such as the Smart Grid, the failure or compromise of this service could have dire consequences. Requiring cyber-physical devices to store two keys for each topic also limits scalability.

Saleem et al. [98] proposed a distributed secure email service for NetInf, based on asymmetric-key cryptography. In line with the principles of ICN, each email message is treated as an independent object. A client’s (user’s) public key constitutes its identifier, and no domain name service is required; therefore, the scalability of the proposal is good. However, the subscription-based nature of the service potentially leaves users vulnerable to spam, and no mitigation for this has yet been proposed.

Asami et al. [26] proposed a moderator-controlled information sharing (MIS) model for ICN, which provides Usenet-like functionality for ICNs while leveraging an identity-based signature scheme. Several message groups are defined, each of which is assigned a moderator.
To publish a message, the publisher signs it with its secret key and sends it to the moderator of the group to which it wants to publish; the moderator can then sign the message and relay it to the group’s subscribers, or reject the message and drop it. To verify a signature, the subscriber only needs to know the identities of the publisher and moderator. This is an example of implementing a secure legacy application in ICN. Table V summarizes the proposed application-level mechanisms.
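To make the cache-timing covert channel of [23], discussed earlier in this subsection, more concrete, the following toy simulation models the edge router’s cache as a set and decodes bits from simulated retrieval times. The latency values, the threshold, and all function names are assumptions chosen for illustration; real measurements would be noisy.

```python
def send_bits(bits, slot_names, edge_cache):
    """Sender: in slot i, request slot_names[i] to signal 1; stay silent to signal 0."""
    for bit, name in zip(bits, slot_names):
        if bit == 1:
            edge_cache.add(name)      # the request leaves the object cached at the edge router

def retrieval_time_ms(name, edge_cache, hit_ms=2.0, miss_ms=40.0):
    """Receiver's probe: a cache hit is fast, a miss is fetched upstream and is slow."""
    hit = name in edge_cache
    edge_cache.add(name)              # probing caches the object too, so each name is single-use
    return hit_ms if hit else miss_ms

def receive_bits(slot_names, edge_cache, threshold_ms=10.0):
    """Receiver: decode 1 for a fast (cached) retrieval, 0 for a slow one."""
    return [1 if retrieval_time_ms(n, edge_cache) < threshold_ms else 0 for n in slot_names]
```

Because the receiver’s own probe caches the object, a second read of the same names returns all ones, which illustrates why the channel is ephemeral and why the two parties must agree on a fresh set of unpopular names.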

F. Other General Contributions In this subsection, we review existing work which highlighted, classified, and addressed ICN security problems in general. In what follows, we discuss security concerns in NetInf [81], PSIRP [51], and publish/subscribe architectures in general [50], as well as concerns related to ICN’s stateful data plane [110]. Loo et al. [81] studied the security challenges faced by NetInf from the perspectives of both applications and infrastructure. The authors divided their concerns into eight categories: access control, authentication, non-repudiation, data confidentiality, data integrity, communication security, availability, and privacy. Application-layer concerns included poisoned content injection, privacy invasion, unauthorized access, and false accusation. At the infrastructure level, the authors elaborated on the threats of unauthorized content access, privacy, and cache and route misuse. Solutions discussed by the authors included provider authentication and authorization, Tor-like approaches to privacy preservation, and PKI-based approaches to signature verification and content integrity. Unfortunately, the descriptions of the proposed solutions are shallow, and their generality raises concerns about their efficiency and flexibility for application in ICN. Fotiou et al. [51] reviewed a clean-slate PSIRP networking architecture and highlighted its security assurances. The architecture employs self-certifying names, each composed of a rendezvous identifier (RID) and a scope identifier (SID). Content publication and client subscription operations follow the same general approach as any publish/subscribe network. To preserve information security, content transmissions are encrypted and include packet-level authentication (PLA). Under PLA, the packet header contains the sender’s signature along with its public key and certificate. 
The forwarding mechanism utilizes the z-filter, a Bloom filter generated by the topology manager to define the information delivery path. The z-filters protect against DoS attack by using dynamic link identifiers, as explained in [19], [20]. In the previous subsection, we discussed concerns about the z-filter, related to scalability and false positives. Apart from that, the other main drawback of this design is the use of per-packet cryptographic signatures in PLA. Performing such operations at line speed is difficult, even for routers equipped with embedded cryptographic processors. Fotiou et al. [50] discussed the security requirements and threats in publish/subscribe networks, and presented some preliminary solutions towards a secure rendezvous network. The authors highlighted client privacy, access control, content integrity, confidentiality, and availability as the most important security concerns for publish/subscribe networks. Additionally, rendezvous-based networks are required to handle subscriber/publisher authentication, anonymity for user subscriptions and subscription-publication matching, and accounting for publication dissemination. A lack of attention to these security concerns would lead to vulnerabilities such as cache poisoning, denial of service, route hijacking, and attacks where a malicious

entity creates multiple fake identities (Sybil attack). The authors broadly discussed a solution which employs a key management center, role-based and attribute-based access control, and homomorphic cryptography. This mechanism mitigates many of the poisoning and hijacking attacks; however, privacy concerns and denial-of-service attacks have not been addressed. Attribute-based access control, in general, also has limited flexibility due to its inherent lack of support for revocation.

Ghali et al. [55] proposed a secure fragmentation mechanism for content-centric networks. Unlike the chunking procedure already performed by content providers, content fragmentation may happen anywhere in the network, which is necessary if a chunk larger than a link MTU (maximum transmission unit) must be forwarded. The authors argued for per-hop reassembly of fragments, and concluded that support for interest packet reassembly should be mandated for the sake of routing efficiency. However, such reassembly requires a more sophisticated content integrity verification mechanism. Therefore, the authors proposed a method of incremental fragment verification for out-of-order fragment delivery. The router assigns a buffer for each chunk and verifies each incoming fragment using the information calculated from the previous fragments. Upon receiving the last fragment, the router determines the validity of the entire chunk and forwards the last fragment only if authentication succeeds. The simulation results show that retrieving a 32 KB content with the proposed fragmentation mechanism is about 2.5 times slower than baseline CCN. Though fragmentation increases the flexibility of the network, this is a very significant increase in latency.

Marias et al. [83] have identified security and privacy concerns which should be addressed by a future Internet architecture. The authors first reviewed recent achievements in physical layer security, network coding security, and network infrastructure security.
Then they identified authentication and identity management as core building blocks of a secure network, and discussed the challenges of implementing them. Regarding privacy, the authors reviewed some existing work in Internet of Things (IoT) privacy and highlighted communication anonymity as a necessity in a future Internet. Additionally, the authors highlighted the requirements and challenges for mobile application security and privacy. However, the authors did not elaborate on the attacks that are inherent to ICN, such as cache pollution, content poisoning, DoS/flooding, and the timing attack. Furthermore, a review of existing access control mechanisms for ICN has been neglected. Wahlisch et al. [110] discussed the threats and security problems that arise from stateful data planes in ICN. The authors categorized these attacks into three classes: resource exhaustion, state decorrelation, and path and name infiltration. Resource exhaustion attacks include those such as DoS, interest flooding, and PIT overload. Timeout attacks (orchestrated by throttling the network to increase latency) and jamming attacks are examples of state decorrelation attacks. Path and name infiltration attacks include route hijacking and route interception. Despite presenting a thorough attack classification, this paper

did not discuss any mitigation for the aforementioned attacks. As this subsection groups a set of unrelated works, we do not summarize them in a table. We also do not provide any summary or future research directions for this category.

G. Summary and Future Research

In this section, we reviewed the state-of-the-art in ICN security, specifically attacks such as denial of service, content poisoning, and cache pollution, as well as secure naming. Here, we summarize the existing challenges and suggest potential directions for exploration.

1) Denial of Service Attack: DoS attacks, in general, target the content routers [14], [38], [41], [112], [111], [91], the content providers [68], [54], [79], or both. An attacker tries to exhaust either the routers’ PITs or the content providers’ resources by requesting dynamic or non-existent content at a high rate, which causes unbounded service delays for legitimate clients. The majority of the proposed solutions [14], [38], [41], [54], especially those against interest flooding based DoS attacks, are variants of a rate-limiting mechanism applied to suspicious interfaces or name prefixes. The major drawback of rate-limiting based solutions is their impact on legitimate clients who are co-located with the attackers, are requesting dynamic content that has not been generated yet, or are mistakenly requesting non-existent content. No scheme performs per-flow rate limiting. The closest is the approach by Gasti et al. [54], where prefix-based rate limiting was proposed. There is a need for more fine-grained rate limiting to better distinguish malicious from benign requests. Other mechanisms, including per-interest client proof-of-work [79], fuzzy logic-based detection [111], statistical hypothesis testing [91], and increasing the caching time [68], have also been proposed to solve the problem. However, these mechanisms either require storage of per-content statistics at the routers or are not computationally scalable, especially in real time. A better mechanism may be one that removes the suspicious requests from the PIT [112], similar to the publish-subscribe Bloom filter based self-routing [8], [9].
This mechanism can be augmented by taking a self-routing approach for the suspicious interests and using the available stateful routing for the legitimate interests. Another potential direction is employing a software-defined networking (SDN) approach in which a network controller with an aggregated view of the network detects and mitigates the DoS attack at an early stage. This can be achieved through the collaboration of routers at different levels of the network hierarchy, specifically for filtering the communication flows that share malicious name prefixes. Exploiting a more sophisticated interest aggregation method, which aggregates the malicious interests with the same prefix (regardless of their suffixes) into one PIT entry, can also slow down PIT exhaustion. We also believe that some of the current IP-based detection and defense mechanisms [124] might be relevant for ICN DoS mitigation. This is a significant area of interest.
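The prefix-based interest aggregation suggested above can be illustrated with a minimal sketch. The class name `PrefixAggregatingPIT`, the slash-separated name format, and the externally supplied set of suspicious prefixes are all assumptions made for illustration; none of them come from a specific ICN implementation or from the surveyed works.

```python
class PrefixAggregatingPIT:
    """Toy Pending Interest Table (PIT) that collapses interests under
    suspicious prefixes into one entry per prefix (hypothetical design)."""

    def __init__(self, suspicious_prefixes):
        # Prefixes flagged by some external detector (assumed to exist).
        self.suspicious_prefixes = set(suspicious_prefixes)
        # Maps a PIT key to the set of faces the interest arrived on.
        self.entries = {}

    def _key(self, name):
        # Interests under a suspicious prefix share that prefix as their key;
        # all other interests are keyed by their full name.
        for prefix in self.suspicious_prefixes:
            if name == prefix or name.startswith(prefix + "/"):
                return prefix
        return name

    def insert(self, name, face):
        """Record an interest; return True if a new PIT entry was created."""
        key = self._key(name)
        is_new = key not in self.entries
        self.entries.setdefault(key, set()).add(face)
        return is_new

    def __len__(self):
        return len(self.entries)


pit = PrefixAggregatingPIT({"/attacker/video"})
for i in range(1000):
    # 1000 distinct suffixes under the same flagged prefix.
    pit.insert(f"/attacker/video/fake{i}", face=1)
pit.insert("/news/today", face=2)
print(len(pit))  # 2 entries instead of 1001
```

The sketch only shows how the PIT footprint shrinks. It does not address how a returning data packet would be matched against an aggregated entry, which a real design would have to resolve.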

2) Content Poisoning Attack: In this attack, the attacker’s goal is to fill the routers’ caches with fake content, that is, either content with valid names and invalid payloads or content with invalid signatures. All of the proposed mechanisms require the intermediate routers to verify the data packets’ signatures [54], [72], to compare the content hash in interest and data packets [54], [57], [59], or to rank the contents based on the clients’ feedback [58]. The signature verification based mechanisms [54], [72] are, in general, not scalable at line speed due to the high cost of cryptographic operations. The content ranking proposed in [58] relies solely on the clients’ feedback and hence is vulnerable to malicious clients. To effectively address the content poisoning attack, a detection mechanism should be employed with negligible cost at the intermediate routers. We believe that the hash verification based approach is the more promising one. Further study could identify a suitable cryptographic hash function. Another approach is to trace the fake content back to its origin by leveraging the history of each interface on the route. After successful detection of the attack origin, a mitigation mechanism can be orchestrated. For instance, a router may refrain from caching content chunks that arrive from a suspicious interface or that have the same name prefixes as the fake content. We believe that there is still a need for more efficient and scalable mitigation approaches.

3) Cache Pollution Attack: Cache pollution is divided into the false locality attack, in which an attacker tries to change the locality of the cache by requesting a set of unpopular content objects, and the locality disruption attack, in which the attacker creates fake popularity for unpopular contents. The objective of these attacks is to degrade cache effectiveness and increase the content retrieval latency.
Some of the proposed approaches [69], [93], [119] incur high computation cost at the intermediate routers, which undermines their scalability. Other proposed mechanisms either only detect the cache pollution attack [39] or address the less severe malicious provider attack scenario [86]. We believe that the key to solving the cache pollution attack lies in designing a robust caching mechanism, which not only increases the resiliency of the cache against these attacks, but also improves the overall network latency and the users’ quality of experience. One possible direction is to explore collaborative caching further. A variety of collaborative caching schemes have been proposed in the literature with the objective of improving cache utilization and reducing latency [114], [35], [120]. However, the positive impact of collaborative caching mechanisms on mitigating the cache pollution attack has not been explored. With collaborative caching and feedback between the caches, mechanisms can be designed to contain or root out cache pollution attempts.

4) Secure Naming and Routing: The content naming scheme is an integral component of ICN, especially for the routing and security functionalities. The lack of a verifiable binding between the content name and its provider simplifies the orchestration of the content poisoning attack. Even when the provider’s signature for the content provides this binding, the high cost of signature verification would prevent intermediate routers

from verifying the signatures of all arriving packets. Despite some initial efforts [42], [63], [116], [127] to make content naming secure, there is still a need for more scalable and computationally efficient approaches. The identity based cryptographic approaches [63], [127] require the client to trust a third party for private key generation, a practice that significantly undermines the applicability of these approaches. A secure and efficient naming scheme is still an open challenge. Any such scheme should include metadata, such as the content hash and the provider’s identity and signature, to enhance the security of the system. This is currently an important area of research, with proposals being made to the ICN Research Group of the Internet Research Task Force [5].

5) Application-Level Security: Different ICN applications and application-level mechanisms, such as content filtering, anomaly detection, and covert channels, have been proposed in the literature. The mechanisms proposed in [49], [60], [61], [70] tried to detect abnormal traffic at the intermediate routers, identified spam contents based on the subscribers’ and publishers’ votes, or performed content filtering through the firewall. In [30], [31], [108], the authors proposed ICN inspired architectures for lighting control systems, the Internet of Things, and the smart grid. In [123], Yu et al. proposed a chain-of-trust based schema that content publishers and consumers can use to share content. The authors in [117] suggested the separation of data and security planes for better content integrity assurance. Other proposed applications include ephemeral covert channel communication [23], secure email service [98], and moderator-controlled information sharing [26]. We have not found an application that incorporates all the security functionalities available in ICNs (in any architecture), nor did we find a comprehensive application-level security suite (again, in any architecture).
Filling this gap should be one of the interests of future researchers in this domain.
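As a concrete illustration of the hash verification approach favored in the content poisoning discussion of this summary, the following minimal sketch shows a router-side check that compares a digest carried in the interest with the digest of the returned payload. The choice of SHA-256 and the function names `content_digest` and `accept_data` are assumptions for illustration; the surveyed works leave the choice of hash function open.

```python
import hashlib
import hmac


def content_digest(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a payload (hash choice is assumed)."""
    return hashlib.sha256(payload).hexdigest()


def accept_data(expected_digest_hex: str, payload: bytes) -> bool:
    """Accept a data packet only if its payload matches the digest that the
    consumer placed in the interest (hypothetical router-side check)."""
    actual = content_digest(payload)
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_digest_hex.lower())


genuine = b"chunk-0 of /videos/demo"
digest_in_interest = content_digest(genuine)

print(accept_data(digest_in_interest, genuine))              # True
print(accept_data(digest_in_interest, b"poisoned payload"))  # False
```

A single hash computation per packet is far cheaper than a signature verification, which is why this style of check is considered more plausible at line speed. It does, however, require the consumer to learn the expected digest in advance.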

The timing attack has been explored in a large body of literature [90], [89], [13], [32], [37]. This attack targets the privacy of caches and clients that are co-located with the attacker. In a timing attack, an attacker probes content objects which it believes are cached at the shared router. The attacker leverages precise time measurements to distinguish cache hits from cache misses, and can thereby identify which contents are cached. A cache hit implies that the content has been requested by another client in the neighborhood, while a cache miss indicates that the content has not been requested (or has been evicted from the cache). An informed attacker can also ascertain whether the request is served by the provider or by a router somewhere along the path to the provider. As illustrated in Fig. 7, a shorter latency of retrieving content C1 in comparison to content C2 reveals the availability of C1 in the shared edge router’s cache.
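The hit-versus-miss decision described above can be sketched as a simple threshold test on measured round-trip times. The calibration procedure (midpoint between the median latencies of known hits and known misses), the function names, and the sample values are illustrative assumptions, not a method taken from the cited works.

```python
from statistics import median


def calibrate_threshold(hit_rtts_ms, miss_rtts_ms):
    """Pick a decision threshold halfway between the median latency of
    known cache hits and known cache misses (assumed calibration step)."""
    return (median(hit_rtts_ms) + median(miss_rtts_ms)) / 2.0


def is_cache_hit(probe_rtts_ms, threshold_ms):
    """Classify a probed content object as cached at the shared router
    if its median retrieval latency falls below the threshold."""
    return median(probe_rtts_ms) < threshold_ms


# Hypothetical reference measurements, in milliseconds.
threshold = calibrate_threshold([2.0, 2.2, 1.9], [20.5, 19.8, 21.0])

print(is_cache_hit([2.1, 2.4, 2.0], threshold))     # True: likely cached nearby
print(is_cache_hit([19.0, 22.0, 20.0], threshold))  # False: likely fetched upstream
```

Using the median of several probes makes the decision less sensitive to a single delayed packet. Note that probing itself may cause the content to be cached, so repeated probes of the same name would need care in practice.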

3. PRIVACY IN ICN

In this section, we explore privacy risks in information-centric networks and the proposed mitigation mechanisms. At the end of this section, we present the open challenges and some possible directions for addressing them. Attacks against privacy in ICN may target the caching routers, cached contents, content names, content signatures, as well as client privacy. These privacy concerns are inherent to all architectures. Additionally, there are a few attacks that are possible due to the inherent design choices of specific architectures, which we will discuss separately. We will highlight the vulnerable design choices and discuss their advantages and disadvantages. Fig. 6 presents a classification of privacy attacks in ICNs, along with the related work aimed at their mitigation. Before discussing the privacy state of the art by category, we mention one work that covers most of the categories and hence does not fall into any specific category. Fotiou et al. reviewed the proposed ICN architectures and discussed the requirements and design choices for secure content naming, advertisement, lookup, and forwarding in [47]. The authors classified each privacy threat as either a monitoring, decisional interference, or invasion attack. The decisional

Acs et al. [13] investigated cache privacy in named-data networks in the presence of timing and cache probing attackers. The authors confirmed the effectiveness of these attacks in different network topologies, and demonstrated attack feasibility even in scenarios where the attacker and the victim are three hops away from the shared router (success rate of 59%). They discussed two traffic classes: interactive traffic and content distribution traffic. For interactive content, the authors proposed the addition of a random number to the content name, to be agreed upon by the requester and the content provider. This prevents the attacker from successfully probing the cache for this content, because the majority of ICN architectures employ exact matching of the content name suffix. On the other hand, another client requesting the same content does not have its request satisfied even if a cached copy of the content exists, which undermines the efficiency gained from caching. As an alternative solution, the authors suggested that the requester and producer mark privacy-sensitive interests and content as private. The intermediate routers then refrain from caching the marked content, thereby enhancing the clients’ privacy. The authors also suggested the emulation of a cache miss at a router, with the router applying a random delay

interference attack either prevents a consumer from accessing certain content, prevents the content advertisement and forwarding of a specific provider, or allows content filtering based on content name. In the invasion attack, an attacker tries to acquire sensitive information from the target. The authors also analyzed the identified threats and ranked them according to the DREAD model [65], and briefly reviewed ongoing research on privacy concerns in information-centric networking. In the ICN paradigm, if we assume that the payload is encrypted, then the information in a chunk does not identify the user. However, other forms of traffic analysis can be performed. Based on these mechanisms and the knowledge available to the attacker, several other attacks can be orchestrated. We categorize these attacks into timing attack, communication monitoring attack, anonymity, protocol attack, and naming-signature privacy, and discuss them in what follows.

A. Timing Attack

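[Figure 6 diagram: classification of privacy attacks into Timing Attack [13], [32], [37], [89], [90]; Communication Monitoring Attack; Anonymity; Protocol Attack; and Naming-Signature Privacy [28], [32], [71], [84], [85], [102]. The diagram also lists the citation groups [25], [36], [44], [46], [52], [104], [106] and [32], [74], [75] (shown twice) for the remaining categories.]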

Fig. 6: Privacy Risks and their Countermeasures.

TABLE VI: Summary of Timing Attack Mitigations

Mechanism               Approach                          Mitigating Entity
Acs et al. [13]         Delay for the first k interests   Edge routers
Chaabane et al. [32]    Delay for the first k interests   Edge routers
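The "delay for the first k interests" approach listed in Table VI can be sketched as follows, under the assumption that an edge router emulates a cache miss for the first k requests it sees for a cached object and serves later requests without added delay. The class name `DelayFirstK`, the fixed emulated delay, and the per-name counter are hypothetical simplifications; the cited schemes may choose delays and counters differently.

```python
class DelayFirstK:
    """Toy edge-router policy: emulate a cache miss for the first k
    interests per content name (hypothetical simplification)."""

    def __init__(self, k, miss_delay_ms):
        self.k = k                          # number of interests to delay
        self.miss_delay_ms = miss_delay_ms  # assumed latency of a real miss
        self.seen = {}                      # content name -> interests seen so far

    def response_delay_ms(self, name):
        """Return the artificial delay to add before answering from cache."""
        count = self.seen.get(name, 0)
        self.seen[name] = count + 1
        # Only the first k interests are slowed down to look like misses.
        return self.miss_delay_ms if count < self.k else 0.0


router = DelayFirstK(k=2, miss_delay_ms=20.0)
delays = [router.response_delay_ms("/private/doc") for _ in range(4)]
print(delays)  # [20.0, 20.0, 0.0, 0.0]
```

The trade-off visible in the sketch is that the first k legitimate requesters also pay the emulated miss latency, so k balances privacy against caching benefit.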

[Figure diagram residue: network topology with the labels Content router, Router, Client, C1, and TC1.]