Anonymous Communications in Computer Networks


Category: IT Security & Ethics

Marga Nácher, Carlos Tavares Calafate, Juan-Carlos Cano, and Pietro Manzoni
Technical University of Valencia, Spain

INTRODUCTION

In our daily life no one questions the need for privacy protection. Nevertheless, our privacy is often put at risk. The first problem is that privacy itself is a concept that is difficult to define. In many countries it has been confused with data protection, which interprets privacy in terms of the management of personal information. Nowadays, the term privacy also extends to territorial and communications protection. We will focus on the privacy of electronic communications.

When referring to this type of communication, the first aspect we think about is security. This concept is widely discussed, and nowadays we often hear about threats to and attacks on networks. Security attacks are usually split into active and passive attacks. An active attack takes place when an attacker injects or modifies traffic in the network for purposes such as denial of service or gaining unauthorized access. A passive attack, by contrast, takes place whenever the attacker merely inspects the network by listening to packets, never injecting any packet. Malicious nodes hope to remain ‘invisible’ in order to collect as much network information as possible just by using timing analysis and eavesdropping on routing information. A way to avoid this type of attack is to anonymize both data and routing traffic. In this manner we can hide the identities of communicating nodes and avoid data flow traceability.

Various scenarios can be devised where anonymity is desirable. In commercial transactions, if we think about an off-line purchase, we accept that some users prefer to use cash when buying goods and services, because anonymity makes them more comfortable with the transaction. Offering anonymity in online commerce would increase the number of transactions. Military communications are another typical example where not only privacy but also anonymity is crucial for the success of the corresponding mission. Finally, if we attend a meeting where some delicate matter is being voted on, it could be necessary for the voters' identities to remain hidden. Again, in this case, anonymity is required.

BACKGROUND

In order to talk about anonymity, we first have to establish the terminology to be used. An important work on this issue is Pfitzmann and Hansen (2000); based on this work, we can establish a classification of anonymity degrees. A node is considered exposed when its identity information is known. If the identity it presents is not its real one, the node is pseudonymous. Furthermore, if the node is unlinkable to some kind of relevant information, we achieve anonymity with respect to that information; examples are the relationship between end-to-end peers, or the peers themselves. Finally, when the communication is not perceived at all, we say that it is undetectable; and if it is undetectable for any external node and also anonymous for every participant, the communication is unobservable.

In the literature, there are various works based on different network topologies, such as the Dining Cryptographers network (Chaum, 1988) or MIXes (Chaum, 1981), that provide anonymous communications in fixed networks.

PEER ANONYMITY

In this article we will discuss the different degrees of anonymity provided by different proposals found in the literature, emphasizing those issues that are still unsolved.

Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.


General Approaches

Anonymity has been treated differently depending on the network characteristics and goals. We believe that the two most relevant generic proposals are the Dining Cryptographers network and the MIX network.

Dining Cryptographers Network

The Dining Cryptographers network (DC-net) (Chaum, 1988) achieves sender anonymity in the following way: some pairs of participants share a secret bit. Each participant calculates the sum modulo 2 of all the bits that he shares and, if he wants to transmit, inverts that result. All the nodes then announce their result (the sum, or its inverse if they are transmitting). If no one (or an even number of participants) transmits, the sum modulo 2 of all these announcements is zero; if one participant (or an odd number of them) transmits, the sum is one. Each pair of participants can share a key of n bits, using one bit per round, so that the ith bit of each key is used in the ith round.

However, this approach is restricted to small networks, since only one node can transmit in each round. In large networks the probability that more than one node wishes to transmit in a given round increases, and the resulting collisions render this mechanism impractical. Furthermore, the anonymous bandwidth of a DC-net is limited by the slowest participant. Overall, DC-nets provide strong anonymity elegantly, but suffer from efficiency and scalability problems. Herbivore (Goel, Robson, Polte, & Sirer, 2003) is a protocol based on DC-nets that tries to solve the scalability problem by splitting the network into sub-groups (called cliques), but it requires global topology control. In terms of efficiency, its results are actually quite poor.
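A single DC-net round as described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name `dc_net_round` and its parameters are hypothetical, and every pair of participants is assumed to share a bit (the scheme itself only requires that some pairs do).

```python
import secrets
from itertools import combinations


def dc_net_round(n_participants, senders=()):
    """Simulate one DC-net round; return (announcements, combined_bit)."""
    # Assumption: every pair of participants shares one secret bit per round.
    shared = {pair: secrets.randbits(1)
              for pair in combinations(range(n_participants), 2)}

    announcements = []
    for i in range(n_participants):
        # Sum modulo 2 (XOR) of all the bits that participant i shares.
        bit = 0
        for pair, key_bit in shared.items():
            if i in pair:
                bit ^= key_bit
        # A participant who wants to transmit inverts the result.
        if i in senders:
            bit ^= 1
        announcements.append(bit)

    # Each shared bit appears in exactly two announcements and cancels out,
    # so the combined bit is 1 only when an odd number of nodes transmitted.
    combined = 0
    for bit in announcements:
        combined ^= bit
    return announcements, combined
```

Calling `dc_net_round(5, senders={2})` yields a combined bit of 1 without the announcements revealing which participant inverted its sum, while two simultaneous senders cancel each other out, which is the collision problem mentioned above.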

MIXes Network

In 1981, Chaum proposed the use of MIXes to anonymize electronic mail users and messages. The main goal of a single MIX is to hide the correlation between incoming and outgoing messages within a large group of messages by delaying or reordering them. In order to do this, encryption and padding mechanisms are applied. There are several MIX variants:

• A pool MIX only sends part of the incoming messages, keeping the other part for later rounds. Hence, it uses the reordering technique. It is called a “timed MIX” if the event that triggers the flushing is the expiration of a timeout. In cases where the trigger is the arrival of a message, the MIX is referred to as a “threshold MIX.”
• A stop-and-go MIX (or continuous MIX) delays messages according to an exponential distribution, which does not depend on traffic. Hence, if the number of users is low, the degree of anonymity is also low.
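The pool MIX with a message-arrival trigger can be sketched as follows. This is a minimal illustration, not a reference design: the class name `ThresholdPoolMix`, the `threshold` and `pool_size` parameters, and the rule "flush once threshold + pool_size messages are buffered" are all assumptions made for the example.

```python
import random


class ThresholdPoolMix:
    """Toy pool MIX whose flush is triggered by message arrivals."""

    def __init__(self, threshold, pool_size, rng=None):
        self.threshold = threshold  # messages sent out in each flush
        self.pool_size = pool_size  # messages kept back for later rounds
        self.rng = rng if rng is not None else random.Random()
        self.buffer = []

    def receive(self, message):
        """Accept one message; return the flushed batch (possibly empty)."""
        self.buffer.append(message)
        if len(self.buffer) < self.threshold + self.pool_size:
            return []
        # Reorder the buffer, then keep `pool_size` messages in the pool
        # and send the rest, so arrival order and output order differ.
        self.rng.shuffle(self.buffer)
        flushed = self.buffer[self.pool_size:]
        self.buffer = self.buffer[:self.pool_size]
        return flushed
```

A timed variant would replace the arrival check with a timeout, and a stop-and-go MIX would instead hold each message for an independent, exponentially distributed delay.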

A MIX network can consist of a set of predefined routes, called cascades, or be a free route network, where routes are selected by users. Berthold, Pfitzmann, and Standtke (2001) establish that this latter type of network is flexible, scalable, and extendable. However, it is less secure due to the intersection of different sender/recipient anonymity groups, making it easier to reveal participants’ identities. Danezis (2003) provides an intermediate solution: each MIX can only choose routes included in a predetermined graph. The conclusion is that this network is more scalable than a cascade and more resistant to intersection attacks than the free route alternative. In general we can say that MIX-nets always provide relationship anonymity, and sometimes also recipient anonymity if the recipient's identity remains hidden. However, sender anonymity is more difficult to achieve.

Figure 1. Example of DC-net

Based on Chaum’s MIXes, Tarzan (Freedman & Morris, 2002) provides both sender and recipient anonymity, and therefore relationship anonymity, by using a restricted topology for packet routing. Packets can be routed only between special IP tunnels. Anonymity is intended to be transparent to both client applications and servers. Tarzan operates at the IP layer and relies on layered encryption: each leg of the tunnel removes or adds a layer of encryption. The tunnels are static, and any relay failure requires the formation of a new tunnel, thus increasing both delay and computation overheads.

Instead of single-node MIXes, Cashmere (Zhuang, Zhou, Zhao, & Rowstron, 2005) selects regions (relay groups) as MIXes, providing sender and relationship anonymity. Any node in a region can act as a MIX, which reduces the probability of a MIX failure. Each group requires a public/private key pair, which is generated and distributed using an off-line certificate authority (CA). The source randomly selects and orders the relay groups to conceal the destination relay group. It then encrypts the forwarding path in multiple layers using the public keys associated with each relay group. An intermediate node decrypts the message received, forwards it to the next relay group, and broadcasts the decrypted contents to all other members of its own group. The bandwidth cost is higher for this solution than for a node-based relay approach.
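The layered encryption that Tarzan and Cashmere rely on can be illustrated with a toy example in which the sender adds one layer per relay and each relay removes exactly one. Everything here is an assumption made for illustration: the helper names are hypothetical, and the SHA-256-based XOR keystream is a stand-in that is neither secure nor the cipher used by either system. Because XOR layers commute, this toy also cannot show that real layers must be removed in path order.

```python
import hashlib


def keystream_xor(key, data):
    """Toy stream cipher: XOR data with a SHA-256-derived keystream.

    Illustration only; not secure and not the cipher of any real system.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(payload, relay_keys):
    """Sender side: add one encryption layer per relay on the path."""
    for key in reversed(relay_keys):
        payload = keystream_xor(key, payload)
    return payload


def relay_unwrap(packet, key):
    """Relay side: remove exactly one layer using the relay's own key."""
    return keystream_xor(key, packet)
```

With three hypothetical relay keys, `wrap(b"hello", keys)` followed by one `relay_unwrap` call per relay returns the original payload, and no single relay sees both the sender's plaintext and the full path.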

Other Protocols

In addition to the solutions already discussed, proposals such as Crowds, Hordes, P5, Tor, and HIP are also relevant in this field of research.

With Crowds (Reiter & Rubin, 1998), the authors provide sender anonymity following this strategy: the source node does not choose the path to be used; instead, it sends the message directly to the Internet with probability p, or forwards the message to another randomly selected user with probability 1-p. The remaining nodes in the network behave in the same way. Therefore, the initiator is indistinguishable from a member that simply forwards a request from another. Once a path is chosen, it remains static for that source-destination pair until the server sends a special message that forces all the established paths to change in order to avoid certain types of attacks. Connections between users are encrypted with the keys distributed by the server.

Shields and Levine (2000) describe the Hordes protocol, which uses a similar strategy to anonymously send a packet from the initiator to the responder. However, it uses multicast communications in the reverse path to reduce the amount of work required from participants and to improve data delivery latency and link utilization. This proposal assumes that shortest path multicast routing trees are available.

The Peer-to-Peer Personal Privacy Protocol (P5) (Sherwood, Bhattacharjee, & Srinivasan, 2002) is another protocol targeting anonymous communications that provides sender and recipient anonymity. It is designed to be implemented on top of the current Internet protocols without any special infrastructure support. Since it is based on broadcast transmissions, it creates a broadcast hierarchy to avoid scalability problems. This solution has a cost in terms of efficiency; moreover, in mobile environments it would require group management algorithms in order to keep the hierarchy updated. Every node acts as a MIX, scrambling received packets before forwarding them and using hop-by-hop encryption. It also maintains a fixed communication rate, sending signal or noise packets as necessary.

In Dingledine, Mathewson, and Syverson (2004), Tor is proposed as the second-generation onion router. It works on the Internet, and is designed to make TCP-based applications (like Web browsing or instant messaging) anonymous. Clients choose a path through the network and build a circuit, in which each node knows its predecessor and successor, but no other nodes along the path. The length of packets is fixed, and each node unwraps them using a symmetric key.
Each onion router maintains a long-term identity key to sign certificates, directories, and the router’s descriptor, along with a short-term onion key to decrypt requests from the users to set up a circuit and to negotiate ephemeral keys.

The Host Identity Protocol (HIP) (Moskowitz, Nikander, Jokela, & Henderson, 2007) aims to separate the identifier and locator roles of IP addresses. The base HIP protocol (“base exchange”) is used between hosts to establish an IP-layer communications context (called a HIP association) prior to communications. This process is based on a SIGMA-compliant Diffie-Hellman key exchange (Diffie & Hellman, 1976) with public key identifiers for mutual peer authentication. The public key of an asymmetric key pair is used as the identifier, named the host identifier (HI). However, a hashed encoding of the HI, usually referred to as the host identity tag (HIT), represents the host identity in protocols due to its short and fixed length (128 bits) and the following three properties: (1) it has the same length as an IPv6 address, and so can be used in address-sized fields in APIs and protocols; (2) it is self-certifying (i.e., given a HIT, it is computationally hard to find a host identity key that matches the HIT); and (3) the probability of HIT collision between two hosts is very low.


Indeed, these properties provide the following advantages. First, the fixed length of the HIT simplifies protocol coding and reduces the cost of this technology in terms of packet size. Second, the HIT presents a consistent format to the protocol irrespective of the underlying identity technology used.
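The idea of deriving a fixed-length, address-sized tag from a public key can be sketched as below. This is a deliberately simplified illustration: the function name `toy_hit` is hypothetical, SHA-256 is an assumed hash choice, and the actual HIT encoding defined by the HIP specification has additional structure that is not reproduced here.

```python
import hashlib
import ipaddress


def toy_hit(host_identifier):
    """Hash a public key (the HI) down to a 128-bit, IPv6-sized tag.

    Simplified sketch: only the "hash, then truncate to 128 bits" idea
    is shown, not the real HIT format.
    """
    digest = hashlib.sha256(host_identifier).digest()
    # 16 bytes = 128 bits, the same length as an IPv6 address.
    return ipaddress.IPv6Address(digest[:16])
```

Because the tag has the length of an IPv6 address, it can be stored and printed with ordinary IPv6 tooling, which is the first of the three properties listed above.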

Undetectability and Unobservability

The highest degrees of anonymity are undetectability and unobservability. Referring again to the definitions presented in Pfitzmann and Hansen (2000), and according to the authors: “A mechanism to achieve some kind of anonymity appropriately combined with dummy traffic yields the corresponding kind of unobservability.” So we could say, for example, that:

DC-nets + dummy traffic = sender unobservability
MIX-nets + dummy traffic = relationship unobservability

The other two most popular mechanisms used to provide undetectability are steganography and spread spectrum techniques. The use of power control can also help to reduce the probability of being heard, since the attacker must be located very close to the transmitter node. If directional antennas are used, the attacker not only has to stay close, but also inside the sector to which the antenna is directed.

General approaches to achieve undetectability and unobservability can be found in the literature. In this section we briefly explain some of them that are based on two popular techniques: steganography and dummy traffic.

We focus first on the studies related to steganography. The typical approach is based on images, where it is easy to introduce hidden information due to the redundancy that this type of message presents; however, compression may eliminate that redundancy and, with it, all the hidden information. Sender unobservability is provided in Heydt-Benjamin, Serjantov, and Defend (2006) using steganography in a high-latency MIX network, achieving the same sender/receiver unlinkability as other MIX networks while improving sender unobservability. In it, users steganographically embed messages in images, which they then post to the most popular Usenet newsgroups. The majority of images in Usenet will not contain stegotext and will serve as cover traffic.
The scheme presented in Bo, Jia-zhen, and De-yun (2007) uses the identification field of the IPv4 header to conceal data. According to this proposal, the first eight bits of that field contain data and the next eight bits the order of the packet. Due to the small amount of information transmitted in each packet, several packets are required to send the whole message. In order to give this field a random appearance, a fourth-order Chebyshev chaotic system is used to generate a sequence from a given initial value. This chaotic sequence is then converted to a binary sequence. Also, a key is shared between source and destination, and the message is encrypted with this key. Afterwards, the k encrypted blocks are included in k packets and sent to the destination. To ensure the success of the proposal, packet fragmentation along the path must be avoided. To achieve this, path maximum transmission unit discovery (PMTUD) is enabled. Obviously, the scheme is only practical for very short messages.

Ahsan and Kundur (2002) use two strategies to hide information: IPv4 header manipulation (fragment bit and identification field) and packet sorting using the sequence number fields in IPSec (AH or ESP headers). In the former case they propose the use of the second bit of the Flags field, the DF (Do not Fragment) bit. In order to send secret information using this bit, both the source and destination have to know the MTU of their network and always build packets that are smaller than this MTU to avoid packet fragmentation (which would corrupt the DF bit). The identification field is another header field that can be manipulated to hide data. Ahsan and Kundur (2002) use it through chaotic mixing (toral automorphism systems) to provide a random appearance to the field. The only limitation is that the identification field must be unique for a specific source-destination pair as long as the datagram is alive. With regard to packet sorting, the authors consider that the order of the packets sent is of no concern. So, if there are n packets to send, there are n! ways of ordering them, and the selection of one ordering can be interpreted as log2(n!) concealed bits. Estimation of the hidden information is done using a look-up table to match the stored sequence to the corresponding sequence of packets received, and mapping this sequence to the hidden information. The transmission process is modeled as a non-ideal channel characterized by the position error, which is imposed by the network’s behavior, since the network cannot guarantee in-order packet delivery.

With regard to dummy traffic, Diaz and Preneel (2004a) introduced this topic and pointed out the need to establish an appropriate amount of dummy packets depending on the cost of inserting them. Another question is whether the number of dummy packets should depend on the amount of real traffic. The frequency of generation and the most appropriate route length for these dummy messages are still open questions. Research presented in Diaz and Preneel (2004b, 2004c) focuses on MIX networks by trying to determine the best strategy for the insertion of dummy traffic. Such traffic is inserted and removed by MIXes, not users, and two different ways of inserting it are proposed: into the output link at the time of flushing or into the pool of the MIX. In both cases the generation of dummy messages follows a probability distribution independent of the traffic of real messages. With regard to the type of MIXes, the authors compare deterministic and binomial MIXes using random or deterministic dummy policies. They conclude that binomial MIXes, together with a random dummy policy, provide the greatest level of anonymity. Likewise, in terms of anonymity, it is better to insert dummy traffic at the output link, although latency also increases.
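The packet-sorting channel of Ahsan and Kundur (2002) described earlier in this section hides up to log2(n!) bits in the order of n packets. The sketch below illustrates that capacity argument. Note the substitution: instead of the look-up table mentioned in the text, it uses the factorial number system to map integers to orderings, and it assumes an ideal channel with no position errors. All function names are hypothetical.

```python
import math


def encode_order(value, n):
    """Map an integer in [0, n!) to an ordering of packets 0..n-1."""
    if not 0 <= value < math.factorial(n):
        raise ValueError("value does not fit in one ordering of n packets")
    remaining = list(range(n))
    order = []
    for i in range(n, 0, -1):
        # Each digit of the factorial number system picks one remaining packet.
        index, value = divmod(value, math.factorial(i - 1))
        order.append(remaining.pop(index))
    return order


def decode_order(order):
    """Recover the integer from a received ordering (no reordering errors)."""
    remaining = sorted(order)
    value = 0
    for i, packet in enumerate(order):
        index = remaining.index(packet)
        value += index * math.factorial(len(order) - 1 - i)
        remaining.pop(index)
    return value


def capacity_bits(n):
    """Bits that one ordering of n packets can carry: log2(n!)."""
    return math.log2(math.factorial(n))
```

For example, `encode_order(5, 3)` gives the ordering `[2, 1, 0]`, which `decode_order` maps back to 5. Any reordering introduced by the network would corrupt the decoded value, which is why the position error matters in practice.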

FUTURE TRENDS

In recent times, wireless networks have become important in our daily communications, and so it is necessary to provide them with security mechanisms, and with anonymity mechanisms in particular. The proposals for wired networks are not suitable for wireless and mobile communications. Hence, most recent works on anonymous communications have been specifically designed for mobile ad hoc networks (MANETs) (Kong & Hong, 2003; Zhang, Liu, Lou, & Fang, 2006). However, their performance needs to improve if we want those protocols to be useful and practical.

CONCLUSION

In this article we have analyzed a wide variety of proposals for anonymous communications in wired networks. We believe that the field of anonymous communications is still open to improvements, and no author has yet integrated the different anonymity mechanisms described throughout this article in a consistent and efficient manner. Performance issues also remain largely untackled, and they require more scrutiny before actual deployment can take place.

REFERENCES

Ahsan, K., & Kundur, D. (2002). Practical data hiding in TCP/IP. Proceedings of the Workshop on Multimedia Security at ACM Multimedia.

Berthold, O., Pfitzmann, A., & Standtke, R. (2001). The disadvantages of free MIX routes and how to overcome them. Proceedings of the International Workshop on Designing Privacy Enhancing Technologies: Design Issues in Anonymity and Unobservability. New York: Springer-Verlag.

Bo, X., Jia-zhen, W., & De-yun, P. (2007). Practical protocol steganography: Hiding data in IP header. Proceedings of the 1st Asia International Conference on Modeling and Simulation.

Chaum, D. (1981). Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 24(2).

Chaum, D. (1988). The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology, 1(1), 65-75.

Danezis, G. (2003). MIX-networks with restricted routes. Proceedings of the Privacy Enhancing Technologies Workshop (PET 2003). Berlin: Springer-Verlag (LNCS 2760).

Diaz, C., & Preneel, B. (2004a). Anonymous communication. In WHOLES: A multiple view of individual privacy in a networked world.

Diaz, C., & Preneel, B. (2004b). Reasoning about the anonymity provided by pool MIXes that generate dummy traffic. Proceedings of the Conference on Information Hiding (IH’04). Berlin: Springer-Verlag (LNCS 3200).

Diaz, C., & Preneel, B. (2004c). Taxonomy of MIXes and dummy traffic. Proceedings of the Conference on Information Security Management, Education and Privacy (INetSec’04) (vol. 3, pp. 215-230).

Diffie, W., & Hellman, M.E. (1976). New directions in cryptography. IEEE Transactions on Information Theory, 22(6), 644-654.

Dingledine, R., Mathewson, N., & Syverson, P. (2004). Tor: The second-generation onion router. Proceedings of the 13th USENIX Security Symposium.

Freedman, M.J., & Morris, R. (2002). Tarzan: A peer-to-peer anonymizing network layer. Proceedings of the 9th ACM Conference on Computer and Communications Security.

Goel, S., Robson, M., Polte, M., & Sirer, E.G. (2003). Herbivore: A scalable and efficient protocol for anonymous communication. Technical Report 2003-1890, Cornell University, USA.

Heydt-Benjamin, T.S., Serjantov, A., & Defend, B. (2006). Nonesuch: A MIX network with sender unobservability. Proceedings of the 5th ACM Workshop on Privacy in Electronic Society.

Kong, J., & Hong, X. (2003). ANODR: Anonymous on demand routing with untraceable routes for mobile ad-hoc networks. Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc Networking & Computing (MobiHoc’03) (pp. 291-302), New York.

Moskowitz, R., Nikander, P., Jokela, P., & Henderson, T. (2007). HIP: Host identity protocol. Retrieved from http://www.ietf.org/internet-drafts/draft-ietf-hip-base-09.txt

Pfitzmann, A., & Hansen, M. (2000). Anonymity, unobservability, and pseudonymity: A proposal for terminology. In H. Federrath (Ed.), Workshop on design issues in anonymity and unobservability (pp. 1-9). Berlin: Springer-Verlag (LNCS 2009).

Reiter, M., & Rubin, A. (1998). Crowds: Anonymity for Web transactions. ACM Transactions on Information and System Security, 1(1).

Sherwood, R., Bhattacharjee, B., & Srinivasan, A. (2002). P5: A protocol for scalable anonymous communication. Proceedings of the IEEE Symposium on Security and Privacy.

Shields, C., & Levine, B.N. (2000). A protocol for anonymous communication over the internet. Proceedings of the ACM Conference on Computer and Communications Security.

Zhang, Y., Liu, W., Lou, W., & Fang, Y. (2006). MASK: Anonymous on-demand routing in mobile ad hoc networks. IEEE Transactions on Wireless Communications, 21, 2376-2385.

Zhuang, L., Zhou, F., Zhao, B.Y., & Rowstron, A. (2005). Cashmere: Resilient anonymous routing. Proceedings of the 2nd Symposium on Networked Systems Design and Implementation (NSDI), Boston.

KEY TERMS

Anonymity: State of being not identifiable among other items belonging to a set. This set is called the anonymity set.

Dummy Traffic: Randomly generated packets injected into the network to make the perception of real traffic difficult.

Item of the System: Any participating subject, object, or action: node, user, message, sending, and so forth.

Spread-Spectrum Techniques: Methods by which energy generated with a certain bandwidth is deliberately spread in the frequency domain, resulting in a signal with a wider bandwidth.

Steganography: The art and science of writing hidden messages in such a way that no one apart from the sender and the intended recipient even realizes that there is a hidden message.

Undetectability: Incapability of observing an established communication. Thus, undetectability prevents third parties from observing when a packet is being sent through the network.

Unlinkability: Incapability of stating the relation between two observed items of the system. For example, recipient unlinkability ensures that the sending of a packet and the corresponding recipient cannot be linked by others.

Unobservability: Undetectability by external attackers plus anonymity for internal attackers.

Untraceability: Property of maintaining routes unknown to either external or internal attackers.


