Detecting Neighbor Discovery Protocol Flooding

Detecting Neighbor Discovery Protocol Flooding Attack using Machine Learning Techniques

Firas Najjar1, Mohammad Kadum1,2, Homam El-Taj3

1. National Advanced IPv6 Center (NAv6), Universiti Sains Malaysia (USM), Pulau Pinang, Malaysia 11800, {Firas,Kadum}@nav6.usm.my
2. Telecommunications Research Lab, School of Computing, Queen’s University, Kingston, ON, Canada K7L 3N6, [email protected]
3. Community College, Tabuk University, [email protected]

Abstract. With the wide deployment of open networks, securing the Neighbor Discovery Protocol (NDP) has become critical: NDP lacks authentication, which makes it vulnerable to flooding attacks. Most existing solutions for securing NDP violate its design principles in terms of overhead and complexity; others suffer from a high rate of false positive alerts, which undermines trust in them. Machine learning has high potential for securing NDP because such attacks can be detected without relying on an attack signature; instead, broader definitions of attack attributes can be learned. This paper investigates the use of machine learning mechanisms for detecting NDP flooding attacks. The experimental results show that the C4.5 machine learning technique accurately detects NDP flooding attacks.

1 Introduction

The huge growth in the number of Internet users has exhausted most of the existing Internet Protocol version 4 (IPv4) addresses [1]. To overcome this issue, the Internet Assigned Numbers Authority (IANA) [2] started to allocate IP addresses using Internet Protocol version 6 (IPv6) [3], which provides a massive number of IP addresses. Although IPv6 was built with security in mind as the successor of IPv4, it inherited security weaknesses from IPv4: the Neighbor Discovery Protocol (NDP) [4], the main supporting protocol for IPv6, has no authentication or registration mechanism and is therefore exposed to attacks.

IPv6 networks allow any connected node (router or host) to configure its IP address and start communicating with other nodes without any registration or authentication. Moreover, the receiver must respond. An attacker can therefore easily flood the network with fake NDP messages, and most hosts must blindly accept and process all of them. This exhausts the hosts' system resources and may eventually freeze them, at which point a reboot is required to clear the memory of thousands of fake addresses.

Different approaches have been proposed to prevent and monitor NDP misuse, spoofing, and denial of service (DoS) attacks; however, most of them violate the design principles of NDP in terms of overhead and complexity, or suffer from a high rate of false positive alarms. Proposed solutions must therefore preserve the original design of NDP without any modification while improving its security.

Intrusion Detection and Prevention Systems (IDPSs) are among the latest technologies for monitoring and preventing cyber threats, and have become an essential component of computer security. The role of an IDPS is to detect any sign of a possible incident that violates the system policies, warn the system administrator, and try to stop or slow down the detected violation. Moreover, an IDPS combines rule-based algorithms with learning algorithms that recognize complex patterns, helping it make intelligent decisions or predictions when it faces new or previously unseen network behavior [5]. Machine learning algorithms in an IDPS are used to recognize valid, novel, potentially useful, and meaningful network behavior using non-trivial mechanisms; they have therefore become an effective approach for detecting both novel and known attacks.

The rest of this paper is organized as follows. Section 2 presents the background of NDP and IDPS. Section 3 describes the related work. Section 4 identifies common machine learning techniques. Section 5 describes the testing and evaluation of the machine learning techniques. Finally, the conclusion is covered in Section 6.

2 Background

This section provides background on NDP and IDS technologies.

2.1 NDP

NDP provides a stateless mechanism that allows connected nodes to configure their IP addresses, configure their gateway, and start communicating with neighbor nodes without any authentication or registration inside the local site [6]. Consequently, an attacker can claim to be any node inside the network and launch various attacks. Although the original NDP specification includes IPsec [7] to secure NDP messages, it gives no instructions on how to use IPsec or how keys are to be exchanged automatically. Security associations must therefore be configured manually, which makes IPsec impractical for most purposes [8].

IPv6 NDP allows a node to identify its neighbors on the same LAN and to advertise its existence to them. To perform its functions, NDP uses the following ICMPv6 [9] messages:

• Router Advertisement (RA) messages are originated by routers and sent periodically, or sent in response to a Router Solicitation. Routers use RAs to advertise their presence and to send specific parameters such as the MTU, the router prefix, the lifetime of each prefix, and the hop limit.
• Router Solicitation (RS) messages are originated by hosts at system startup to request that a router send an RA immediately.
• Neighbor Solicitation (NS) messages are originated by hosts that attempt to discover the link-layer addresses of other nodes on the same local link, during Duplicate Address Detection (DAD), or to verify the reachability of a neighbor.
• Neighbor Advertisement (NA) messages are sent to advertise changes of the host MAC address and IP address, or as a solicited response to NS messages.
• Redirect messages are used to redirect traffic from one router to another.

The absence of NDP authentication gives attackers the opportunity to easily flood the IPv6 network with fake NDP messages. NDP flooding attacks are categorized as follows:

• RA Flooding attack sends a huge number of RA messages to a specific host or to the all-nodes multicast address (FF02::1). Most hosts blindly accept and process all RA messages, which exhausts their system resources and may freeze them; eventually a reboot is required to clear the memory of thousands of fake addresses.
• RS Flooding attack sends a huge number of RS messages targeting routers inside the local area network (LAN). It sends these messages to the all-routers multicast group (FF02::2), keeping the routers busy answering RS messages and consequently preventing them from completing other requests.
• NA Flooding attack floods the network with a huge number of NA messages in an attempt to exhaust the kernel memory used by the neighbor caches of nodes; systems that do not enforce limitation policies on the neighbor cache end with a kernel panic.
• NS Flooding attack floods the IPv6 network with a huge number of NS messages, causing target nodes to remove saved entries from their Neighbor Cache, and tries to poison the target's Neighbor Cache by sending Neighbor Solicitations.
• Redirect Flooding attack sends a large amount of redirected data to an existing node, which exhausts the node's resources and leads to a DoS attack.
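The five NDP message types listed above can be told apart by the ICMPv6 type field, which takes the values 133 to 137 for NDP. The following minimal sketch counts messages per type from a list of observed ICMPv6 type numbers; the function name `count_ndp_messages` and the input format are assumptions made for illustration, not part of the paper's tooling.

```python
from collections import Counter

# ICMPv6 type numbers used by NDP messages.
NDP_TYPES = {133: "RS", 134: "RA", 135: "NS", 136: "NA", 137: "Redirect"}


def count_ndp_messages(icmpv6_types):
    """Count NDP messages by name, ignoring non-NDP ICMPv6 types.

    icmpv6_types: iterable of integer ICMPv6 type values (hypothetical input).
    """
    counts = Counter({name: 0 for name in NDP_TYPES.values()})
    for icmp_type in icmpv6_types:
        name = NDP_TYPES.get(icmp_type)
        if name is not None:
            counts[name] += 1
    return dict(counts)


# Two RAs, one NS, and one echo request (type 128, not NDP).
print(count_ndp_messages([134, 134, 135, 128]))
```

A sudden jump in one of these counters relative to the others is the kind of signal the flooding categories above would produce.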

2.2 IDPS

In the past, intrusions were detected manually by reading and analyzing all system logs in search of anomalies or attacks. This process takes a lot of time and effort and requires specially trained employees; detection should therefore be automated [10]. However, not every detected anomaly can be treated as a threat, even though attacks exhibit characteristics that differ from normal traffic [11]. On the other hand, anomalies have the potential to translate into significant, critical, and actionable information [12]. Anomaly-based systems attempt to map events until they learn what is normal, and then detect unusual occurrences that indicate an intrusion.

The National Institute of Standards and Technology (NIST) [13] defines intrusion detection, the task of an Intrusion Detection System (IDS), as the process of monitoring the events occurring in a computer system or network and analyzing them for signs of possible incidents, which are violations or imminent threats of violation of computer security policies, acceptable use policies, or standard security practices.

Moreover, NIST defines intrusion prevention, the task of an Intrusion Prevention System (IPS), as the process of performing intrusion detection and attempting to stop detected possible incidents. An IPS is thus an active IDS that performs all IDS activities and also tries to stop intrusions. In this paper, the abbreviation IDPS is used interchangeably with IDS and IPS. The methodologies used by IDPSs can be categorized into:

• Misuse-based (also called signature-based) detection searches for predefined patterns, or signatures, within the captured data. This methodology is very effective in detecting predefined threats and produces a small number of false positive alerts, but it cannot detect new or undefined threats.
• Anomaly-based detection defines the normal behavior of the target system and generates an anomaly alert whenever a deviation is detected. Anomaly-based IDPSs usually suffer from a high number of false positive alerts.
• Strict Anomaly Detection was first introduced by Sasha and Beetle in 2000 [14]. It is an attack detection methodology that detects any violation of the system rules; "not use" is an alternative name for it. Strict anomaly detection is best used in environments where legitimate activities are well defined [15].

2 Related Work

NDP is stateless and lacks authentication, which exposes it to attacks [16-19]. Although the original design includes IPsec within IPv6 to secure it, IPsec needs manual configuration, which limits it to small networks with known hosts [20]. Solutions proposed to overcome the lack of authentication in NDP can be divided by their main purpose: securing NDP, and monitoring NDP (IDPS solutions). The most common solutions for securing NDP change the original design of the protocol, which increases its complexity. IDPS solutions, on the other hand, monitor NDP without any modification to the original design; these approaches detect any violation of the predefined normal behavior of the protocol, alert system administrators, and try to stop intruders.

SEND [21] and Cryptographically Generated Addresses (CGAs) [22] are examples of securing NDP; these solutions are the best choice for securing IPv6 networks where IPsec is found to be impractical. However, SEND and CGAs have not been widely implemented or deployed because of their high complexity; intellectual property claims and licensing terms for CGAs are also reportedly holding back some vendors [20]. As another example, [23] proposed the use of digital signatures to secure the IPv6 neighbor discovery protocol, which is less complex than CGA; however, the proposed solution cannot detect DAD and NUD attacks. [24] proposes a highly randomized technique for address generation that safeguards a node's privacy and asserts address uniqueness on the link.

The main limitation of solutions proposed to secure NDP is the overhead and added complexity they bring to the protocol through extra functions. IDPS solutions, in contrast, do not change or increase the complexity of NDP; their main role is to alert the system administrator of any violation of normal NDP behavior. There are two types of monitoring solutions: passive mechanisms, and active mechanisms that generate extra packets for additional analysis.

NDPmon [25] is an example of a passive mechanism. It tracks changes in MAC-IP pairings, and any change triggers an alert to the system administrator. Its main drawback is that the training phase must be free of any compromised node; otherwise the whole detection process fails. As another example, [15] built a finite state machine that models the normal behavior of NDP and uses Strict Anomaly Detection to detect any violation.

Active mechanisms [26, 27], on the other hand, use probe packets for additional observations. [28] uses Multicast Listener Discovery (MLD) probing to reduce the traffic, while [29] proposes a host-based IDPS that verifies any change made to its neighbor cache by sending NS probes. The main limitation of active techniques is the generated overhead traffic, which attackers can use to perform DoS attacks by flooding the nodes with fake MAC-IP address pairs.
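The passive idea of tracking MAC-IP pairings can be illustrated with a toy sketch. This is not NDPmon's implementation; the class `PairingMonitor` and its methods are hypothetical names, and the sketch simply remembers the first MAC address seen for each IPv6 address and flags later changes.

```python
class PairingMonitor:
    """Toy passive monitor that remembers one MAC address per IPv6 address."""

    def __init__(self):
        self.ip_to_mac = {}

    def observe(self, ip, mac):
        """Record a pairing; return an alert string if a known IP changes MAC."""
        known = self.ip_to_mac.get(ip)
        if known is None:
            # First sighting: learned as "normal" without any verification.
            self.ip_to_mac[ip] = mac
            return None
        if known != mac:
            return f"ALERT: {ip} changed from {known} to {mac}"
        return None


monitor = PairingMonitor()
monitor.observe("fe80::1", "aa:aa:aa:aa:aa:01")
print(monitor.observe("fe80::1", "bb:bb:bb:bb:bb:02"))
```

The sketch also shows the drawback noted above: whatever pairing is seen first is accepted as legitimate, so a compromised node present during learning is treated as normal.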

3 Dataset

Datasets are an essential part of evaluating and testing machine learning solutions. To use a machine learning technique to detect NDP anomalies, the NDP dataset must successfully capture the normal behavior of the protocol; failing to do so reduces the accuracy of the technique. The most common benchmark datasets lack IPv6 data flows, which makes them useless for detecting NDP flooding attacks. The dataset of [30] was therefore selected for testing and evaluation: it successfully captures the normal behavior of NDP and also captures RA and NS flooding attacks. Table 1 summarizes the captured packets in the dataset.

Table 1. Captured packets summarisation and duration NDP Dataset.

Class      Duration    Packets  RS  RA     NS     NA     RD   IP     MAC
Normal     24 hours    2991     13  460    1159   1070   289  13     7
Flood_RA   25 seconds  79771    0   45678  34093  0      0    45691  45685
Flood_NS   23 seconds  101759   0   0      89279  12373  107  89292  89286

3.1 Features selection and generation

Feature (attribute) selection is very important in learning methods: removing redundant and irrelevant features increases the detection accuracy and minimizes the time needed for training. Therefore, only NDP message features are selected and other features are ignored; for more information about feature selection, please refer to [31]. Feature extraction, on the other hand, is one of the major challenges in pattern recognition. It aims to create new, more informative, and non-redundant features in order to reduce the cost of computation and improve classifier efficiency. Fig. 1 illustrates the process of generating the new features; this process produces eight new features from the dataset:







• Duration is the time in seconds over which NDP messages, MAC addresses, and IP addresses are counted. In this paper, 3 seconds was chosen as the counting duration; however, 3 seconds is a long time for a flooding attack, so if the count for a message type exceeds its threshold, the duration becomes 0.1 second, as shown in Fig. 1.
• Number of MAC addresses is the incremental count of MAC addresses in the network. This feature is very useful for detecting spoofed MAC addresses, because some attacks violate system policies rather than NDP itself, such as the number of MAC addresses allowed on a specific port.
• Number of IPs is the incremental count of IP addresses. This feature differs from the number of MAC addresses because IPv6 permits more than one IP address for each MAC address; hence an attacker can generate fake NDP messages from different IP addresses using a legitimate MAC address. This situation can happen when physical security is enforced on switch ports.
• Number of RS messages is the number of RS messages counted within the duration; Table 2 shows the maximum permitted number of RS messages.
• Number of RA messages is the number of RA messages counted within the duration; Table 2 shows the maximum permitted number of RA messages.
• Number of NS messages is the number of NS messages counted within the duration; Table 2 shows the maximum permitted number of NS messages.
• Number of NA messages is the number of NA messages counted within the duration; Table 2 shows the maximum permitted number of NA messages.
• Number of redirect messages is the number of redirect messages counted within the duration.

Fig. 1. Process of generating new features from Dataset.

Table 2. Captured packets summarisation and duration NDP Dataset.

Message  Protocol Constant           Description
RS       3 Transmission / 8se        Maximum number of RS packets for each IP in seconds
RA       3 Transmission / 8se        Maximum number of RA packets for each subnet
NS       3 Transmission / 3seconds   Maximum number of NS packets for each IP in seconds
NA       3 Transmission / Threshold  Maximum number of NA packets for each IP in seconds
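The feature generation of Section 3.1 can be sketched as a windowed counter. This is a minimal sketch under stated assumptions, not the authors' implementation: the packet tuple layout `(timestamp, mac, ip, msg_type)` and the names `extract_features` and `_row` are hypothetical, the "incremental" MAC and IP counts are interpreted as cumulative counts of distinct addresses, and the adaptive shortening of the duration to 0.1 second is omitted for brevity.

```python
from collections import Counter

MESSAGE_TYPES = ("RS", "RA", "NS", "NA", "RD")


def _row(duration, seen_macs, seen_ips, counts):
    """Build one eight-feature row for a finished counting window."""
    row = {"duration": duration, "n_mac": len(seen_macs), "n_ip": len(seen_ips)}
    for msg in MESSAGE_TYPES:
        row[msg] = counts[msg]
    return row


def extract_features(packets, duration=3.0):
    """Turn time-sorted (timestamp, mac, ip, msg_type) tuples into feature rows.

    Assumption: MAC/IP counts are cumulative over the whole capture, and the
    window length is fixed (no switch to 0.1 s when a threshold is exceeded).
    """
    rows = []
    seen_macs, seen_ips = set(), set()
    counts = Counter()
    window_start = None
    for ts, mac, ip, msg in packets:
        if window_start is None:
            window_start = ts
        # Close every window that ended before this packet arrived.
        while ts >= window_start + duration:
            rows.append(_row(duration, seen_macs, seen_ips, counts))
            counts = Counter()
            window_start += duration
        seen_macs.add(mac)
        seen_ips.add(ip)
        counts[msg] += 1
    if window_start is not None:
        rows.append(_row(duration, seen_macs, seen_ips, counts))
    return rows


packets = [
    (0.0, "m1", "ip1", "RA"),
    (1.0, "m1", "ip2", "NS"),
    (3.5, "m2", "ip3", "NS"),
]
for row in extract_features(packets):
    print(row)
```

Each resulting row corresponds to one instance that a classifier would see; repeated normal behavior produces many identical rows.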

4 Machine Learning

This section highlights and discusses the most popular machine learning techniques, starting with the most basic and simple methods. In some attack datasets, a single attribute can do the whole job while the other attributes are redundant or irrelevant; in others, the attributes must contribute equally and independently to detect the attack. In a third scenario, an attack can be detected with a simple logical structure over a small number of attributes, which can be selected using a decision tree [32]. The aim of this paper is therefore to compare different machine learning techniques for detecting NDP flooding attacks.

ZeroR Method. ZeroR is the simplest classification method and is usually used as a benchmark for other classification methods. ZeroR depends only on the class attribute and ignores all other attributes: it constructs a frequency table of the classes and selects the most frequent class as its prediction [32].

One Rule Method. The One Rule method (OneR), or 1-level decision tree, is a simple and accurate method that classifies using a single attribute. To create the classification rule, a frequency table is constructed for each attribute against the class, and the rule with the smallest total error is chosen [33].

Naive Bayesian. Naive Bayesian is another simple technique. It uses all attributes and treats them equally; moreover, it assumes that all attributes are statistically independent. Although this assumption is not realistic for most real-world datasets, it works very well in practice [34].

Decision Trees. A decision tree uses a tree structure to build the classification model; the final model is a tree with branches (decision nodes) and leaf nodes, which hold the results. Iterative Dichotomiser 3 (ID3), by J. R. Quinlan, is the basis of decision trees; it performs a top-down, greedy search through the space of possible branches with no backtracking. ID3 uses entropy and information gain to construct a decision tree. The experiments in this paper use the C4.5 [35] algorithm, which is an extension of ID3.

Nearest Neighbor. The nearest neighbour algorithm (KNN) [36] is a statistical pattern recognition method that classifies an instance according to the k nearest objects in the dataset, using similarity or distance metrics to choose the nearest objects. KNN is an example of a lazy learning technique, in which the method does nothing until a prediction is requested. It is one of the highly accurate machine learning algorithms; it involves no learning cost and builds a new model for each test. Testing may become costly as the number of instances in the input dataset increases.

Support vector machine. The support vector machine (SVM) is one of the most robust and accurate methods among the well-known data mining algorithms, as [37] highlighted. It performs classification by finding the hyperplane that maximizes the margin between classes [38].
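The two baseline methods described in this section are simple enough to sketch directly. The following is a minimal illustration, not WEKA's implementation: the function names `zero_r` and `one_r` and the toy attribute values are hypothetical, and the sketch assumes attributes are already nominal (for example, counts discretized into "low" and "high").

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """Return the most frequent class; ZeroR predicts it for every instance."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """Choose the single attribute whose value-to-class rule has the fewest errors.

    rows: list of dicts mapping attribute name to a nominal value.
    Returns (attribute, rule, errors), where rule maps value to predicted class.
    """
    best = None
    for attr in rows[0]:
        # Frequency table of class labels for each value of this attribute.
        table = defaultdict(Counter)
        for row, label in zip(rows, labels):
            table[row[attr]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in table.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in table.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Toy, made-up instances with discretized RA and NS counts.
rows = [
    {"ra": "low", "ns": "low"},
    {"ra": "low", "ns": "low"},
    {"ra": "low", "ns": "low"},
    {"ra": "high", "ns": "high"},
    {"ra": "low", "ns": "high"},
    {"ra": "low", "ns": "high"},
]
labels = ["Normal", "Normal", "Normal", "Flood_RA", "Flood_NS", "Flood_NS"]

print(zero_r(labels))
print(one_r(rows, labels))
```

On this toy data a single attribute cannot separate all three classes, which is the situation in which methods that combine several attributes, such as decision trees, become useful.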

5 Experiments evaluation

The WEKA tool [39] was used to test and evaluate the different machine learning techniques on detecting NDP RA and NS flooding attacks. Testing and evaluation were performed on a Dell XPS with an Intel Core i7 processor and 8 GB of RAM, running the Windows 10 Pro operating system.

The testing and evaluation process goes through several phases. The first phase is preprocessing, which removes redundant instances using WEKA's RemoveDuplicates filter. The total number of instances was reduced from 1639 to 510; the 1129 removed instances were duplicates, in other words repeated behavior such as an RA sent every 3 seconds. The second phase applies the machine learning techniques described in Section 4. Default settings were chosen for most techniques; only for SVM was a linear kernel chosen, and the Canopy initialization method was chosen for the K-Means technique.

The experimental results in Table 3 show that most techniques successfully capture flooding attacks, which means that the generated features successfully distinguish the different behaviors of NDP. The benchmark technique (ZeroR) correctly classifies about 43% of the instances. This result is affected by the removal of redundant instances: without removing duplicates, its rate becomes 73%, and it would increase further in other scenarios, since normal traffic is the dominant behavior in networks.

Table 3. Comparison between machine learning techniques in detecting NDP flooding attacks.

Classifier       Time in seconds  Percent correctly classified
ZeroR            0                43.52 %
OneR             0.01             90.00 %
Naive Bayesian   0.02             100 %
C4.5             0.08             100 %
KNN              0                100 %
SVM              1.08             100 %

Confusion Matrix a b c