Fast Classification, Calibration, and Visualization of Network Attacks on Backbone Links

Hyogon Kim1, Jin-Ho Kim2, Saewoong Bahk2, and Inhye Kang3

1 Korea University
2 Seoul National University
3 University of Seoul
Abstract. This paper presents a novel approach that can simultaneously detect, classify, calibrate and visualize attack traﬃc at high speed, in real time. In particular, upon a packet arrival, this approach makes it possible to immediately determine if the packet constitutes an attack and if so, what type of attack it is. In this approach, a ﬂow is deﬁned by a 3-tuple, composed of source address, destination address, and destination port. The core idea starts from the observation that only DoS attack, hostscan and portscan appear as a regular geometric shape in the hyperspace deﬁned by the 3-tuple. Instead of employing complex pattern recognition techniques to identify the regular shapes in the hyperspace, we apply an original algorithm called RADAR that captures the ”pivoted movement” in one or more of the 3 coordinates. From the geometric perspective, such movement forms the aforementioned regular pattern along the axis of the pivoted dimension. Through real execution on a Gigabit link, we demonstrate that the algorithm is both fast and precise. Since we need only 3 to 4 memory lookups per packet to detect and classify an attack packet, while simultaneously running 2 copies of the algorithm on a Pentium-4 PC, the algorithm incurred no packet loss over 330Mbps live traﬃc. Memory requirement is also low - at most 200MB of memory suﬃces even for Gigabit pipes. Finally, the method is general enough to detect both DoS’s and scans, but the focus of the paper is on its capability to identify the latter on backbone links, in the light of recent global worm epidemics.
H.-K. Kahng and S. Goto (Eds.): ICOIN 2004, LNCS 3090, pp. 837–846, 2004. © Springer-Verlag Berlin Heidelberg 2004

Detecting attacks on backbone-speed links, let alone performing attack classification and other more involved tasks, is hard. The formidable speed prevents any algorithm that requires more than a few memory lookups and computation steps per packet from operating in-line. The traditional anomaly-based approach [1, 2] is not usable in this environment because, first, it requires traffic accumulation to characterize normal traffic and, second, it usually requires complex computation. In this paper, we discuss an approach to simultaneously detect, classify, and calibrate attack traffic at backbone speed, in real time. Better yet, it easily lends itself to the visualization of on-going attacks. To be more specific, it has the following desirable properties: a) real-time detection and classification: done in O(1) per-packet processing, immediately upon packet arrival, b) low memory requirement: less than 200MB for gigabit pipes, c) ease of calibration: attack source/victim, duration, intensity, and dimensions are identified without off-line post-mortem analysis, d) minimal false positives/negatives, e) no requirement for support from the Internet infrastructure in any form: neither protocol modification, protocol addition, nor coordination between networks/routers, f) simultaneous DoS, hostscan and portscan tracking, and finally, g) immunity from asymmetric routing.

This paper is organized as follows: Section 2 presents our real-time classification method. A novel representation of attacks, their particular signatures, and the implementation of the signature generator are discussed. In Section 3, we show the result of applying the algorithm to a backbone trace and to live network traffic on a campus backbone. The paper is concluded in Section 4. Due to space constraints, we omit the discussion of the statistical nature of the method, its analysis, and the performance evaluation of the scheme in terms of speed, memory requirement, sensitivity, estimation error, and false positive rate. Interested readers are referred to [3] for these details and related work.
Real-Time Attack Classiﬁcation
On each packet arrival, we want to judge whether the packet is (highly likely) part of an attack. If it is, we want to classify the type of attack: DoS, hostscan, or portscan. Furthermore, we want to identify the victim (DoS), the perpetrator and the scanned ports (hostscan, portscan), and the intensity of the attack. In this section, we discuss our approach to achieving these goals. First, we define a flow to be a 3-tuple < s, d, p >, composed of the source address (s), destination address (d), and destination port (p). Our novel idea starts from the observation that only DoS attacks, hostscans and portscans appear as regular geometric entities in the hyperspace defined by the 3-tuple. For instance, source-spoofed DoS packets maintain a fixed destination address, and thus appear as a straight line parallel to the s axis (if the destination port is fixed), or as a rectangle parallel to the s-p plane (if the destination port is randomly varied). Legitimate flows, on the other hand, appear as random points scattered across the hyperspace. Figure 1 shows the flows observed at 9:35 and 9:36 a.m. on December 14th, 2001 on two trans-Pacific T-3 links connecting the U.S. and a Korean Internet Exchange. The three axes are the source IP address, destination IP address, and destination port, as used in the flow definition above. (The source and the destination addresses are on a decimal scale.) Each dot in the 3-dimensional hyperspace represents a single flow (not a packet). A total of 2.22 million packets were mapped to the hyperspace in the figure, where the packets in the same flow fall on the same position. We can easily recognize the regular geometric formations, such as a large rectangle and a leaner rectangle lying parallel to the s-axis, lines parallel to the d-axis, and numerous vertical lines. These regular formations are (destination port varied) DoS attacks, hostscans, and portscans, respectively. Although far outnumbering them, legitimate flows do not form any regular shape, and are less conspicuous.

Fig. 1. Flows at around 9:35 a.m., Dec. 14th, 2001

Instead of employing complex pattern recognition techniques such as 3-dimensional edge detection, we apply an original algorithm that captures the "pivoted movement" in one or more of the 3 coordinates. This is because, from a graphical perspective, such movement forms the aforementioned regular pattern along the axis of the pivoted dimension. In a hostscan, the source IP address and the destination port are fixed, while the destination IP address pivots on them. In a portscan, the destination port pivots on the source and the destination IP addresses. In a source-spoofed DoS, the destination IP address is fixed, while either only the source IP address or both the source IP address and the destination port pivot on it. In order to detect the presence of pivoting in the traffic stream, our scheme first generates a signature for each incoming packet. The signature is simply a tuple consisting of 3 binary values: < Ks, Kd, Kp >. The coordinates in the signature correspond one-to-one to the flow coordinates. Each coordinate value in the signature tells us whether the corresponding value in the flow (to which the packet at hand belongs) was seen "recently" or not. (The degree of recentness could vary across coordinates, and we will deal with it later.) For example, suppose two flows

Arrival time   Flow                                      Flow ID
t:             < 126.96.36.199, 188.8.131.52, 90 >       1
t + 1:         < 184.108.40.206, 220.127.116.11, 80 >    2
pass through the monitor that executes our scheme. For convenience, throughout the paper we call the monitor the RADAR monitor (for Real-time Attack Detection And Report), and the algorithm that it executes, the RADAR algorithm. Unless we explicitly mention the algorithm, "RADAR" alone refers to the monitor (which includes the algorithm). RADAR remembers these two flows for a finite time duration L. For the sake of explanation, let us assume for now that the time duration is the same for every coordinate, e.g., L = 2. When a packet with source IP = 18.104.22.168, destination IP = 22.214.171.124, destination port = 90 appears at time t + 2, RADAR determines that this packet's signature is < Ks, Kd, Kp > = < 1, 0, 1 >. This is because source IP address 126.96.36.199 appeared in flow (2) and port 90, in flow (1). But 188.8.131.52 was not used in either (1) or (2) as the destination address, so Kd = 0. If L = 1, flow (1) would have been purged from RADAR by the time of the packet arrival, and the signature would be < 1, 0, 0 >. In principle, this per-packet signature determines whether the packet is part of a "pivoted movement", and if so, what type it is. Note that when pivoting occurs, the value of the pivoted coordinate changes constantly from packet to packet within the attack stream. From the perspective of the RADAR algorithm, the pivoted coordinate is viewed as persistently presenting recently unobserved values. In Fig. 2, for instance, the pivoted coordinate is the destination address, and each packet presents a new value: 184.108.40.206 → 220.127.116.11 → 18.104.22.168 → .... So RADAR will keep generating < 1, 0, 1 > signatures for a hostscan. This way, RADAR yields the signatures < 1, 0, 1 >, < 1, 1, 0 >, or < 0, 1, ∗ > rather frequently in the presence of a hostscan, portscan, or source-spoofed DoS, respectively. ('*' is a wildcard, i.e., '0' or '1'.)
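The worked example above can be replayed with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `SignatureGenerator`, `candidate_type`, and `replay` are hypothetical, the tables are plain dictionaries, the addresses are placeholder documentation-range addresses, and a value is taken to be "recent" when it was last seen no more than L time units ago.

```python
class SignatureGenerator:
    """Toy per-packet signature generator (hypothetical, not the paper's code)."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        # One table per coordinate: value -> time the value was last seen.
        self.tables = ({}, {}, {})  # source, destination, destination port

    def signature(self, now, src, dst, port):
        """Return <Ks, Kd, Kp> for a packet and register its three values."""
        bits = []
        for table, value in zip(self.tables, (src, dst, port)):
            last_seen = table.get(value)
            recent = last_seen is not None and now - last_seen <= self.lifetime
            bits.append(1 if recent else 0)
            table[value] = now  # new sighting, or refresh of an old one
        return tuple(bits)


def candidate_type(sig):
    """Map the three attack signatures named in the text to a candidate type."""
    if sig == (1, 0, 1):
        return "hostscan"
    if sig == (1, 1, 0):
        return "portscan"
    if sig[:2] == (0, 1):  # <0, 1, *>
        return "source-spoofed DoS"
    return None


def replay(lifetime):
    """Replay flows (1) and (2), then return the signature of the third packet."""
    radar = SignatureGenerator(lifetime)
    radar.signature(0, "192.0.2.1", "198.51.100.1", 90)  # flow (1) at time t
    radar.signature(1, "192.0.2.2", "198.51.100.2", 80)  # flow (2) at time t + 1
    # Packet at t + 2: source of flow (2), unseen destination, port of flow (1).
    return radar.signature(2, "192.0.2.2", "203.0.113.9", 90)


print(replay(2), candidate_type(replay(2)))  # (1, 0, 1) hostscan
print(replay(1), candidate_type(replay(1)))  # (1, 0, 0) None
```

With L = 1 the port sighting from flow (1) has aged out, so the third coordinate drops to 0, as in the text. A matching signature only marks a packet as a candidate; a single packet does not establish an attack.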
These signatures are what we call attack signatures, and the corresponding flow goes through further examination. Sometimes legitimate traffic can get attack signatures, and vice versa. Or one attack might be mistaken for another, all due to the unintended modification of one or more coordinates in the signature, so some refinement is required in back-end processing (which is much less time-pressed). The accuracy of the proposed algorithm thus depends on how likely these unwanted changes in the signature are; the analysis of this statistical aspect of our algorithm can be found in [3].

2.1
In this section, we explore possible signatures and their semantics. There are attack signatures and the signatures of legitimate traﬃc, and we start the discussion with the former. Figure 3 exhaustively enumerates all signatures and their conceivable implied attack types. As we described earlier, ’0’ in a signature means that the monitor has not recently seen the value in the given coordinate. Thus, if a packet belongs to an attack stream, ’0’ value in a coordinate most probably means that the coordinate is pivoting. The leftmost column is the number of dimensions that are pivoting. The second column is how the attacks might manifest themselves geometrically when the attack is mapped on to the 3-d hyperspace a la Figure 1. An important note here is that the signatures listed in Table I are self-induced. Namely, the values in a signature are what are
Time              Source IP            Destination IP    Destination port
……
09:35:23.955222   x.x.x.x    64218     22.214.171.124    111
09:35:23.958716   x.x.x.x    64232     126.96.36.199     111
09:35:23.965132   x.x.x.x    64310     188.8.131.52      111
09:35:23.965443   x.x.x.x    64311     184.108.40.206    111
09:35:23.966412   x.x.x.x    64316     220.127.116.11    111
09:35:23.974520   x.x.x.x    64322     18.104.22.168     111
09:35:23.976617   x.x.x.x    64331     22.214.171.124    111
09:35:24.091332   x.x.x.x    64424     126.96.36.199     111
09:35:24.093271   x.x.x.x    64423     188.8.131.52      111
09:35:24.093317   x.x.x.x    64422     184.108.40.206    111
……
09:35:24.104956   x.x.x.x    64438     220.127.116.11    111
09:35:24.105238   x.x.x.x    64437     18.104.22.168     111
09:35:24.106191   x.x.x.x    64433     22.214.171.124    111
09:35:24.107471   x.x.x.x    64429     126.96.36.199     111
09:35:24.125654   x.x.x.x    64466     188.8.131.52      111
09:35:24.126519   x.x.x.x    64464     184.108.40.206    111
……

Fig. 2. Real-life pivoting example: hostscan
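Dim.  Graphical manifestation  Signature      Implied attack type
0     Dot                      < 1, 1, 1 >    Single-source-spoofed DoS
1     Straight line            < 1, 1, 0 >    Portscan
1     Straight line            < 1, 0, 1 >    Hostscan
1     Straight line            < 0, 1, 1 >    Source-spoofed DoS (destination port fixed)
2     Rectangle                < 0, 1, 0 >    Source-spoofed DoS (destination port varied)
2                              < 0, 0, 1 >    Distributed hostscan
2                              < 1, 0, 0 >    Kamikaze
3                              < 0, 0, 0 >    Network-directed DoS

Fig. 3. Attack signatures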
caused by the corresponding attack itself, but not by others. To wit, these are what an attack would obtain in the absence of any cross (legitimate + other type of attack) traﬃc. But as we discussed earlier, cross traﬃc might overlap in one or more coordinates, and these signatures are not always those detected when corresponding attack is under way. For < 0, 0, 0 >, one or more coordinates can be ﬂipped to 1 by cross traﬃc that happens to coincide on IP addresses or
port number. Suppose a flow < 220.127.116.11, 18.104.22.168, 5555 > is initiated after a flow < 22.214.171.124, 126.96.36.199, 3333 > is registered by RADAR. Then the former will receive the < 0, 1, 0 > signature, which RADAR recognizes as the port-varied DoS attack. Since the signatures in Table I are those obtained before the attack traffic is subject to possible overlap, we call them original signatures. In contrast, if an original signature does get modified by overlap, we call the resulting signature a transformed signature. For instance, in the transformation < 0, 0, 0 > → < 1, 1, 1 >, < 0, 0, 0 > is the original signature and < 1, 1, 1 > is the transformed signature. So when RADAR detects an attack signature, it might be a transformed signature, or an original signature kept intact. Most signatures in Table I are fairly straightforward, but a few call for some explanation. First, even if nothing is pivoting (signature < 1, 1, 1 >), the traffic can theoretically still constitute an attack. One may use a single, spoofed source IP address and a fixed destination port number in a DoS attack. But this is impractical from the perspective of the attacker. Once the attack is identified as DoS, simply filtering on the single (spoofed) source address eliminates the attack completely. "Worse" yet, the collateral damage of the filtering is limited to the spoofed host only (it is denied access to the victim). Therefore, we assume in this paper that this type of attack is not employed in reality. Second, we assume the distributed hostscan (signature < 0, 0, 1 >) will be detected as multiple hostscans (signature < 1, 0, 1 >), which is what it is. Third, the network-directed DoS (signature < 0, 0, 0 >) is an attack on the ingress pipe rather than on any particular host in the victim network.
The only rationale might be that the attacker wants to evade detection, because the attack intensity for each individual destination IP address in the pivoting range is proportionally reduced. But then the attacker is assuming a (micro) flow-based detector as its potential opponent, which is naive given the whole gamut of other existing detecting/filtering methods [6, 7]. So in this paper, we also reject this type of attack as dubious. In sum, we reject three among the listed eight as original attack signatures: < 0, 0, 0 >, < 0, 0, 1 >, and < 1, 1, 1 > (shaded in Table I). Finally, distributed DoS (DDoS) does not appear in Table I. We can consider two cases. If DDoS sources spoof the source IP address, they will collectively be detected as a single DoS attack < 0, 1, ∗ >. If spoofing is not used, individual DoS streams look like legitimate flows from our monitor's viewpoint, so they will not be detected as attacks. Usually, however, DDoS mobilizes a large network of agent hosts to maximize the impact; e.g., more than 359,000 machines were turned into agents by Code-Red version 2 [4] in an attempt to bombard the White House web site. The Sapphire worm infected more than 70,000 hosts [5]. Therefore, when the attack commences, RADAR will suddenly begin to see a great many source IP addresses. This will produce a noticeable number of < 0, 1, ∗ > signatures at a fast pace, and draw the attention of RADAR. Provided the intensity exceeds the tolerable threshold, which is low enough to be used on a spoofed DoS attack from a single attacker (see Section V), RADAR will raise an alarm. The remaining five cases are of interest in this paper. First of all, "Kamikaze" is special. A single source spews packets at a high rate towards random destination hosts at random ports.
Apparently, it cannot be an effective attack; rather, it seems suicidal. The origin of this type of "attack" is not clear, but it does appear in our traces. One explanation could be a bug in the DoS attack code that pivots the destination address instead of the source. But a more plausible theory is that it is the backscatter [8] from a DoS victim towards spoofed attack sources. Finally, in Table I we list two DoS types, but the distinction is only for the convenience of analysis; it does not bear any practical significance. The signatures of legitimate traffic can be analyzed similarly, but we omit the discussion due to space constraints. Interested readers can find it in [3].

2.2
Fig. 4 shows the construction of the main filter in the attack monitor. This is what we have called the "front-end" thus far. It is composed of 3 hash tables, and collectively these hash tables generate the signature for each incoming packet. The network/transport packet header is mirrored to the filter, where a single, separate lookup is made against the source IP address, destination IP address, and destination port number tables, respectively. When a value (address or port) is 'not found', i.e., recently unobserved, it is registered in the corresponding hash table as a new sighting. Any hash function can be used as long as it has a good distributional property and can be calculated quickly. Of these two properties, however, speed weighs more for the front-end. For instance, MD5 and SHA-1 may have good distributional properties, but they require too complicated a computation, so they would not fit our environment. Our experience shows that using the least significant 24 bits of the IP address suffices for casual operation. Against the backbone trace we have, it resulted in 1.0072 comparisons on average (most are 0 or 1, where 0 means an empty hash bucket), with only a few lookups reaching up to 8 comparisons. For the port hash table, the hash function is the identity function, i.e., we use the port number itself as the index. This is because there are only 64K port number values. Since hash lookups are used, the complexity of the main filter can be engineered at O(1). Associated with each entry is the last accessed time t_l. We maintain a moving time window L beyond which registered IP addresses or port numbers age out. Namely, if t_now − L > t_l, we remove the entry from the corresponding hash table. We call the time window the lifetime, and we define two lifetimes as follows:
– L_H (= L_s = L_d): [source/destination] host lifetime
– L_p: destination port lifetime
The reason that we perform a separate lookup for each coordinate is clear.
If we maintained each ﬂow entry indexed by < s, d, p > collectively, we would not know which coordinate is responsible for a failed ﬂow lookup. It means that we would not know immediately which coordinate is being pivoted, i.e., what type of attack is being mounted. Then some additional processing would be necessary on these new ﬂows in order to achieve classiﬁcation. Therefore, for real-time classiﬁcation, separate hash lookups are essential. Earlier we mentioned the possibility
[Figure: packet → main filter (source hash table, dest hash table, port hash table)]

Fig. 4. Signature generation by the main filter
of signature transformation. In particular, when the signature of the first packet belonging to a legitimate flow gets transformed, the packet may be identified as an attack. For < 1, 1, 1 >, on the other hand, the cause of misinterpretation is inadequately set lifetime(s). If a lifetime is set too low, RADAR forgets too fast (i.e., before the flow ends), and returns 0 when it should return 1. Likewise, attack packets can get non-attack or incorrect attack signatures depending on the number and location of the flipped bit(s). So there is always a possibility that any coordinate suffers such an unwanted bit flip. In [3], we analyze the false positive and false negative probabilities of the proposed algorithm caused by bit flips.
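The main filter described in this section can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' code: the names `MainFilter`, `AgingTable`, and `ip_bucket` are hypothetical, IPv4 addresses are parsed with the standard `ipaddress` module, bucket chains are modeled with dictionaries, and stale entries are treated as absent and overwritten on access instead of being removed by a separate sweep. The 10 s lifetimes are placeholder values.

```python
import ipaddress


def ip_bucket(addr):
    """Bucket index: the least significant 24 bits of the IPv4 address."""
    return int(ipaddress.IPv4Address(addr)) & 0xFFFFFF


class AgingTable:
    """Hash table whose entries age out once t_now - L > t_l."""

    def __init__(self, lifetime, bucket_of):
        self.lifetime = lifetime
        self.bucket_of = bucket_of
        self.buckets = {}  # bucket index -> {value: last accessed time t_l}

    def seen_recently(self, value, now):
        """One lookup: return 1 on a live hit, else 0; then record the access."""
        chain = self.buckets.setdefault(self.bucket_of(value), {})
        last = chain.get(value)
        hit = last is not None and not (now - self.lifetime > last)
        chain[value] = now
        return 1 if hit else 0


class MainFilter:
    """Three separate tables, so a miss names the coordinate that is new."""

    def __init__(self, host_lifetime=10.0, port_lifetime=10.0):
        self.src = AgingTable(host_lifetime, ip_bucket)
        self.dst = AgingTable(host_lifetime, ip_bucket)
        self.port = AgingTable(port_lifetime, lambda p: p)  # identity hash

    def signature(self, now, src, dst, port):
        return (self.src.seen_recently(src, now),
                self.dst.seen_recently(dst, now),
                self.port.seen_recently(port, now))


# Hostscan: fixed source and port, pivoting destination address.
f = MainFilter()
scan = [f.signature(0.01 * i, "192.0.2.7", f"198.51.100.{i}", 111)
        for i in range(5)]
print(scan)  # (0, 0, 0) for the first packet, then (1, 0, 1) repeatedly

# Bit flip by overlap: a new flow to an already registered destination.
print(f.signature(1.0, "203.0.113.5", "198.51.100.3", 5555))  # (0, 1, 0)

# Lifetime set too low: the second packet of the same flow looks new again.
g = MainFilter(host_lifetime=1.0, port_lifetime=1.0)
g.signature(0.0, "192.0.2.1", "198.51.100.1", 80)
print(g.signature(5.0, "192.0.2.1", "198.51.100.1", 80))  # (0, 0, 0)
```

Keeping one table per coordinate is what lets the second example report that only the destination was known, which a single table keyed on the whole 3-tuple could not do.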
We implemented a prototype of the RADAR system. Figure 5 shows the result of applying RADAR to the 8-hour trace (Dec. 14th, 2001) of about 612 million packets. It processed the trace in just 2.5 hours on a 966MHz Pentium-3 PC. The figure clearly shows that RADAR successfully extracts the attacks. Interested readers can find and compare animations of attacks and their processed results in [3]. We also plugged RADAR into a campus network gateway. The incoming packets were optically tapped from the gateway router on two Gigabit Ethernet interfaces. A Pentium-4 2.4GHz machine with 512MB Rambus memory, an Intel PRO/1000MF dual-port LAN card, and a PCI 2.2 (32-bit) bus simultaneously ran a separate instance of the RADAR algorithm on each Ethernet port. The total traffic rate was roughly 330Mbps (65Kpps) at the time of the experiments. The most important result is that there was no packet loss at the kernel due to RADAR processing. This is remarkable considering that we simultaneously ran 2 instances of the algorithm. The memory requirement of the hash tables in the main filter
Fig. 5. Graphical output from the post-ﬁlter, a real RADAR-processed result of Figure 1
and the post-filter [3] is moderate. Assuming we use a 24-bit hash for the source and destination IP tables, we need at least 2^25 hash buckets whose heads are pointers (usually 4 octets each). This alone is 128MB. Over and above, we need to store each flow in these tables, where a flow has at least 2 IP addresses, 1 port number, and a timestamp. Also, each entry needs a pointer to the next entry. So each flow entry requires at least 17B. Assuming there are 1 million flows being tracked simultaneously, 34MB is used. Then 1 million flows in the main filter IP table translates to approximately 10Gbps (OC-192) based on our flow arrival rate constant, since we have by default L_H = 10s. Over and above, we have the port table in the main filter. However, it has only 64K entries, so it adds little to the memory requirement. In the post-filter, we do not have large tables, since there should be only a handful of concurrent attacks. We do not expect to see, say, 64,000 attacks all simultaneously under way, even on a backbone link. Therefore, we use a 16-bit hash for all tables. Again, the memory requirement will be insignificant, most likely less than 2MB. In sum, more than half of the memory of RADAR is used to construct the IP tables in the main filter. If memory is a critical resource, we could use a 23-bit hash, halving the requirement, then a 22-bit hash, and so forth.
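The figures quoted in this section can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not part of the original evaluation: the helper names are hypothetical, and the factor of 2 in the entry estimate (one entry in each of the source and destination tables) is our reading of how 17B per entry leads to 34MB. Note that the 128MB figure is in binary units (2^20 bytes), while 34MB is decimal.

```python
MIB = 2 ** 20  # bytes per binary megabyte


def bucket_head_bytes(hash_bits, tables=2, pointer_bytes=4):
    """Bytes for the bucket-head pointers of the IP hash tables."""
    return tables * (2 ** hash_bits) * pointer_bytes


def flow_entry_bytes(flows, entry_bytes=17, tables=2):
    """Bytes for chained entries; tables=2 is an assumption (see lead-in)."""
    return flows * entry_bytes * tables


heads = bucket_head_bytes(24)
entries = flow_entry_bytes(1_000_000)

print(heads // MIB)                  # 128  (2^25 heads x 4 octets)
print(entries // 10 ** 6)            # 34
print(bucket_head_bytes(23) // MIB)  # 64   (a 23-bit hash halves the heads)

# Throughput figures from the same section.
print(round(612_000_000 / (2.5 * 3600)))  # 68000 packets/s in the offline run
print(round(330e6 / 8 / 65e3))            # 635 bytes per packet on the live link
```

The bucket heads and flow entries together come to roughly 162MB, which is consistent with the stated bound of at most 200MB.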
This paper proposes a novel approach that determines for each arriving packet if it constitutes an attack, and if so, what type of attack it is, on a high-speed link, in real time. The approach is based on a simple observation that only network attacks such as DoS and scans manifest themselves as a regular geometric
entity in a 3-dimensional hyperspace whose dimensions are source IP address, destination IP address, and destination port number. Instead of employing complex pattern recognition algorithms to detect such regular patterns, we propose a novel algorithm, RADAR, that captures the "pivoting" behavior, which directly translates to the formation of the abovementioned regular geometry in the 3-d hyperspace. The RADAR algorithm requires only a few memory lookups per packet, yet the classification error is minimal. The algorithm sifts out only the suspicious packets that match the pivoting behavior, and so buys enough time for more sophisticated back-end processing that removes the false positives from the suspicious packets. We analyze the performance of the RADAR algorithm in terms of speed, sensitivity, relative error, and false positive rate. The simulation and real implementation experiments demonstrate that the algorithm indeed performs up to our expectations on high-speed links, and that it could be a useful building block for an early warning and reaction framework against fast global attacks of the future.
References

[1] R. B. Blazek et al., "A novel approach to detection of denial-of-service attacks via adaptive sequential and batch-sequential change-point detection methods," IEEE Systems, Man, and Cybernetics Information Assurance Workshop, June 2001.
[2] C. C. Zhou, "Using Hidden Markov Model in Anomaly Intrusion Detection," http://tennis.ecs.umass.edu/~czou/research/HMM/index.htm.
[3] H. Kim, "Fast Classification, Calibration, and Visualization of DoS and Scan Attacks for Backbone Links," Technical Report, June 2003, http://net.korea.ac.kr/papers/RADAR.html.
[4] CAIDA, "CAIDA analysis of Code Red," http://www.caida.org/analysis/security/code-red/coderedv2_analysis.xml, July 2001.
[5] CAIDA, "Analysis of the Sapphire Worm," http://www.caida.org/analysis/security/sapphire/, Jan. 30, 2003.
[6] M. Poletto, "Practical Approaches to Dealing with DDoS Attacks," NANOG presentation, May 2001. http://www.nanog.org/mtg-0105/poletto.html.
[7] Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker, "Controlling High Bandwidth Aggregates in the Network," ACM CCR, V.32 N.3, July 2002.
[8] David Moore, Geoffrey Voelker, and Stefan Savage, "Inferring Internet Denial-of-Service Activity," in Proceedings of the 2001 USENIX Security Symposium.
[9] K. Houle and J. Weaver, "Trends in Denial of Service Attack Technology," CERT Coordination Center, Oct. 2001.