

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 15, NO. 1, FEBRUARY 2007

An Efficient Packet Scheduling Algorithm With Deadline Guarantees for Input-Queued Switches

Yong Lee, Jianyu Lou, Junzhou Luo, and Xiaojun Shen, Senior Member, IEEE

Abstract: Input-queued (IQ) switches overcome the scalability problem suffered by output-queued switches. In order to provide differential quality of service (QoS), we need to efficiently schedule a set of incoming packets so that every packet can be transferred to its destined output port before its deadline. If no such schedule exists, we wish to find one that allows a maximum number of packets to meet their deadlines. Recently, this problem has been proved to be NP-complete if three or more distinct deadlines (classes) are present in the set. In this paper, we propose a novel algorithm named Flow-based Iterative Packet Scheduling (FIPS) for this scheduling problem. A key component in FIPS is a non-trivial algorithm that solves the problem for the case where two classes are present in the packet set. By repeatedly applying the algorithm for two classes, we solve the general case of an arbitrary number of classes more efficiently. Applying FIPS to a frame-based model effectively achieves differential QoS provision in IQ switches. Using simulations, we have compared FIPS performance with five well-known existing heuristic algorithms including Earliest-Deadline-First (EDF), Minimum-Laxity-First (MLF) and their variants. The simulation results demonstrate that our new algorithm solves the deadline guaranteed packet scheduling problem with a much higher success rate and a much lower packet drop ratio than all other algorithms.

Index Terms: Input-queued switch, network flow, packet scheduling, quality of service, real time scheduling.

I. INTRODUCTION

Coupled with its extraordinary capacity growth, the Internet is expected to provide a wide variety of application services with more advanced performance requirements known as QoS (quality of service) that are commonly characterized in terms of throughput, packet loss probability, delay, and delay jitter guarantees. Implementing these guarantees requires an efficient packet scheduling algorithm in the network’s routers/switches. The traditional output queueing (OQ) scheme has the best performance in QoS provision, in which every incoming packet can be immediately transmitted to its destined output port without contention. However, it has poor scalability because

Manuscript received February 9, 2004; revised April 26, 2005, and November 24, 2005; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor F. Neri. The work of Y. Lee and J. Luo was supported in part by the National Natural Science Foundation of China under Grants 90412014 and 90604004. The work of J. Lou was supported in part by the UMRB Grant K9949. The work of X. Shen was supported in part by the UMRB Grant K9949, a contribution made by FutureNet Technology Inc., and K.C. Wong Education Foundation, Hong Kong. Y. Lee and J. Luo are with the School of Computer Science and Engineering, Southeast University, Nanjing 210096, China (e-mail: [email protected]; [email protected]). J. Lou and X. Shen are with the School of Computing and Engineering, University of Missouri-Kansas City, Kansas City, MO 64110 USA (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TNET.2006.890097

the switching fabric of an $N \times N$ switch and the associated output buffer must operate $N$ times as fast as the input line speed to avoid output contention. The input queueing (IQ) scheme overcomes the scalability problem and has drawn great interest from researchers [2]–[8]. In an input-queued switch, incoming packets are first queued on the input side, and then scheduled such that they will be switched without output contention. The switching fabric and queue buffers in IQ can thus run with no speedup (a speedup of 1), where the speedup is defined as the ratio of the switching fabric speed to the input line speed. However, simple IQ schemes may suffer lower throughput and higher packet delay. It was shown [1] that the throughput of an IQ switch with a single FIFO queue per input port may be limited to just 58.6% due to head-of-line (HOL) blocking, where a packet destined for a free output port is blocked by another packet ahead of it in the queue. The HOL blocking phenomenon can be entirely eliminated by using the virtual output queueing (VOQ) technique ([2], [3] and references therein). With VOQ, a number of scheduling algorithms that are mainly based on various forms of maximum weighted matching (MWM) or maximal matching [2]–[6] achieve 100% throughput and some degree of QoS guarantees. However, such performance can only be provided in a statistical rather than a deterministic way. These scheduling algorithms compute a matching configuration slot by slot in which an input is connected to at most one output and vice versa. As the slot time becomes shorter due to continuously increasing line speed, a scalability problem arises. Another category of IQ scheduling algorithms aims to provide rate guarantees to various traffic streams at connection level [7]–[9]. These algorithms work satisfactorily for certain restricted traffic models (e.g., CBR), and require regulating the incoming traffic to certain specific flow patterns. Generally no individual packet QoS guarantee is provided.
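To make the slot-by-slot matching idea concrete, the following Python sketch computes one maximal matching over VOQ occupancies for a single time slot. It is a simple greedy pass written for illustration only and is not any of the algorithms of [2]–[6]; the function name `greedy_maximal_matching` and the occupancy matrix `voq` (where `voq[i][j]` is the number of packets queued at input `i` for output `j`) are assumptions of this example.

```python
def greedy_maximal_matching(voq):
    """Return one slot's configuration as a dict {input: output}.

    voq[i][j] is the (hypothetical) number of packets waiting at input i
    for output j. Each input and each output is used at most once, and no
    further input/output pair with waiting packets can be added, so the
    result is a maximal (not necessarily maximum) matching.
    """
    n = len(voq)
    output_taken = [False] * n
    matching = {}
    for i in range(n):
        for j in range(n):
            if voq[i][j] > 0 and not output_taken[j]:
                matching[i] = j          # connect input i to output j
                output_taken[j] = True
                break                    # an input sends at most one packet
    return matching
```

For the occupancy `[[1, 1], [1, 0]]` the greedy pass connects input 0 to output 0 and leaves input 1 idle, whereas a maximum matching would serve both inputs. This gap between maximal and maximum matchings is one reason such schedulers offer statistical rather than deterministic guarantees.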
To control the matching complexity while provisioning certain QoS guarantees, a frame-based matching scheme has been proposed where the time axis is partitioned into a sequence of frames, each of which consists of a number of consecutive time slots. In frame-based scheduling, packets arriving during the current scheduling cycle (frame) are first held in a buffer waiting to be scheduled. When the current cycle finishes, the scheduling algorithm fetches the packets waiting in the buffer and schedules them into the next frame. One major advantage of frame-based scheduling is that the frame length can be flexibly tailored according to the complexity of the algorithm. Moreover, end-to-end delay and delay jitter bounds are achievable by frame-based scheduling schemes through properly regulating the traffic [9], [14]. Recently it has been demonstrated that, in a network of interconnected IQ switches, slot-by-slot based MWM scheduling algorithms may fail to guarantee

1063-6692/$25.00 © 2007 IEEE


100% throughput, but some frame-based algorithms achieve 100% throughput [24]–[26].

In searching for better QoS performance, researchers also proposed combined input/output queueing (CIOQ) schemes [10]–[12]. It is shown that a CIOQ switch with a speedup of two is capable of exactly emulating an OQ switch with any monotonic, work-conserving service discipline [10]. The main barrier in using CIOQ algorithms is their high complexity. In addition, an internal speedup of two is required of the switching fabric and buffer memory. Therefore, researchers continue looking for efficient packet scheduling algorithms for the IQ schemes.

In order to provide a deterministic QoS guaranteed performance for IQ switches, we attempt to tackle the following fundamental scheduling problem: Given a set of incoming packets each of which has a deadline, can we find a feasible contention-free schedule such that every packet can be transferred to its destined output port before its deadline? If no such feasible schedule exists, how can we find a schedule that allows a maximum number of packets to meet their deadlines with a minimum number of packets dropped? This problem is important because a deadline guarantee implies throughput and rate guarantees. It has also been a hot and difficult problem studied in the field of Multi-periodic Satellite Switched/TDMA (SS/TDMA) scheduling and Real-Time scheduling [20]–[22].

A number of algorithms have been proposed for this problem [15], [17], [18], [20]–[22]. One approach is to apply the well-known Birkhoff–Von Neumann algorithm to decompose the traffic rate matrix into a linear combination of permutation matrices and then employ a Packetized Generalized Processor Sharing (PGPS) scheme to produce the schedules [15]. This approach can achieve a rate guarantee for flows and provide a statistical delay bound for packets; however, it is only applicable when all packets of the traffic matrix have a single common deadline and the traffic is non-overloaded.
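As a rough illustration of the decomposition idea, the Python sketch below handles only the integer, single-deadline case and is not the rate-matrix and PGPS scheme of [15]. It assumes a demand matrix in which every row and every column sums to the same number of slots `d` (a lighter matrix would first have to be padded), and it peels off one permutation per slot using an augmenting-path bipartite matching. The names `perfect_matching` and `decompose` are hypothetical.

```python
def perfect_matching(support):
    """Augmenting-path bipartite matching.

    support[i] lists the outputs that input i may use. Returns perm with
    perm[i] = output matched to input i, or None if no perfect matching.
    """
    n = len(support)
    input_of_output = [-1] * n

    def try_assign(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take output j if it is free or its current input can move.
            if input_of_output[j] == -1 or try_assign(input_of_output[j], seen):
                input_of_output[j] = i
                return True
        return False

    for i in range(n):
        if not try_assign(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(input_of_output):
        perm[i] = j
    return perm


def decompose(demand):
    """Split an integer matrix with all line sums equal to d into d permutations.

    demand[i][j] is the number of packets from input i to output j.
    Each returned permutation is the switch configuration of one slot.
    """
    n = len(demand)
    d = sum(demand[0])
    assert all(sum(row) == d for row in demand), "row sums must all equal d"
    assert all(sum(demand[i][j] for i in range(n)) == d for j in range(n)), \
        "column sums must all equal d"
    remaining = [row[:] for row in demand]
    slots = []
    for _ in range(d):
        support = [[j for j in range(n) if remaining[i][j] > 0] for i in range(n)]
        # A nonnegative integer matrix with equal line sums always has a
        # perfect matching on its support (Hall's theorem).
        perm = perfect_matching(support)
        for i, j in enumerate(perm):
            remaining[i][j] -= 1
        slots.append(perm)
    return slots
```

Calling `decompose([[2, 1, 0], [0, 2, 1], [1, 0, 2]])` yields three permutations, one per slot, whose combined transfers reproduce the demand matrix exactly.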
When more than one deadline is present in the packet set, a different approach must be used. Another recent result is to use packetized fluid tracking policies that require every packet to depart no later than the time determined by the fluid policy [21]. Searching for a tracking policy can thus be converted to a problem of scheduling packets with deadlines. Unfortunately, an optimal tracking policy only exists for 2 × 2 switches. For the multi-class case, where packets may have two, three, or more different deadlines, the most commonly used approaches are the EDF (Earliest Deadline First) algorithm, the MLF (Minimum Laxity First) algorithm, or their variants. Ref. [20] provides a good survey of these algorithms and limited simulation results. Recently, this fundamental scheduling problem has been proved to be NP-complete if three or more classes (distinct deadlines) are present in the packet set [17].

In this paper, we propose a new approach to dealing with this fundamental problem. An optimal algorithm for a single class is first presented. Then we solve the schedulability problem for the case of two classes. Distinct from the Birkhoff–Von Neumann decomposition method that can only handle a single class of incoming traffic, our algorithm is the first to optimally schedule packets of two classes if the incoming traffic is non-overloaded. For the general case where an arbitrary number of service classes appear, we develop an efficient algorithm named Flow-based Iterative Packet Scheduling (FIPS), which solves the multi-class case by repeatedly applying the two-class


algorithm. As demonstrated by simulation results, the FIPS algorithm solves the deadline guaranteed packet scheduling problem with a much higher success rate and a lower packet drop ratio than the traditional EDF- and MLF-based algorithms.

In IQ switches, FIPS works nicely for frame-based packet scheduling. As shown in Sections II–VI, such a scheme not only effectively handles the continuous arrival of packets, but also achieves the advantages of frame-based scheduling and provides differentiated QoS requirements. Our algorithm also applies to a wide range of other applications, including broadcast-and-select optical networks, hard real-time systems such as multiprogramming systems, and satellite-switched time division multiple access (SS/TDMA) systems. The deadlines for packets in these systems are either pre-specified (e.g., hard real-time, multi-periodic message scheduling), or derived (e.g., translated from QoS requirements such as end-to-end delay bound), or implicitly given (e.g., tracking fluid policies in IQ switches). For instance, VTRS (Virtual Time Reference System) [27] elaborates how to compute the virtual time stamps (expected arrival and departing times) from the end-to-end delay requirement and insert them into packets.

The rest of this paper is organized as follows. Section II introduces necessary notations and background to the deadline guaranteed packet-scheduling problem. Section III presents an optimal solution to the single class case. Section IV derives a necessary and sufficient condition for the schedulability problem for the case where two classes are present. In Section V, an efficient algorithm for multiple service classes is presented and used for frame-based scheduling. Simulation results and comparisons are given in Section VI. Section VII concludes this paper.

II. PRELIMINARIES

Consider an input-queued switch. As most research papers on IQ switches do, we assume all packets have a fixed size.
(Variable lengths can be dealt with by the segmentation and reassembly technique.) The time is divided into slots such that the length of each slot is equal to the transmission time of a packet. Without loss of generality, we assume $T$ consecutive slots constitute a frame, where $T$ is a constant positive integer. The switching fabric, of size $N \times N$, is non-blocking with no speedup. In each time slot, the switch controller sets up a switching configuration that establishes up to $N$ one-to-one connections from input ports to output ports. Each input port or output port can transfer or receive at most one packet in one time slot. Slots in each frame are numbered from 0. Slot $k$ covers time interval $[k, k+1)$. Every packet is assumed to arrive or depart at the beginning of a time slot. Two timestamps, $a$ and $d$, are associated with each packet if it is supposed to depart no earlier than $a$ and no later than $d$. Without loss of generality, we assume all packets have the same $a$, and $a = 0$. A triple $(i, j, d)$ is associated with each packet, which means that it is to be transmitted from input port $i$ to output port $j$ before or at time slot $d$. Note that multiple packets can have the same triple $(i, j, d)$. Let $S$ be the set of packets to be scheduled, and $D$ be the set of all possible distinct deadlines (classes) that occur in $S$, i.e., $D = \{d \mid \text{there is a packet in } S \text{ with deadline } d\}$. For each $i$, $j$, and $d$, $1 \le i, j \le N$, $d \in D$, let $S_{ij}^{d}$ be the set of all packets in $S$ that have triple $(i, j, d)$. For clarity, we also define the following notations. […] Moreover, we define […]. Obviously, we have […]. If a packet of $S_{ij}^{d}$ is scheduled in a slot $k$ that lets it depart no later than its deadline $d$, we say that it is deadline guaranteed.

In IQ switches, the incoming packets arrive continuously. We use a frame scheme rather than slot-by-slot matching to accommodate such arrival continuity as well as to achieve good scalability. With frame-based scheduling, at the beginning of every frame, all buffered packets constitute a static set, $S$, with the same eligibility time $a = 0$ but with different individual deadlines. Traditional frame-based scheduling treats all packets with the same delay bound. However, to provide good QoS, some deadlines need to be less than the frame length $T$ and others can be greater than $T$. Our goal is to find a schedule that allows a maximum number of packets to meet their deadlines. Packets that miss deadlines would be dropped. We assume that packets with an earlier deadline have a higher priority. Therefore, we maximally schedule the packets with the earliest deadlines first, then maximally schedule the packets with the next earliest deadline, and so on. The packets whose deadlines are greater than $T$ in the current frame have the lowest priority and would be scheduled only if the current frame still has spare space. If they cannot be scheduled, they will join the next frame with a deadline reduced by $T$ from the current one.

Since at most one packet can be transferred between an input port and an output port in any time slot, for any slot $k$, the following conditions must be satisfied, where $x_{ij}(k)$ denotes the number of packets of $S$ transferred from input port $i$ to output port $j$ in slot $k$:

$$x_{ij}(k) = 0 \ \text{or} \ 1, \quad 1 \le i, j \le N \tag{1}$$

$$\sum_{j=1}^{N} x_{ij}(k) \le 1, \quad 1 \le i \le N \tag{2}$$

$$\sum_{i=1}^{N} x_{ij}(k) \le 1, \quad 1 \le j \le N \tag{3}$$

Definition 1: Let $S$ be a set of packets to be scheduled, where each packet is associated with a triple $(i, j, d)$. The set $S$ is called schedulable if a schedule exists such that conditions (1), (2), and (3) are satisfied and every packet is deadline guaranteed.

Definition 2: The schedulability problem is to determine whether a given packet set $S$ is schedulable.

A necessary condition for the set $S$ to be schedulable is known as the non-overloaded condition that appeared in [3], [15], [17]. We state it in Theorem 1.

Theorem 1: Let $S$ be a packet set to be scheduled. Let $D$ be the set of distinct deadlines that occur in $S$. The set $S$ is schedulable only if the following conditions are satisfied:

$$\sum_{d' \in D,\ d' \le d} \ \sum_{j=1}^{N} |S_{ij}^{d'}| \le d \quad \text{and} \quad \sum_{d' \in D,\ d' \le d} \ \sum_{i=1}^{N} |S_{ij}^{d'}| \le d \tag{4}$$

for any $d \in D$ and any $i$, $j$, $1 \le i, j \le N$.

Theorem 1 can be easily proved because the total number of packets arriving at input port $i$ (or destined to output port $j$) with deadlines less than or equal to $d$ cannot be larger than the number of time slots available, which is $d$. If the set $S$ is not schedulable, we wish to find a schedule that guarantees a maximum number of packets to meet their deadlines.

Definition 3: Given a packet set $S$, an optimal schedule is one that allows a maximum number of packets to meet their deadlines.

Because the schedulability problem is NP-complete when $|D| \ge 3$, it is difficult to produce an optimal schedule even though $S$ is schedulable. Therefore, we wish to design an approximation algorithm that has a high probability of producing a feasible schedule when $S$ is schedulable, or drops as few packets as possible. In our simulations, we will generate schedulable sets with various switch sizes and test the performance of our algorithm vs. other existing algorithms. We will examine their success rates and packet dropping ratios. The success rate is defined as the number of times an algorithm succeeds in producing a feasible schedule divided by the total number of schedulable sets tested. Given a schedulable set, the simulation counts a failure even if only one packet is dropped. The packet dropping ratio is defined as the average number of packets dropped over the total number of packets in a given set. Due to NP-completeness of the schedulability problem, researchers have had difficulties in constructing large schedulable packet sets with large switch sizes [20]. Therefore, they did simulations only on small switches such as 4 × 4 switches. In this paper, we introduce an interesting method that can easily construct a schedulable set with a larger switch size such as 16 × 16, 32 × 32, or even larger. The details are given in Section VI.

III. SCHEDULING PACKETS WITH A SINGLE CLASS

In this section, we consider the simplest case where every packet in set $S$ has the same deadline $d$. For this case, the necessary condition (4) stated in Theorem 1 becomes a sufficient condition also [15]. We state it in Theorem 2.

Theorem 2: A packet set $S$ with a single deadline $d$ is schedulable if and only if

$$\sum_{j=1}^{N} |S_{ij}^{d}| \le d, \quad 1 \le i \le N \tag{5}$$

$$\sum_{i=1}^{N} |S_{ij}^{d}| \le d, \quad 1 \le j \le N \tag{6}$$

When conditions (5) and (6) are not satisfied, an interesting problem is how we can drop the minimum number of packets such that the remaining set satisfies (5) and (6). Note that the Birkhoff–Von Neumann algorithm [15] cannot solve the optimization problem even for a single class when (5) and (6) are not satisfied. To the best of our knowledge, there is no known algorithm available to solve this optimization problem. Here we propose the following algorithm to solve this optimization problem, which will be used as an important subroutine by the multi-class algorithm later.

Fig. 1. Flow network for the example.

Fig. 2. Maximum flow in the network of Fig. 1.

First, we define the excess degree of input $i$ to be $\alpha_i = \sum_{j=1}^{N} |S_{ij}^{d}| - d$. Similarly, we define the excess degree of output $j$ to be $\beta_j = \sum_{i=1}^{N} |S_{ij}^{d}| - d$. Obviously, some $\alpha_i$ or $\beta_j$ must be positive, or otherwise $S$ is schedulable. We need to drop some packets to reduce positive $\alpha_i$ and $\beta_j$ down to zero. The total value we need to reduce is $\sum_{\alpha_i > 0} \alpha_i + \sum_{\beta_j > 0} \beta_j$, which is called the total excess degree. Suppose we drop a packet of $S_{ij}^{d}$. If both $\alpha_i$ and $\beta_j$ are positive, then dropping this packet will reduce the total excess degree by two. If $\alpha_i$ is positive but $\beta_j$ is not (or vice versa), then dropping this packet will reduce the total excess degree by one. If both $\alpha_i$ and $\beta_j$ are not positive, then dropping this packet will not help the schedulability. We now present the algorithm along with explanations.

Procedure One_Class_Dropping


:

Step 1) Construct a capacitated directed bipartite graph $G(U, V, E)$, where $U$ contains a node $u_i$ for each input $i$ whose excess degree $\alpha_i$ is positive, and $V$ contains a node $v_j$ for each output $j$ whose excess degree $\beta_j$ is positive. Edge $(u_i, v_j) \in E$ if there is a packet of $S_{ij}^{d}$. The capacity on edge $(u_i, v_j)$ is $|S_{ij}^{d}|$. An example is given below with […]. The set $S$ is represented by a matrix, where entry $(i, j)$ is the value of $|S_{ij}^{d}|$. The excess degree for each $i$ and $j$ is also shown. […] The graph for this example is shown in the middle part of Fig. 1.

Step 2) Construct a flow network by adding a source vertex $s$ and a sink vertex $t$ to the graph constructed in Step 1, as shown in Fig. 1. The capacity on edge $(s, u_i)$ is the value $\alpha_i$, and the capacity on edge $(v_j, t)$ is the value $\beta_j$.

Step 3) Compute a maximum (integral) flow for the network constructed in Step 2 [16], [19]. This flow corresponds to a set of packets we need to drop. Specifically, if an edge $(u_i, v_j)$ has a flow $f(u_i, v_j)$ (an integer), then we drop $f(u_i, v_j)$ packets of $S_{ij}^{d}$ from the set $S$. A maximum flow of Fig. 1 is shown in Fig. 2, which means we need to drop packets of […]. Dropping any packet in this set reduces the total excess degree by 2.

Step 4) Update the excess degrees $\alpha_i$ and $\beta_j$ for the remaining set. If every $\alpha_i$ and $\beta_j$ is less than or equal to zero, we are done and exit. The remaining set for the above example and its excess degrees are shown by the following matrix […]. For this example, more steps are needed because some excess degrees are still positive.

Step 5) If $\alpha_i$ is positive, drop any $\alpha_i$ packets from input port $i$. Similarly, if $\beta_j$ is positive, then drop any $\beta_j$ packets whose destination port is $j$. Because no packet of $S_{ij}^{d}$ could exist in the remaining set with both $\alpha_i$ and $\beta_j$ positive, dropping a packet in this step reduces the total excess degree by one. In the above example, we can drop a packet of […] and a packet of […]. Therefore, the total number of packets dropped for the example is six. This is the best we can achieve.

Step 6) Let $S'$ be the set of all dropped packets. Compute $S = S - S'$. Output $S$, which is schedulable now.

The correctness of the above dropping procedure can be argued as follows. Let $a$ be the number of packets dropped in Step 3 and $b$ be the number of packets dropped in Step 5. Then



we have $2a + b$ equal to the total excess degree. The total number of packets dropped is $a + b$, which will be minimized if $a$ is maximized. The maximum flow algorithm guarantees that $a$ is maximized. Thus, the dropping algorithm is optimal.

After applying the One_Class_Dropping procedure to the set $S$, the remaining set will satisfy the conditions (5) and (6). We can then use any existing one-class algorithm [13]–[15], such as the Birkhoff–Von Neumann algorithm, to schedule the remaining set with all packet deadlines guaranteed. Thus, we have the following scheduling algorithm for the one-class case that solves the optimization problem.

Algorithm One_Class_Scheduling:

Step 1) Call Procedure One_Class_Dropping.

Step 2) Call the Birkhoff–Von Neumann algorithm.

The time complexity of the One_Class_Dropping algorithm is dominated by Step 3. If we use the push-relabel method [23] to compute the maximum flow, the complexity of the One_Class_Dropping procedure is […]. Since the time complexity of the Birkhoff–Von Neumann algorithm is […] [15], the complexity of our One_Class_Scheduling algorithm is […].

IV. SCHEDULING PACKETS WITH TWO CLASSES

For set $S$, we define $D = \{d_1, d_2\}$, $d_1 < d_2$. Let $A$ be the packets of $S$ with deadline $d_1$, $B$ be the packets of $S$ with deadline $d_2$, and $S = A \cup B$. We write $a_{ij}$ and $b_{ij}$ for the numbers of packets of $A$ and $B$, respectively, that go from input $i$ to output $j$. We first look at the schedulability problem for $S$. Obviously, if all packets of $A$ can be scheduled in the first $d_1$ slots and all the packets of $B$ can be scheduled in the following $d_2 - d_1$ slots, then the set $S$ is schedulable. From Theorem 2, this situation is true if and only if the following conditions are true.

For set $A$:

$$\sum_{j=1}^{N} a_{ij} \le d_1, \quad 1 \le i \le N \tag{7}$$

$$\sum_{i=1}^{N} a_{ij} \le d_1, \quad 1 \le j \le N \tag{8}$$

For set $B$:

$$\sum_{j=1}^{N} b_{ij} \le d_2 - d_1, \quad 1 \le i \le N \tag{9}$$

$$\sum_{i=1}^{N} b_{ij} \le d_2 - d_1, \quad 1 \le j \le N \tag{10}$$

If conditions (7) and (8) are not satisfied, then the set $S$ is obviously not schedulable. However, when the conditions (7) and (8) are true but conditions (9) and (10) are not true, we cannot determine the schedulability right away. From Theorem 1, if $S$ is schedulable, it must satisfy that

$$\sum_{j=1}^{N} (a_{ij} + b_{ij}) \le d_2, \quad 1 \le i \le N \tag{11}$$

$$\sum_{i=1}^{N} (a_{ij} + b_{ij}) \le d_2, \quad 1 \le j \le N \tag{12}$$

However, conditions (11) and (12) are not sufficient. We observe that $S$ is schedulable if and only if we can “promote” a subset $P$ of class $d_2$ packets to class $d_1$ such that the set $A \cup P$ can be scheduled in the first $d_1$ slots and the set $B - P$ can be scheduled in the following $d_2 - d_1$ slots. Let us investigate what properties set $P$ must have.

For each $i$ and $j$, let $p_{ij}$ be the number of packets of $B$ from input $i$ to output $j$ that are in $P$. […] It is easy to see that $A \cup P$ is schedulable in the first $d_1$ slots if and only if

$$\sum_{j=1}^{N} (a_{ij} + p_{ij}) \le d_1, \quad 1 \le i \le N \tag{13}$$

$$\sum_{i=1}^{N} (a_{ij} + p_{ij}) \le d_1, \quad 1 \le j \le N \tag{14}$$

Similarly, $B - P$ is schedulable in the following $d_2 - d_1$ slots if and only if

$$\sum_{j=1}^{N} (b_{ij} - p_{ij}) \le d_2 - d_1, \quad 1 \le i \le N \tag{15}$$

$$\sum_{i=1}^{N} (b_{ij} - p_{ij}) \le d_2 - d_1, \quad 1 \le j \le N \tag{16}$$

In the following, we present an algorithm Promote that determines whether a set $P$ exists such that (13)–(16) are satisfied, and produces such a set $P$ if it exists. From the above discussion, we assume non-overloaded conditions (7), (8), and (11), (12) are true but (9), (10) are not.

Algorithm Promote:

Step 1) For each $i$ and $j$, $1 \le i, j \le N$, compute the vacancy degrees $\rho_i = d_1 - \sum_{j} a_{ij}$ and $\gamma_j = d_1 - \sum_{i} a_{ij}$ for set $A$, which are the negation of the excess degree. Since conditions (7) and (8) are true, $\rho_i \ge 0$ and $\gamma_j \ge 0$.

Step 2) For each $i$ and $j$, $1 \le i, j \le N$, compute the excess degrees $\alpha_i = \sum_{j} b_{ij} - (d_2 - d_1)$ and $\beta_j = \sum_{i} b_{ij} - (d_2 - d_1)$ for set $B$. Since (9) and (10) are not true, some $\alpha_i$ and $\beta_j$ are positive. Note that $\alpha_i \le \rho_i$ and $\beta_j \le \gamma_j$ by conditions (11), (12). If $S$ is schedulable, then a set $P$ exists such that $\alpha_i \le \sum_{j} p_{ij} \le \rho_i$ and $\beta_j \le \sum_{i} p_{ij} \le \gamma_j$ for all $i$, $j$, $1 \le i, j \le N$.

Step 3) Construct a 4-stage flow network $G(s, t, U, V, E)$. The first stage contains a single source node $s$. The fourth stage contains a single sink node $t$. The second stage contains set $U$ of $N$ nodes, $U = \{u_1, \ldots, u_N\}$, where $u_i$ represents input port $i$. The third stage contains set $V$ of $N$ nodes, $V = \{v_1, \ldots, v_N\}$, where $v_j$ represents output port $j$. Each edge in $E$ is associated with two numbers, called lower and upper bounds. Specifically, the set $E$ consists of the following edges:

(i) There is an edge between source $s$ and $u_i$, $1 \le i \le N$, which has a lower bound $\max(\alpha_i, 0)$ and an upper bound $\rho_i$. The two bounds mean that, if $\alpha_i > 0$, we must promote at least $\alpha_i$ packets from $B$ whose input port is $i$, but not exceeding the value $\rho_i$ allowed by set $A$.

(ii) There is an edge from $u_i$ to $v_j$ with a lower bound 0 and an upper bound $b_{ij}$, meaning that there are $b_{ij}$ packets of $B$ from input $i$ to output $j$ for possible selection by set $P$.

(iii) There is an edge from $v_j$ to $t$ with lower bound $\max(\beta_j, 0)$ and upper bound $\gamma_j$. The meaning is, if $\beta_j > 0$, we must promote at least $\beta_j$ but not exceeding $\gamma_j$ packets of class $d_2$ that are destined to output $j$.

As an example, we consider a 4 × 4 switch. The class with deadline $d_1$ and the class with deadline $d_2 = 4$ are represented by the following two matrices $A$ and $B$ respectively, where the entry $(i, j)$ is the number of packets to be switched from input $i$ to output $j$. […] The corresponding flow network is shown in Fig. 3.

Fig. 3. Flow network G(s, t, U, V, E) constructed from matrices A and B.

Step 4) Call the algorithm Max_Lower_Bound presented in the Appendix to find a flow in the network $G(s, t, U, V, E)$.

Step 5) Compute the set $P$, which takes $f(u_i, v_j)$ packets of $B$ from input $i$ to output $j$, where $f(u_i, v_j)$ is the flow on edge $(u_i, v_j)$.

Step 6) If the flow is a legal flow, set $S$ is schedulable, and $P$ is output. Otherwise, $S$ is not schedulable.

Remark 1: There is an existing algorithm [16], [19] to compute a legal flow for a flow network if such a flow exists. However, the existing algorithm does not provide a solution if no legal flow exists. In this case, our algorithm Max_Lower_Bound in the Appendix will provide the set $P$ such that $A \cup P$ is schedulable and also guarantees that $B - P$ has the minimum total excess degree, although $B - P$ may not be schedulable in the following $d_2 - d_1$ slots.

Fig. 4. A legal flow in G(s, t, U, V, E) of Fig. 3.

Fig. 4 shows a legal flow in the $G(s, t, U, V, E)$ of Fig. 3, which means that class $d_2$ packets of (1, 1, 4), (3, 3, 4), and (4, 3, 4) constitute the set $P$. If we promote these packets to class $d_1$, both set $A \cup P$ and set $B - P$, as represented by the following two matrices, are schedulable, and thus $S$ is schedulable. […]
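The promotion test can be checked mechanically once a candidate set is given. The Python sketch below assumes that conditions (13)–(16) are the Theorem 2 row and column bounds applied to the promoted early set and the remaining late set, and it only verifies a proposed promotion; it does not search for one, which is the job of the flow computation in Algorithm Promote. The function names, the count matrices `A`, `B`, `P`, and the small 3 × 3 instance mentioned afterwards are hypothetical and are not the example of Figs. 3 and 4.

```python
def line_sums_within(matrix, limit):
    """True if every row sum and every column sum is at most `limit`."""
    n = len(matrix)
    rows_ok = all(sum(row) <= limit for row in matrix)
    cols_ok = all(sum(matrix[i][j] for i in range(n)) <= limit for j in range(n))
    return rows_ok and cols_ok


def promotion_is_feasible(A, B, P, d1, d2):
    """Check a proposed promotion of late-class packets to the early class.

    A[i][j], B[i][j]: packets from input i to output j with deadline d1, d2.
    P[i][j]: how many of the B[i][j] packets are promoted (assumed notation).
    The early set A + P must fit in d1 slots and the late set B - P must
    fit in the remaining d2 - d1 slots.
    """
    n = len(A)
    # A promotion can only take packets that actually exist in B.
    if any(P[i][j] < 0 or P[i][j] > B[i][j] for i in range(n) for j in range(n)):
        return False
    early = [[A[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    late = [[B[i][j] - P[i][j] for j in range(n)] for i in range(n)]
    return line_sums_within(early, d1) and line_sums_within(late, d2 - d1)
```

With `d1 = 2`, `d2 = 4`, `A` equal to the 3 × 3 identity matrix and `B = [[3, 0, 0], [0, 1, 1], [0, 1, 1]]`, promoting nothing fails because input 0 holds three late packets for two late slots, while promoting a single packet from input 0 to output 0 satisfies all four bounds.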

The time complexity of the Promote algorithm is determined by Step 4, which is […] (see the Appendix).

Theorem 3: Given a set $S$ of packets with two deadlines, its schedulability can be determined in time […].

Proof: This follows from the above discussion.

Theorem 3 shows that although the schedulability problem is NP-complete when $|D| \ge 3$, it is polynomially solvable if $|D| = 2$. In other words, we can achieve a 100% success rate if only two classes are present in $S$. When the set $S$ with $|D| = 2$ is not schedulable, we consider the optimization problem, which is concerned with how to drop a minimum number of class $d_2$ packets such that the remaining set of $S$ is schedulable. We assume that non-overloaded conditions (7), (8) and (11), (12) are true, but $P$ may not exist. We propose the Two_Class_Scheduling algorithm as follows, which solves both the schedulability and the optimization problem.



Algorithm Two_Class_Scheduling:

Step 1) Call algorithm Promote, which uses the algorithm Max_Lower_Bound in the Appendix and computes the set $P$ from the flow on each edge $(u_i, v_j)$.

Step 2) If Step 1 outputs “legal flow found”, then promote the packets in $P$ to class $d_1$ and go to Step 5. Otherwise, promote the packets in $P$ to class $d_1$ ($A \cup P$ is schedulable).

Step 3) Let $A'$ be the packets of $A \cup P$ and $B'$ be the packets of $B - P$. Compute the excess degrees $\alpha_i$ and $\beta_j$ for the set $B'$, where $1 \le i, j \le N$. The Max_Lower_Bound algorithm guarantees that the total excess degree of $B'$ is minimum.

Step 4) In set $B'$, if $\alpha_i > 0$, drop $\alpha_i$ packets from input port $i$; if $\beta_j > 0$, drop $\beta_j$ packets that are destined to output $j$. The total number of dropped packets is the total excess degree of $B'$. Note that no packet from input $i$ to output $j$ exists in set $B'$ such that both $\alpha_i > 0$ and $\beta_j > 0$. This is because, otherwise by conditions (11), (12), this packet can also be promoted to set $A'$, contradicting the optimality of $P$.

Step 5) Schedule the packets in $A'$ and the remaining packets in $B'$ respectively, exit.

As described before, since Step 1 of algorithm Two_Class_Scheduling has a time complexity of […], and Step 5 has a time complexity of […], the overall complexity is […].

Theorem 4: The algorithm Two_Class_Scheduling drops a minimal number of class $d_2$ packets such that the remaining set of $S$ is schedulable.

Proof: If Step 1 produces a legal flow, the theorem is obviously correct. In the following, we assume a legal flow does not exist. For the sake of contradiction, suppose an optimal solution drops fewer packets than Two_Class_Scheduling, and all remaining packets are schedulable. Let $P^{*}$ be the set of class $d_2$ packets scheduled in the first $d_1$ slots by the optimal solution […]. Since $A \cup P^{*}$ is schedulable in the first $d_1$ slots, […]. Computing the excess degree for set $B - P^{*}$, we get […]. Since Max_Lower_Bound guarantees that $B - P$ has the minimum total excess degree, we have […]. Since the optimal solution drops fewer packets than Two_Class_Scheduling, it must drop at least one packet of $B - P^{*}$ for which both the input excess degree and the output excess degree are positive, so that the total excess degree can be reduced by two. Let $(i, j, d_2)$ be such a packet. We compute the vacancy degree for the set $A \cup P^{*}$, particularly for port $i$ and port $j$ […]. Because $A \cup P^{*}$ is schedulable in the first $d_1$ slots, both vacancy degrees are nonnegative. We argue that they cannot both be positive, for otherwise the packet can be promoted to $A \cup P^{*}$, which is still schedulable in the first $d_1$ slots, contradicting the optimality. So, without loss of generality, we assume that the vacancy degree of port $i$ is zero. Then we have […], which contradicts the condition (11).

Remark 2: The optimality of algorithm Two_Class_Scheduling is built upon the assumption that the packets in class $d_1$ have higher priority than those in class $d_2$. That is, class $d_1$ packets are maximally scheduled first when we pursue the maximum total number of guaranteed packets. If we assume both class $d_1$ and class $d_2$ to have the same priority, then the algorithm may produce a suboptimal solution because this model allows dropping class $d_1$ packets to get more class $d_2$ packets transmitted. We will discuss this model in the future.

In more general cases, non-overloaded conditions (7), (8) are satisfied, but conditions (11), (12) may not be. We propose the following generalized G-Promote algorithm to promote a set $P$ of class $d_2$ packets to set $A$ such that $A \cup P$ is schedulable in the first $d_1$ slots and the total excess degree for the set $B - P$ is maximally reduced.

Algorithm G-Promote:

Step 1) Let […]. Note that set $A$ presumably satisfies the conditions (7) and (8). Compute the excess degrees $\alpha_i$, $\beta_j$ of set $B$ and the vacancy degrees $\rho_i$, $\gamma_j$ of set $A$ as in Steps 1 and 2 of the Promote algorithm. Because conditions (11), (12) are not true, it is possible that $\alpha_i > \rho_i$ and/or $\beta_j > \gamma_j$.

Step 2) Construct a network graph $G(s, t, U, V, E)$ as in Step 3 of the Promote algorithm except that for any $i$, if $\alpha_i > \rho_i$, set the lower bound of edge $(s, u_i)$ to be $\rho_i$. This change is based on the fact that at most $\rho_i$ class $d_2$ packets from input $i$ can be promoted to set $A$. Similarly, for any edge $(v_j, t)$, if $\beta_j > \gamma_j$, set the lower bound of $(v_j, t)$ to be $\gamma_j$, meaning at most $\gamma_j$ class $d_2$ packets to output $j$ can be promoted to set $A$.

Step 3) Call the algorithm Max_Lower_Bound to find a flow in the network $G(s, t, U, V, E)$.

Step 4) Compute the set $P$ from the flow on each edge $(u_i, v_j)$.

Step 5) Promote the set $P$ from set $B$ to set $A$. Note that the Max_Lower_Bound algorithm guarantees that promoting the packets of $P$ will maximally reduce the total excess degree of set $B$.

Obviously, G-Promote has a time complexity of […]. After calling G-Promote, if the remaining set is non-overloaded, then the set $S$ is schedulable. Otherwise, we need to consider the optimization problem.
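The dropping step that this optimization problem relies on, Procedure One_Class_Dropping of Section III, can be sketched in Python as follows. The sketch computes the maximum flow with a plain Edmonds–Karp search rather than the push-relabel method cited in the text, which changes the running time but not the result. The function names, the count matrix `S` (where `S[i][j]` is the number of packets from input `i` to output `j`), and the test instances are assumptions of this example.

```python
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp on a dense capacity matrix; returns the net flow matrix."""
    size = len(cap)
    flow = [[0] * size for _ in range(size)]
    while True:
        parent = [-1] * size
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:          # BFS in the residual graph
            u = queue.popleft()
            for v in range(size):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:                       # no augmenting path left
            return flow
        push, v = float("inf"), t
        while v != s:                             # bottleneck on the path
            push = min(push, cap[parent[v]][v] - flow[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                             # augment along the path
            u = parent[v]
            flow[u][v] += push
            flow[v][u] -= push
            v = u


def one_class_dropping(S, d):
    """Drop as few packets as possible so all row/column sums are <= d.

    Returns (dropped, remaining) as count matrices.
    """
    n = len(S)
    row_excess = [sum(S[i]) - d for i in range(n)]
    col_excess = [sum(S[i][j] for i in range(n)) - d for j in range(n)]
    s, t = 0, 2 * n + 1                           # inputs 1..n, outputs n+1..2n
    cap = [[0] * (2 * n + 2) for _ in range(2 * n + 2)]
    for i in range(n):
        if row_excess[i] > 0:
            cap[s][1 + i] = row_excess[i]
            for j in range(n):
                if col_excess[j] > 0:
                    cap[1 + i][1 + n + j] = S[i][j]
    for j in range(n):
        if col_excess[j] > 0:
            cap[1 + n + j][t] = col_excess[j]
    flow = max_flow(cap, s, t)
    # Step 3: each unit of flow on (u_i, v_j) drops a packet that lowers
    # the total excess degree by two.
    dropped = [[flow[1 + i][1 + n + j] for j in range(n)] for i in range(n)]
    remaining = [[S[i][j] - dropped[i][j] for j in range(n)] for i in range(n)]
    # Step 5: remove leftover excess row by row, then column by column.
    for i in range(n):
        excess = sum(remaining[i]) - d
        for j in range(n):
            take = min(max(excess, 0), remaining[i][j])
            remaining[i][j] -= take
            dropped[i][j] += take
            excess -= take
    for j in range(n):
        excess = sum(remaining[i][j] for i in range(n)) - d
        for i in range(n):
            take = min(max(excess, 0), remaining[i][j])
            remaining[i][j] -= take
            dropped[i][j] += take
            excess -= take
    return dropped, remaining
```

For `S = [[3, 1, 0], [1, 0, 0], [0, 0, 1]]` and `d = 2`, input 0 and output 0 each exceed the limit by two, and the flow drops two packets from input 0 to output 0, clearing both excesses at once. The returned `remaining` matrix satisfies the one-class bounds (5) and (6) and can then be scheduled slot by slot.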


The algorithm for the two-class optimization problem is a special case of the algorithm for the multi-class case that we will present in Section V. We omit this special case here.

V. AN EFFICIENT ALGORITHM FOR MULTIPLE CLASSES

In this section, we consider the scheduling problem with three or more service classes. Since this problem is NP-complete, an approximation algorithm has to be used. We propose a new and efficient algorithm named Flow-based Iterative Packet Scheduling (FIPS) for this problem. FIPS utilizes our results for two service classes and works for non-overloaded as well as overloaded cases.

As indicated in Section II, we consider the problem in the context of frame-based scheduling. FIPS works at the core part that schedules a given set of packets with differentiated deadline requirements in a frame. Let the frame length be equal to . Let be the set of packets in the buffer to be scheduled for the current frame, , where . Let , and . Without loss of generality, assume , where is used as a deadline also. This corresponds to a general scenario where some of the buffered packets have deadlines less than , and the others have deadlines greater than . Our main idea is that when class with deadline is to be scheduled, all classes with larger deadlines are treated as a second class collectively. The scheduling begins with class one (of deadline ), then class two (of deadline ), and so on.

Algorithm Flow-Based Iterative Packet Scheduling:
begin
    Drop-set is the set of dropped packets. It is empty initially.
    for each class with a deadline no greater than the frame length, in increasing order of deadline, do
    begin
        call algorithm One_Class_Dropping on the current class within its scheduling frame;
        add the dropped packets from the current class to Drop-set and remove them from the class;
        treat every packet in the classes with larger deadlines as having the same deadline;
        call algorithm G-Promote with the following implementation detail: If there are packets to be promoted, select them in increasing order of deadlines, i.e., first select packets of the next class, then packets of the class after it, and so on, until their sum is equal to the number of packets to be promoted;
        add the selected packets from the later classes to the current class;
        schedule the current class using the Birkhoff–von Neumann algorithm;
        for each later class do
        begin
            compute the remaining set of the class by removing its selected packets;
        end;
    end;
    Reduce the deadline of each packet in the remaining set by the frame length and schedule them in the next frame.
end

Obviously, if Flow-based Iterative Packet Scheduling generates an empty Drop-set, is schedulable. Note that the Drop-set does not contain packets with deadline . It is straightforward to see that the time complexity of the FIPS algorithm is , where is the number of classes with deadlines less than or equal to frame length . On average, the complexity for each class thus remains .

VI. SIMULATION RESULTS

As FIPS is employed with a frame-based model, it takes advantage of the temporal correlation of arriving packets. Moreover, it aims to provide differential QoS guarantees. We have done extensive simulations and compared the performance of FIPS with that of a number of widely employed heuristic algorithms for the cases of two classes and three classes. The case of three classes serves as a representative of multiple service classes.
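FIPS schedules each class with the Birkhoff–von Neumann algorithm. The following is a minimal sketch of that step, not the authors' implementation: it assumes the demand of one class is given as a hypothetical N × N matrix `demand[i][j]` (packets from input i to output j) and that no row or column sum exceeds the number of slots `d`. Padding with dummy packets and the simple augmenting-path matching are choices made here for brevity.

```python
def pad_to_regular(demand, d):
    """Add dummy packets until every row and column sums to exactly d."""
    n = len(demand)
    m = [row[:] for row in demand]
    row = [sum(r) for r in m]
    col = [sum(m[i][j] for i in range(n)) for j in range(n)]
    if max(row + col) > d:
        raise ValueError("overloaded: some input or output has more than d packets")
    i = j = 0
    while i < n and j < n:
        if row[i] == d:
            i += 1
        elif col[j] == d:
            j += 1
        else:
            add = min(d - row[i], d - col[j])
            m[i][j] += add
            row[i] += add
            col[j] += add
    return m


def perfect_matching(m):
    """Return perm with m[i][perm[i]] > 0 for all i (augmenting-path matching)."""
    n = len(m)
    match_col = [-1] * n  # match_col[j] = input currently matched to output j

    def try_row(i, seen):
        for j in range(n):
            if m[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            raise RuntimeError("no perfect matching; matrix is not regular")
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm


def bvn_schedule(demand, d):
    """Return d slots; each slot is a contention-free list of (input, output)."""
    m = pad_to_regular(demand, d)
    real = [row[:] for row in demand]  # real (non-dummy) packets still waiting
    slots = []
    for _ in range(d):
        perm = perfect_matching(m)  # exists because m stays regular
        slot = []
        for i, j in enumerate(perm):
            m[i][j] -= 1
            if real[i][j] > 0:
                real[i][j] -= 1
                slot.append((i, j))
        slots.append(slot)
    return slots


if __name__ == "__main__":
    for k, slot in enumerate(bvn_schedule([[2, 1, 0], [0, 1, 1], [1, 0, 2]], 3)):
        print("slot", k, slot)
```

Each extracted matching uses every input and every output at most once, so the packets of one slot can cross the switch fabric together; after d matchings the padded matrix is empty and every real packet has been sent.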
For the schedulability problem, the case of two classes is less interesting because our algorithm has a 100% success rate for all schedulable sets. For the case of three or more service classes, the problem is NP-hard, and previous researchers could only test schedulability on small switches, such as a 4 × 4 switch [20], because they had to run an exponential-time exhaustive search to determine the schedulability. In our simulations, we introduce an efficient way to construct a schedulable packet set for large switches. The trick is that we generate a schedule first, then construct the sets backward from the schedule. Suppose we want to generate a sample set for an switch with , . For each time slot, we randomly generate a permutation on , and the corresponding packets without specifying the deadlines. Obviously, these packets can be scheduled in the same slot without contention. Do this for every slot, from slot 0 to slot . Now, we assign deadlines to these packets: first randomly select some packets from slots and assign deadline to them; then randomly select some packets in the remaining set within slots and assign deadline to them. Continue in this way until deadline



Fig. 5. Success rate as a function of switch size for the case of |D| = 3.

Fig. 6. Success rate as a function of line load for a 16 × 16 switch.

has been assigned. It is not necessary to include all packets in the sample set, i.e., some packets may not be included in any of the classes and are thus discarded from this sample set.

Our simulations compared FIPS with five other heuristic algorithms in terms of success rate and packet dropping ratio. These five algorithms are Earliest Deadline First Row-by-Row (EDF-RR) [20], Earliest Deadline First using Systems of Distinct Representatives (EDF-SDR) [20], Minimum Laxity First Row-by-Row (MLF-RR) [20], Minimum Laxity First using Systems of Distinct Representatives (MLF-SDR) [20], and Dynamic Minimum Laxity First (MLFDYN) [17]. For brevity, we do not describe these algorithms here; readers are referred to [17], [20] for detailed descriptions. We first test their success rate with schedulable sets constructed as above. The simulation result is shown in Fig. 5, where the frame size is 15 time slots. It is clear that FIPS has a higher success rate than all five other algorithms for a wide range of switch sizes.

We also investigated how the line load may impact the performance of the algorithms in terms of success rate. Simulations were done under various line loads. We used the same definition of line load (line utilization) given in [17]. A representative result for a 16 × 16 switch with a line load ranging from 0.5 to 1 is displayed in Fig. 6, where the frame size is also 15 time slots. Although all algorithms degrade as the line load increases, FIPS clearly has the best performance among all compared algorithms in terms of success rate.

Another major metric is the packet dropping ratio. We investigated how the five algorithms and FIPS perform in terms of their packet dropping ratios for more general cases, where the packet set may not be schedulable. In these simulations, the bursty on-off stochastic model is adopted to generate the traffic

Fig. 7. Packet dropping ratio as a function of switch size for the case of |D| = 2.

Fig. 8. Packet dropping ratio as a function of switch size for the case of |D| = 3.

matrix. All flow matrices to the inputs were generated randomly and independently. The destinations of packets are uniformly distributed. Fig. 7 shows the packet dropping ratio for the case of two service classes with an average line load of 0.9 as a representative. (A frame size of 10 time slots is used in this simulation.) It is seen that our algorithm has the lowest packet dropping ratio compared to EDF-RR, EDF-SDR, MLF-RR, MLF-SDR, and MLFDYN, for a wide range of switch sizes. For the case of three service classes, Fig. 8 shows the packet dropping ratio at the average line load of 0.9. (The frame size used is 15 time slots.) Again, it is evident that our algorithm performs better than all other algorithms for various switch sizes. In the simulations, packets missing their deadlines are dropped. We define throughput to be the ratio of packets scheduled no later than their deadlines to the total arriving packets. Since a lower packet dropping ratio implies a higher throughput, it is seen that FIPS has a higher throughput than the other algorithms in both cases.

By simulations, we also investigated how the frame size would affect the performance. Fig. 9 shows the results for a 32 × 32 switch. As we expected, the larger the frame size, the better the performance. This is because, when the frame is larger, it likely contains more classes, and the packets in the later classes have more chances to be promoted to the classes with earlier deadlines. However, in practice, the frame size is limited by the hardware resources and other network conditions such as the line speed, memory capacity, algorithm complexity, end-to-end delay, etc. As an open problem, we conjecture that the FIPS algorithm can guarantee at least 50% throughput of the offered load in the worst case.
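The construction of schedulable sample sets described earlier in this section (generate a schedule first, then derive the packet set from it) can be sketched as follows. This is an illustration under stated assumptions rather than the simulator used in the paper: the function name, the `pick_prob` parameter, and the choice that a packet receiving deadline d is drawn from slots 0 to d − 1 are hypothetical readings of the description.

```python
import random


def make_schedulable_set(n, deadlines, pick_prob=0.5, seed=0):
    """Build a packet set for an n x n switch that is schedulable by construction.

    Returns a list of dicts with keys input, output, deadline, slot, where
    'slot' is the witness slot of the generating schedule.
    """
    rng = random.Random(seed)
    deadlines = sorted(deadlines)
    horizon = deadlines[-1]

    # Step 1: one random permutation per slot gives n contention-free packets.
    unassigned = []
    for slot in range(horizon):
        perm = list(range(n))
        rng.shuffle(perm)
        unassigned.extend((slot, i, perm[i]) for i in range(n))

    # Step 2: assign deadlines class by class, earliest deadline first.
    # Assumption: a packet may get deadline d only if its witness slot is < d.
    packets = []
    for d in deadlines:
        remaining = []
        for slot, i, j in unassigned:
            if slot < d and rng.random() < pick_prob:
                packets.append({"input": i, "output": j, "deadline": d, "slot": slot})
            else:
                remaining.append((slot, i, j))
        unassigned = remaining

    # Packets never selected are simply left out of the sample set.
    return packets


if __name__ == "__main__":
    sample = make_schedulable_set(n=4, deadlines=[5, 10, 15])
    print(len(sample), "packets, e.g.", sample[:3])
```

Because every packet keeps the slot it came from, the generating schedule itself is a witness that all deadlines can be met, so no exhaustive search is needed to certify schedulability.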



Fig. 9. Throughput as a function of frame size for various line loads.

VII. CONCLUSION

In this paper, we have proposed efficient algorithms for the deadline guaranteed scheduling problems for IQ switches and shown how to appropriately apply them to frame-based scheduling. For the case of a single service class, we have presented an algorithm that can always guarantee the maximum number of packets scheduled before their deadlines, whether or not the switch is overloaded. For the case of two service classes, we have presented an algorithm that can also guarantee the maximum number of packets to be scheduled before their deadlines if the switch is not overloaded. When the switch is overloaded or multiple classes are present, we have presented an efficient algorithm, FIPS. Simulation results show that the proposed FIPS algorithm has a much higher success rate, a much lower packet dropping ratio, and a higher throughput than the EDF-RR, EDF-SDR, MLF-RR, MLF-SDR, and MLFDYN algorithms under a wide range of switch sizes and line loads in cases where they cannot guarantee all deadlines.

APPENDIX

This Appendix presents an efficient algorithm to find an integral legal flow for a network constructed by algorithm Promote in Section IV, if such a legal flow exists. If no legal flow exists, this algorithm will produce an integral flow that maximally achieves the lower bound requirements under the allowed upper bounds. First, let us define the problem for general flow networks, not restricted to the network of Section IV. We assume that the reader is familiar with the regular maximum network flow problem with no lower bounds [16], [19].

A flow network with lower bounds and upper bounds is a directed graph $G(V, E, B, C, s, t)$ where every edge $(u, v) \in E$ is associated with two nonnegative numbers $b(u, v)$ and $c(u, v)$, $b(u, v) \le c(u, v)$, called the lower bound and upper bound of $(u, v)$, respectively. We assume if and only if edge . Given such a flow network, we wish to find a legal flow from a source vertex $s$ to a sink vertex $t$, where $s$ and $t$ are two specified vertices.

A legal flow in $G$ is a real-valued function $f$ on $E$, satisfying

(i) $b(u, v) \le f(u, v) \le c(u, v)$ for all $(u, v) \in E$;
(ii) $\sum_{u} f(u, v) = \sum_{w} f(v, w)$ for all $v \in V - \{s, t\}$.

Fig. 10. Illustration of transforming a network with lower bounds to a network without lower bounds. (a) A network with lower and upper bounds. (b) The transformed network without lower bounds.

Given a flow network, we wish to find a legal flow if one exists.

An Existing Algorithm for Finding Legal Flows: The existing algorithm [16], [19] for finding a legal flow transforms the network $G(V, E, B, C, s, t)$ into a regular flow network $G'(V', E', C', s', t')$ without lower bounds as follows.

Step 1) Vertex set $V'$ is obtained from set $V$ by adding two vertices, $s'$ and $t'$. That is, $V' = V \cup \{s', t'\}$.
Step 2) For every $v \in V$, construct an edge $(s', v)$ with an upper bound $c'(s', v) = \sum_{u} b(u, v)$.
Step 3) For every $v \in V$, construct an edge $(v, t')$ with an upper bound $c'(v, t') = \sum_{w} b(v, w)$.
Step 4) Every edge $(u, v) \in E$ is also included in $E'$ with an upper bound $c'(u, v) = c(u, v) - b(u, v)$.
Step 5) Construct edges $(s, t)$ and $(t, s)$ with a very high upper bound $\infty$.

Fig. 10 illustrates this transformation. Obviously, this transformation needs at most $O(|V| + |E|)$ time. It has been proven that $G$ has a legal flow if and only if $G'$ has a maximum flow $f'$ from $s'$ to $t'$ that saturates edge $(s', v)$ and edge $(v, t')$ for every vertex $v$. If this is the case, the legal flow on edge $(u, v)$ can be obtained by adding the lower bound $b(u, v)$ to the flow $f'(u, v)$ in $G'$. That is, $f(u, v) = f'(u, v) + b(u, v)$. The correctness proof of this algorithm can be found in [16], [19]. The network in Fig. 10(a) has a legal flow. Fig. 11 shows the flow in the transformed network and its corresponding legal flow in the original network. Note that the pair of numbers on each



Fig. 11. A legal flow exists in $G(V, E, B, C, s, t)$ if the maximum flow in $G'(V', E', C', s', t')$ saturates every edge $(s', v)$ and $(v, t')$. (a) A maximum flow in the transformed network. (b) A legal flow in the original network.

edge in Fig. 11(a) are the upper bound and the assigned flow for this edge, while the pair of numbers on each edge in Fig. 11(b) are the lower and upper bounds for this edge. The existing algorithm, however, does not solve the corresponding optimization problem. To show this, let us define the problem precisely.

Definition A-1: Given a network $G(V, E, B, C, s, t)$ with lower bounds and upper bounds, a flow $f$ is called a partial-legal flow if it only satisfies the upper bound requirement and the flow balance requirement:

(iii) $0 \le f(u, v) \le c(u, v)$ for all $(u, v) \in E$;
(iv) $\sum_{u} f(u, v) = \sum_{w} f(v, w)$ for all $v \in V - \{s, t\}$.

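Definition A-2: Given a partial-legal flow $f$ in $G$, the lower bound achievement of edge $(u, v)$ is defined to be the flow $f(u, v)$ if $f(u, v) < b(u, v)$, and $b(u, v)$ otherwise. That is, it equals $\min\{f(u, v), b(u, v)\}$.

Definition A-3: Given a partial-legal flow $f$ in $G$, the total lower bound achievement is the sum of the achievements of all edges in $E$. That is, it equals $\sum_{(u, v) \in E} \min\{f(u, v), b(u, v)\}$. Obviously, a partial-legal flow is also a legal flow if and only if its total lower bound achievement equals $\sum_{(u, v) \in E} b(u, v)$, which is called the total lower bound requirement.

Definition A-4: Given a flow network $G(V, E, B, C, s, t)$, the lower bound optimization problem is to find a partial-legal flow $f$ such that the total lower bound achievement is maximized.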
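Definitions A-1 to A-3 translate directly into a few lines of code. The sketch below is illustrative only; the dictionary layout (`edges[(u, v)] = (lower, upper)` and `flow[(u, v)] = value`) and the function names are assumptions made here, not notation from the paper.

```python
def is_partial_legal(edges, flow, s, t):
    """Upper bound requirement and flow balance at every vertex except s and t."""
    if any(not (0 <= flow[e] <= upper) for e, (_, upper) in edges.items()):
        return False
    balance = {}
    for (u, v), f in flow.items():
        balance[u] = balance.get(u, 0) - f  # flow leaving u
        balance[v] = balance.get(v, 0) + f  # flow entering v
    return all(x == 0 for node, x in balance.items() if node not in (s, t))


def total_achievement(edges, flow):
    """Sum over edges of min(flow, lower bound)."""
    return sum(min(flow[e], lower) for e, (lower, _) in edges.items())


def total_requirement(edges):
    """Sum of all lower bounds."""
    return sum(lower for lower, _ in edges.values())


def is_legal(edges, flow, s, t):
    """A partial-legal flow is legal exactly when achievement meets requirement."""
    return (is_partial_legal(edges, flow, s, t)
            and total_achievement(edges, flow) == total_requirement(edges))


if __name__ == "__main__":
    edges = {("s", "a"): (1, 3), ("a", "b"): (2, 2), ("b", "t"): (0, 3)}
    one = {e: 1 for e in edges}  # partial-legal, but misses the bound on (a, b)
    two = {e: 2 for e in edges}  # meets every lower bound
    print(total_achievement(edges, one), is_legal(edges, one, "s", "t"))
    print(total_achievement(edges, two), is_legal(edges, two, "s", "t"))
```

In this small example the unit flow achieves 2 of the required 3 and is therefore only partial-legal, while the flow of value 2 achieves the full requirement.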

Fig. 12. A counterexample showing that the existing algorithm does not solve the optimization problem. (a) An illegal flow $f$ obtained from flow $f'$. (b) A maximum flow $f'$ in the transformed network.
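The existing legal-flow algorithm can be sketched as below. The capacities follow the standard textbook construction (residual capacity c − b on original edges, auxiliary source and sink edges carrying the lower bounds, and a high-capacity pair of edges between t and s); the vertex labels `"s'"` and `"t'"`, the Edmonds–Karp routine, and the assumption that the input has no antiparallel edges and no direct edge between s and t are choices made here for illustration.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp on {(u, v): capacity}; returns (value, {(u, v): flow})."""
    res = defaultdict(lambda: defaultdict(int))  # residual capacities
    for (u, v), c in cap.items():
        res[u][v] += c
        res[v][u] += 0
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # breadth-first search for a path
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push
    return value, {(u, v): c - res[u][v] for (u, v), c in cap.items()}


def find_legal_flow(edges, s, t):
    """edges: {(u, v): (lower, upper)}. Return a legal flow dict, or None."""
    big = sum(upper for _, upper in edges.values()) + 1  # stands in for infinity
    cap = {}
    for (u, v), (lower, upper) in edges.items():
        cap[(u, v)] = upper - lower
        cap[("s'", v)] = cap.get(("s'", v), 0) + lower
        cap[(u, "t'")] = cap.get((u, "t'"), 0) + lower
    cap[(s, t)] = big
    cap[(t, s)] = big
    need = sum(lower for lower, _ in edges.values())
    value, f1 = max_flow(cap, "s'", "t'")
    if value < need:  # some auxiliary edge is not saturated: no legal flow
        return None
    return {e: f1[e] + edges[e][0] for e in edges}


if __name__ == "__main__":
    print(find_legal_flow({("s", "a"): (1, 3), ("a", "b"): (2, 2), ("b", "t"): (0, 3)}, "s", "t"))
    print(find_legal_flow({("s", "a"): (0, 1), ("a", "t"): (2, 3)}, "s", "t"))
```

When the maximum flow falls short of the total lower bound requirement, this routine only reports failure; it does not say how much of the requirement could still be met, which is the gap the lower bound optimization problem addresses.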

The existing algorithm does not solve the lower bound optimization problem when the maximum flow in the transformed network $G'$ does not saturate every edge $(s', v)$ and $(v, t')$. In this case, using the relation $f(u, v) = f'(u, v) + b(u, v)$ may produce an illegal flow for the original network. Fig. 12 shows such a counterexample, where some vertices, such as vertex , are out of balance between the incoming flow and the outgoing flow, violating condition (iv).

A General Algorithm for the Lower Bound Optimization Problem: We propose a general algorithm that transforms the optimization problem to a minimum cost maximum flow problem. Specifically, given a flow network with lower bounds, we construct a flow network without lower bounds as follows. For each edge $(u, v)$, if $b(u, v) > 0$, we construct three edges in . The first one is edge with an upper bound and a zero cost rate. The second edge is , where is a newly added vertex. This edge has an upper bound and a cost rate . The third edge is , whose upper bound is and whose cost rate is . If $b(u, v) = 0$, we just include it in the set with the same upper bound and a zero cost rate. Hence, the vertex set consists of plus the added vertices, . Fig. 13 shows how to transform the network of Fig. 12(a) to the network , where the zero cost rate is omitted for clarity. After obtaining , we can compute a minimum cost maximum flow from to in


using an existing algorithm in [16]. From the construction of , we observe that, given a minimum cost flow in , we can obtain a partial-legal flow in network using the relation: if , and otherwise. This is because: (1) ; (2) since , the total incoming flow to (or outgoing flow from) any vertex in is equal to the total incoming flow to (or outgoing flow from) the vertex in . Moreover, we observe that, in the flow , , unless and are saturated, for otherwise, we can further increase and but reduce by the same amount to achieve an even smaller cost. Therefore, the lower bound achievement is exactly equal to the value of if , or zero otherwise. Therefore, the total lower bound achievement in is

Because has the minimum cost, is the largest lower bound achievement.

Fig. 13. Transforming a lower bound optimization problem to a minimum cost maximum flow problem. (a) A flow network with lower bounds. (b) The transformed network with a cost rate on each edge.

The above general algorithm works for any flow network. However, the min-cost maximum flow algorithm is quite complicated and has a high time complexity. It recursively calls the maximum flow algorithm times, resulting in time complexity [16]. In the following, we introduce a more efficient algorithm for the network specifically constructed in Section IV.

An Efficient Algorithm for the Lower Bound Optimization Problem for Network : This efficient algorithm, presented below, works only for the networks constructed in Section IV.

Algorithm Max_Lower_Bound

Step 1) Transform the network to a flow network , using the existing algorithm as we discussed above, except that the edges , , are removed. We denote the capacity on edge by . As an example, Fig. 14(a) shows the transformed network from the network of Fig. 13(a).

Step 2) Construct another flow network from by reversing the direction of every edge. Fig. 14(b) illustrates the flow network constructed from the network of Fig. 14(a). For simplicity, we keep the same label for the pair of corresponding vertices in these two networks.

Step 3) Introduce two new nodes, and . Construct four edges , , , and , each with an upper bound of . Also, connect the node in to node in by a pair of oppositely directed edges with an upper bound of each, and connect the node in to node in by a pair of oppositely directed edges with an upper bound of each. Fig. 15 shows the constructed flow network from networks and .

Step 4) Compute a maximum integral flow, , from to in the network .

Step 5) Obtain the flow, , for , from the flow, , in (only the part in is needed) using the following equations: for any vertex , for any vertex , for any edge , where , .

Step 6) If for any vertex , and for any vertex , then return "Legal flow found" and flow . Otherwise, return flow .

The correctness of this algorithm can be proved as follows. First, we can assume and . If not, suppose , then since and are symmetrical, we can replace the flow in with the mirror image of the flow in . By doing so, we can increase the flow of and therefore the total flow , contradicting the assumption that is maximum. Further, we can assume that the flow on any edge in is equal to the flow on the corresponding edge of .



Fig. 15. Illustration of the construction of G .

Fig. 14. Illustration of the construction of G and G . (a) The transformed flow network G . (b) G obtained from G by reversing edge directions.

Second, we observe that the flow obtained for is a partial-legal flow because conditions (iii) and (iv) are satisfied. We point out that any partial-legal flow in network can also be converted to a flow in the network . The details are left for the readers.

Third, we can assume unless for any vertex . This is because if but , then an augmenting path can be found as illustrated by Fig. 16. It is well known that an augmenting path allows us to push more flow along the path. Then the flow cannot be maximum, yielding a contradiction. Similarly, unless . This property guarantees that is the lower bound achievement of and is the lower bound achievement of . Therefore, the value of the flow , is equal to the following:

Fig. 16. An augmenting path exists if f(s , u) < b(s, u) but f(s, u) > 0.

This implies that the total lower bound achievement is equal to the value of the flow . Because the flow is a maximum flow, the total lower bound achievement is maximized. This completes the proof.

The construction of the network takes linear time . Computing the maximum flow has a complexity of if the push-relabel method is used. Obtaining the flow for from the flow in takes time of . Therefore, the time complexity of Algorithm Max_Lower_Bound is .

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their thorough reading and valuable comments.

REFERENCES

[1] M. Karol, M. Hluchyj, and S. Morgan, “Input versus output queueing on a space-division packet switch,” IEEE Trans. Commun., vol. COM-35, no. 12, pp. 1347–1356, Dec. 1987.
[2] T. Anderson, S. Owicki, J. Saxe, and C. Thacker, “High-speed switch scheduling for local-area networks,” ACM Trans. Comput. Syst., vol. 11, no. 4, pp. 319–352, Nov. 1993.
[3] N. McKeown, V. Anantharam, and J. Walrand, “Achieving 100% throughput in an input-queued switch,” in Proc. IEEE INFOCOM, 1996, pp. 296–302.
[4] A. Mekkittikul and N. McKeown, “A practical scheduling algorithm to achieve 100% throughput in input-queued switches,” in Proc. IEEE INFOCOM, 1998, pp. 792–799.
[5] A. Kam and K. Siu, “Linear-complexity algorithms for QoS support in input-queued switches with no speedup,” IEEE J. Sel. Areas Commun., vol. 17, no. 6, pp. 1040–1056, Jun. 1999.
[6] N. McKeown, “The iSLIP scheduling algorithm for input-queued switches,” IEEE/ACM Trans. Networking, vol. 7, no. 2, pp. 188–201, Apr. 1999.
[7] D. Stiliadis and A. Varma, “Providing bandwidth guarantees in an input-buffered crossbar switch,” in Proc. IEEE INFOCOM, 1995, pp. 960–968.
[8] A. Hung, G. Kesidis, and N. McKeown, “ATM input-buffered switches with the guaranteed rate property,” in Proc. IEEE Symp. Computers and Communications (ISCC’98), 1998, pp. 331–335.
[9] T. Lee and C. Lam, “Path switching - A quasi-static routing scheme for large scale ATM packet switches,” IEEE J. Sel. Areas Commun., vol. 15, no. 5, pp. 914–924, Jun. 1997.
[10] S. Chuang, A. Goel, N. McKeown, and B. Prabhakar, “Matching output queueing with a combined input/output-queued switch,” IEEE J. Sel. Areas Commun., vol. 17, no. 6, pp. 1030–1039, Jun. 1999.
[11] I. Stoica and H. Zhang, “Exact emulation of an output queueing switch by a combined input output queueing switch,” in Proc. IWQoS 1998, pp. 218–224.
[12] A. Charny, P. Krishna, N. Patel, and R. Simcoe, “Algorithms for providing bandwidth and delay guarantees in input-buffered crossbars with speedup,” in Proc. IWQoS 1998, pp. 235–244.
[13] T. Inukai, “An efficient SS/TDMA time slot assignment algorithm,” IEEE Trans. Commun., vol. COM-27, no. 10, pp. 1449–1455, Oct. 1979.
[14] S. Li and N. Ansari, “Input-queued switching with QoS guarantees,” in Proc. IEEE INFOCOM, 1999, pp. 1152–1159.
[15] C. Chang, W. Chen, and H. Huang, “Birkhoff-von Neumann input buffered crossbar switches,” in Proc. IEEE INFOCOM, 2000, pp. 1614–1623.
[16] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[17] M. A. Bonuccelli and M. C. Clò, “Scheduling of real-time messages in optical broadcast-and-select networks,” IEEE/ACM Trans. Networking, vol. 9, no. 5, pp. 541–552, Oct. 2001.
[18] C. L. Liu and J. W. Layland, “Scheduling algorithms for multiprogramming in a hard real-time environment,” J. ACM, vol. 20, pp. 46–61, Jan. 1973.
[19] S. Even, Graph Algorithms. Rockville, MD: Computer Science Press, 1979.
[20] I. R. Philp and J. W. Liu, “A switching problem for real time periodic messages,” Real-Time Systems Lab., Univ. Illinois, Urbana-Champaign. [Online]. Available: http://www-rtsl.cs.uiuc.edu/papers
[21] V. Tabatabaee, L. Georgiadis, and L. Tassiulas, “QoS provisioning and tracking fluid policies in input queueing switches,” IEEE/ACM Trans. Networking, vol. 9, no. 5, pp. 605–617, Oct. 2001.
[22] J. Giles and B. Hajek, “Scheduling multirate periodic traffic in a packet switch,” in Proc. 1997 Conf. Information Sciences and Systems at Johns Hopkins Univ., Baltimore, MD, Mar. 1997, pp. 691–703.
[23] A. V. Goldberg and R. E. Tarjan, “A new approach to the maximal flow problem,” J. ACM, vol. 35, pp. 921–940, 1988.
[24] M. Andrews and L. Zhang, “Achieving stability in networks of input-queued switches,” in Proc. IEEE INFOCOM, 2001, pp. 1673–1679.
[25] M. A. Marsan, E. Leonardi, M. Mellia, and F. Neri, “On the throughput achievable by isolated and interconnected input-queueing switches under multiclass traffic,” in Proc. IEEE INFOCOM, 2002, pp. 1605–1614.
[26] A. Bianco, P. Giaccone, E. Leonardi, and F. Neri, “A framework for differential frame-based matching algorithms in input-queued switches,” in Proc. IEEE INFOCOM, 2004, pp. 1147–1157.
[27] Z.-L. Zhang, Z. Duan, and Y. T. Hou, “Virtual time reference system: A unifying scheduling framework for scalable support of guaranteed services,” IEEE J. Sel. Areas Commun., vol. 18, no. 12, pp. 2684–2695, Dec. 2000.

Yong Lee received the B.S. degree in mathematics and the M.S. degree in communication engineering from the PLA Information Engineering University, China, in 1993 and 2001, respectively. He is currently working toward the Ph.D. degree in the School of Computer Science and Engineering, Southeast University, China. His research area is high-performance networking with focus on packet scheduling for switches/routers.

Jianyu Lou received the Bachelor degree in engineering thermophysics from the University of Science and Technology of China in 1992, and the M.S. degrees in physics and computer science from the University of Missouri-Kansas City (UMKC) in 1999 and 2001, respectively. He is currently working toward the Ph.D. degree in computer science at UMKC. His research interests include various algorithmic aspects of high-speed networks, switches and routers, real-time scheduling, and traffic management and analysis.

Junzhou Luo received the B.S. degree in applied mathematics and the M.S. and Ph.D. degrees in computer science from Southeast University, China, in 1982, 1992, and 2000, respectively. As a principal investigator, he has completed 18 national and provincial projects in the past 10 years. He has published over 250 journal and conference papers on computer networks. He is now Professor and head of the School of Computer Science and Engineering, Southeast University. His research interests are protocol engineering, network security, network management, and grid computing.

Xiaojun Shen (SM’02) received the B.S. degree in numerical analysis from Qinghua University, Beijing, China, in 1968, the M.S. degree in computer science from the Nanjing University of Science and Technology, China, in 1982, and the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 1989. He is a Professor in the School of Computing and Engineering, University of Missouri-Kansas City. His current research interests include computer algorithms and computer networking with focus on packet scheduling.