
Over-provisioning or Differentiated Services - A Case Study on Integrating Services over IP

Ulrich Fiedler, Polly Huang, Bernhard Plattner
{fiedler, huang, plattner}@tik.ee.ethz.ch
Computer Engineering and Networks Laboratory, ETH Zurich, Switzerland

Abstract - A key question when integrating voice traffic into a corporate data network is cost versus benefit. Will an Integrated Services network save WAN link capacity, which determines network costs? Does the complexity of additional explicit quality of service enablers such as Differentiated Services (DiffServ) pay off in capacity savings? In a case study we have simulated the busy hour of a bank's Intranet with ns-2 to determine the capacity needed with and without integration, with and without DiffServ. Due to online processing requirements we modeled all data traffic as web traffic with stringent response time requirements. Voice traffic was modeled as ITU-T G.711 PCM without silence suppression. For our business case we find that neither over-provisioning a one service class network nor deploying a two class DiffServ network brings any savings. Only when we relax the response time requirement for some of the web traffic, which is more realistic than adding additional traffic to the Intranet, do we find that a three class DiffServ network may save up to 60% of capacity. The amount of savings depends on the amount of traffic without QoS requirements in the Intranet.

Keywords - Capacity planning, DiffServ, web traffic, VoIP

I. INTRODUCTION

Many enterprises consider converging their telephony and data applications onto a single IP Intranet. Integrating services on the Intranet has the potential to save maintenance costs and to enable new applications. In addition, it may save costs for WAN link capacity, which determines the total cost of the Intranet [1]. However, integrating services over IP has a performance problem: on the one hand, IP Intranets support only one service class, best effort, which gives no guarantee when or even whether data arrives at its destination. On the other hand, real-time applications such as telephony do have quality of service requirements. Bursty data traffic such as web traffic may cause interruptions in voice calls or loss of urgent data. Such performance problems can be addressed either by over-provisioning or by explicit quality of service (QoS) support. Over-provisioning means to provision enough capacity that contention for network resources can never significantly degrade application performance. Explicit quality of service support means to add mechanisms to the network that can enforce quality of service. As WAN link capacity determines the costs of the network, we go through different possibilities of integration and study the minimal link capacities needed to meet the quality of service requirements of telephony and data.

The mechanism of choice for explicit QoS support is Differentiated Services (DiffServ) [2], [3], [4]. DiffServ implements different service classes by marking each packet with a so-called codepoint which then determines how the packet is treated at each node. DiffServ is reviewed in section II. We check the claim that multiple service classes help to save costly WAN link capacity on integrated services networks. We use ns-2 to perform simulations of the busy hour of a bank's Intranet that carries web traffic and voice traffic. User session related parameters for web traffic are determined with web crawling. We measure performance for both applications at different capacities to determine the minimal provisioning for the different possibilities of integration. We start with separate networks for web and voice. For integrated services networks, we compare a two and a three class DiffServ capable network to an over-provisioned one service class network. In the DiffServ network we put voice in one service class, web traffic with a response time requirement in another, and web traffic without a response time requirement in a third service class. We determine the minimal link capacity for web traffic with a response time requirement and the minimal link capacity for voice traffic with requirements on delay and loss.
This approach is special since (1) we model data traffic as web traffic which in part has a stringent response time requirement, and (2) we assess network quality of service for voice based on network parameters such as end-to-end delay and loss patterns. Though limited to uncompressed voice without silence suppression, this way of assessing quality of service for voice is much simpler than psychoacoustic methods. This simulation setup provides a basis to assess the capacity savings when integrating voice and data over IP in a bank with our specific requirements. In addition, it measures the benefit of two and three class DiffServ depending on the fraction of web traffic that has no response time requirement. From our initial simulation for a branch where 50 employees make use of telephony and web services, we observe the following:
- If all data traffic (web traffic) has a stringent response time requirement, integration does not save any link capacity.
- If we compare an over-provisioned one service class integrated services Intranet to a two class DiffServ one, we find no saving if all web traffic has stringent response time requirements.
- If some fraction of the web traffic has no response time requirement, we can deploy a three class DiffServ network. This network saves up to 60% of link capacity compared to an over-provisioned one service class Intranet. The saving depends on the fraction of web traffic without a response time requirement.

The rest of this report is organized as follows: Section II reviews DiffServ and discusses related work on the benefit of multiple service classes. Section III describes the simulation environment, topology, and source model, and explains the measurement metrics. Results are presented in three stages: In section IV we explore provisioning and relevant traffic dynamics for voice and web traffic on separate networks. In section V we present results on integrating web and voice services; here all web traffic has a stringent response time requirement, and we compare results for an over-provisioned one service class network to a two class DiffServ capable one. In section VI we investigate how the results change if we relax the response time requirement for some fraction of the web traffic; here we compare results for an over-provisioned one service class network to a three class DiffServ capable one. We conclude in section VII with a discussion of the results, the limitations of our experiments, and some comments on further work.

II. RELATED WORK

In this section, we review Differentiated Services (DiffServ) [2], [3], [4] and discuss related work on the benefit of multiple service classes. DiffServ enables service differentiation by aggregating packets of flows with similar QoS requirements into corresponding service classes. At the network edges, packets are classified and marked with a so-called codepoint according to service profiles. Inside the network, packets are buffered and scheduled solely depending on their codepoint. This makes DiffServ scale well and makes it attractive as a statistical QoS enabler in Intranets. Within DiffServ we consider three service classes based on different forwarding mechanisms as specified by the IETF:
- Expedited forwarding (EF) [5]. This forwarding mechanism is intended for delay sensitive applications such as interactive voice or video. We use it to forward voice traffic.
- Assured forwarding (AF) [6]. This forwarding mechanism is intended for urgent data that requires a sort of controlled load service in the network. In conformance with the RFC, our simulation implements two drop precedences for this forwarding mechanism; out-of-profile packets get the higher drop precedence. We use this mechanism to forward urgent web traffic if web traffic without response time requirements is present.
- Best effort (BE). This forwarding mechanism is intended for all other traffic that receives neither expedited nor assured forwarding. We use it to forward web traffic.

To our knowledge, a number of telecom providers have run tests to estimate the cost and benefit of deploying service differentiation with DiffServ. However, the results of these tests are not publicly available. Within academia, the first significant work which analyzes the benefit of more than one service class in terms of capacity savings is by Bajaj, Breslau, and Shenker [7]. They analyzed the impact of multiple service classes on the performance of telephony traffic in the presence of video background traffic.
Their simulation results show that capacity savings depend on the burstiness of the background traffic and on network utilization. However, network performance is measured with abstract utility functions, and real-life scenarios are not addressed.

III. SETUP

This section describes the simulation setup for a bank's Intranet. After introducing our simulation environment, we present the topology and source model. Then we discuss the measurement metrics. We have tried to choose network parameters and requirements as realistically as possible. For this reason we were in contact with a large Swiss bank. Thus, the requirements regarding QoS of web and voice traffic are derived from practice.

A. Simulation Environment

We used ns-2 [8] as the basis for our simulations. It is a packet-based, event-driven simulator targeted at network research. It is designed and implemented by the same people who also developed the protocols that are widely deployed throughout the Internet. This and its comprehensive validation suite give good confidence in the simulator used for this study. Moreover, we make use of Sean Murphy's DiffServ addition [9] to the simulator. This addition extends the IP header with a DiffServ codepoint to classify packets according to the specified service classes. We start with setting this codepoint not at the network ingress but at the sources. We assign an EF codepoint to all voice traffic and a BE codepoint to all web traffic. Later we use AF for web traffic with a response time requirement and BE for web traffic without a response time requirement. At first we do not use any conditioning at the ingress, as we control network load via the number of active sources. Routers inside the network perform scheduling with a simple weighted round robin scheduler, servicing each queue according to a prespecified weight.

To generate web traffic on the application level, we explicitly model the request and reply interactions of HTTP/1.1 [10]. The reply objects (index object and embedded objects) are generated with sizes according to statistical distributions determined by web crawling; the same holds for the number of embedded objects. We also use statistics to model server selection and think time (inter page time). Underneath HTTP, we use ns-2's FullTCP to simulate the TCP connections. FullTCP is a bidirectional TCP implementation with Reno congestion control and a three-way handshake to establish connections. We refine a model of web browsing by Mah [11] which generates HTTP/1.0 type web traffic [12] with empirical data from 1997. As we think that in a bank Intranet web traffic will use HTTP/1.1, we model the two basic features which make the difference in traffic dynamics: persistence and pipelining. Persistence allows the reuse of already established TCP connections, e.g. to get a number of embedded objects from the same server; these TCP connections are kept open to avoid consecutive slow-start phases. Pipelining allows the client to send all requests to a server at once without waiting for any reply. In addition, we use up-to-date empirical data for the statistics driving the simulation. To measure network performance for web traffic from a user-centric perspective, we measure one metric: the response time for web page downloads. We define the response time for a download of a web page as the time in seconds that elapses between the moment the client sends a packet to establish the connection to the server which hosts the index object and the moment the last TCP packet of the last embedded object arrives at the client side.
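The following sketch illustrates this interaction and where the response-time clock starts and stops. It is our illustration, not the ns-2 code used in the study, and the connection API it assumes (open_persistent, request, read_reply) is purely hypothetical.

    # Illustrative sketch only: the client/connection objects and their methods
    # are hypothetical stand-ins for the modeled HTTP/1.1 behavior in ns-2.
    def download_page(client, server, embedded_object_ids, now):
        t_start = now()                        # first packet of the TCP handshake
        conn = client.open_persistent(server)  # persistent connection: reused for all
                                               # objects, so no further handshakes or
                                               # slow-start phases
        conn.request("index")                  # index object is fetched first
        conn.read_reply()
        for obj in embedded_object_ids:        # pipelining: all requests leave
            conn.request(obj)                  # back-to-back, before any reply
        for _ in embedded_object_ids:
            conn.read_reply()
        t_end = now()                          # last segment of last embedded object
        return t_end - t_start                 # the per-page response time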
We model telephony traffic as uncompressed voice over IP (VoIP) without silence suppression. A phone is modeled as a source/sink pair. The source produces constant bit rate (CBR) traffic during on times. This results in bidirectional traffic between phones. We model call duration and inter-call time by choosing on and off times according to empirical data from our partners. To measure network performance for such VoIP traffic, we measure three metrics: one is the end-to-end delay, the other two characterize interruptions caused by packet loss.

B. Topology

The Intranet of a bank connects its central office and branches with leased lines, which is typical for such a business setup. The stock market's back office, as well as private banking and the trading room, are linked to the Intranet with LAN connectivity. Figure 1 gives a schematic overview of the Intranet's topology.
To increase availability, branches are not only linked to the central office but also to one neighbor, so that there are redundant routes to the central office. However, these links, which are not depicted in figure 1, are not used for load balancing and should therefore not be considered in a performance study.

Fig. 1. Intranet Topology (branches 1-8 connected to the central office)

We chose a simple dumbbell topology which represents the connectivity between a branch and the central office with a bottleneck on the WAN link in between. This topology is depicted in figure 2. As many have argued [7], there is always a single bottleneck link on any network path, and its location usually does not change quickly. This justifies our choice of network topology, which we think is a good starting point to understand the impact of multiple service classes on provisioning. The choice of 10ms of delay on the WAN link, which is rather large, is meant to model the worst case.

Fig. 2. Simulation Topology (dumbbell: 10Mb access links with 0.1ms delay and variable queue length on each side of a bottleneck link with variable bandwidth and 10ms delay; one side hosts 50 web clients and 50 phone source/sink pairs, the other 5 web servers and 50 phone source/sink pairs)

The dimensions of the parameter space in this simulation are the queue length and the capacity of the bottleneck link. In the DiffServ case, parameters to adjust the service rates for the different traffic classes may be added. The range of potential queue lengths is limited by the physical propagation delay of the bottleneck link, which is 10ms, on the one hand, and by the delay requirement for VoIP, which is 50ms, on the other hand. Given a link bandwidth of 1.5Mb, which we find to be the absolute minimum when transmitting voice traffic only, the bandwidth delay product determines the queue length to be around 15KB - 75KB. For web traffic, we found that making queues much longer than 15KB did not significantly improve download times. We therefore started with a queue length of 20KB in most experiments.

C. Source Model

To model the traffic generated by a medium-sized branch in our bank Intranet scenario, we start with 50 networked workplaces, though this branch may have more than 50 employees. Each workplace has a phone and a PC running a web client, both of which are heavily used, e.g. to process transactions. To generate traffic, the web clients request web pages from 5 web servers which are located at the central office. The number of 5 is chosen to keep the ratio of web clients to web servers at 10:1, which is about the ratio found in the bank. As we explicitly model HTTP/1.1's interactions, the behavior of the sources is determined by the distribution of
the following entities: the size of the request object, the number and size of the reply objects, server selection, and the think time between two successive downloads. The distributions of object sizes and of the number of embedded objects were determined by crawling a bank's Intranet. We compared this data to Barford et al. [13], who measured these entities at Boston University. From that we chose the size of a request for any web object to be constant at 400B. All other entities were modeled as heavy-tailed, negative power law (Pareto) distributed. The parameters of the Pareto distributions are given in table I. We did not model server processing latency or address the matching problem, i.e. which request goes to which server; we randomly chose the server for each object. Since we model the peak business hour, we chose a very small average of 30 seconds for the inter page time. This model generates a mean load of around 800Kb at sufficient bandwidth. (To verify the model, we monitored the bandwidth used and checked the self-similar characteristics of the generated traffic.)

TABLE I
USER/SESSION RELATED PARAMETERS TO GENERATE WEB TRAFFIC

Parameter                 Distribution   Average   Shape
Size of Index Obj.        Pareto         8000 B    1.2
Size of Embedded Obj.     Pareto         4000 B    1.1
Number of Embedded Obj.   Pareto         20        1.5
Inter Page Time           Pareto         30 sec    2.5
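To make the parameterization concrete, the sketch below shows one way to draw samples from Table I. It is our illustration, not the generator used in ns-2, and it assumes a plain Pareto distribution whose scale is derived from the tabulated mean and shape (for shape a > 1, mean = a*x_m/(a-1)); the 400 B request size comes from the text above.

    import random

    def pareto_from_mean(mean, shape, rng=random):
        # Scale x_m chosen so the distribution has the tabulated mean, then
        # inverse-CDF sampling: x = x_m * U^(-1/shape) with U in (0, 1].
        scale = mean * (shape - 1.0) / shape
        u = 1.0 - rng.random()
        return scale * u ** (-1.0 / shape)

    def web_page(rng=random):
        """One page download parameterized as in Table I."""
        return {
            "request_bytes": 400,
            "index_bytes": pareto_from_mean(8000, 1.2, rng),
            "embedded_bytes": [pareto_from_mean(4000, 1.1, rng)
                               for _ in range(int(round(pareto_from_mean(20, 1.5, rng))))],
            "think_time_s": pareto_from_mean(30, 2.5, rng),
        }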

To generate VoIP traffic, the phone source of a workplace in the branch may "call" a sink in the central office. As phones are modeled as source/sink pairs, the corresponding source in the central office then also sends VoIP traffic to this workplace. In conformance with ITU-T G.711, we assume the use of an 8kHz 8-bit A-law PCM coder as deployed in conventional telephones. This coder does not implement speech compression or silence suppression. To bound packetization delay, we assume that a VoIP packet contains 10ms of speech. This leads to 108-byte VoIP packets with 28 bytes of overhead due to IP and UDP headers. Sources send 86.4Kb/s of CBR traffic when active. In residential environments, call durations are simply exponentially distributed with a mean of 3 minutes [14]. However, we found that a bimodal distribution for call durations, representing long and short calls, is better suited to model call durations during the peak business hour of a bank. Both types of calls, long and short, as well as the inter-call time, were represented by negative exponential distributions. We chose the parameters depicted in table II. This generates a mean load of 850Kb.

TABLE II
USER/SESSION RELATED PARAMETERS TO GENERATE VOICE TRAFFIC

Parameter         Distribution   Average
Call duration     Bimodal        long calls: 20%, short calls: 80%
  Long call       Exponential    8 min
  Short call      Exponential    3 min
Inter call time   Exponential    15 min
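As a quick consistency check of these numbers and a sketch of the resulting on/off source (our illustration, not the simulation code; the helper names are ours), consider:

    import random

    PAYLOAD_BYTES   = 80                     # 10 ms of 8 kHz, 8-bit G.711 samples
    PACKET_BYTES    = PAYLOAD_BYTES + 20 + 8 # + IP and UDP headers = 108 B
    PACKETS_PER_SEC = 100                    # one packet every 10 ms
    RATE_BITS_PER_S = PACKET_BYTES * 8 * PACKETS_PER_SEC   # 86400 b/s = 86.4 Kb/s

    def call_duration_s(rng=random):
        # Bimodal model from Table II: 20% long calls (mean 8 min), 80% short
        # calls (mean 3 min); the overall mean is 0.2*8 + 0.8*3 = 4 minutes.
        mean_minutes = 8.0 if rng.random() < 0.2 else 3.0
        return rng.expovariate(1.0 / (mean_minutes * 60.0))

    def inter_call_time_s(rng=random):
        return rng.expovariate(1.0 / (15.0 * 60.0))         # mean 15 minutes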

D. Measurement Metrics

In this section we introduce the QoS performance measurement metrics for web and voice which we need to assess network performance at a given bottleneck link capacity. We want these metrics to be user-centric, as the purpose of a bank's Intranet is to support its employees in doing business. This business has inherent time and quality constraints. To control the network QoS for web traffic, the response times of downloads are measured. This QoS is considered sufficient if the percentage of web page downloads that terminate within five seconds is in the high nineties. We call such a percentage a five seconds response time quantile. Since we do not model server latency, we require that the five seconds response time quantile exceeds 99%.

To measure network performance for the transmission of VoIP, we measure end-to-end delay and the impact of packet loss. We define three requirements: one for the end-to-end delay and another two which limit loss and its impact. Coming from conventional telephony, we define 50ms as an upper limit to end-to-end delay. This is our first requirement for VoIP. It is somewhat stricter than the 100-200ms proposed by Steinmetz and Nahrstedt [15]. As integrated services networks carry voice and data traffic, packet losses may occur in bursts, i.e. as runs of successive packet losses. In general such successive packet losses are perceived as a crackling sound when played out at the receiver's side. However, how packet loss affects VoIP application quality of service depends on the coder/decoder used (see e.g. the overview article of Kostas et al. [16] for more details). Here we focus on ITU-T G.711 coded speech. To analyze the impact of loss, we group successive losses of VoIP packets into outages. This notion comes from Paxson's investigation of end-to-end Internet packet dynamics [17]. We applied the worst outage patterns of our simulations to prerecorded speech and found that outage frequency and duration characterize the impact of loss reasonably well for ITU-T G.711 coded speech in situations with low or moderate loss. We define the requirements on outages for VoIP as follows: the frequency of outages must not exceed 5 per minute of transmitted speech, and the outage duration must not exceed 50ms. We argue that, although restricted to a certain type of coder and limited to low or moderate loss situations, these metrics are well suited to assess network performance for VoIP in our situation. We think that assessing transmission quality with ITU-T P.861's PSQM (perceptual speech quality measurement) [18] is not suitable for VoIP. New proposals to measure VoIP speech transmission quality in an 'internal psychoacoustic domain', such as PSQM+ [19], MNB [20] and PAMS [21], are not freely available and exceed the scope of this research.

IV. SEPARATE NETWORKS

Before investigating capacities for integrated services networks which accommodate both web and VoIP traffic, we want to study capacity planning on separate networks. In particular, we want to find out how much capacity needs to be provisioned on the bottleneck link to meet the required QoS for each traffic type. To get a better understanding, we also investigate the relevant traffic dynamics of each of these traffic types on separate networks.

A. VoIP Traffic

We start by investigating the VoIP traffic generated by 50 employees at the branch during the main business hour with respect to link capacity. Given the user/session related parameters of table II, these 50 employees make 125 phone calls with an average duration of 4 minutes. At first we tried to calculate the capacity which needs to be provisioned in this case with Markov theory. We could not find a closed solution for the stationary distribution. This is basically due to the bimodal call duration distribution on the one hand and to the integrals arising from the losses on the other hand.
Nonetheless, we may estimate the provisioning with the following argument: when introducing call blocking and approximating the bimodal call duration distribution with a negative exponential one, the Erlang loss formula can be used to determine the required provisioning. A call blocking rate of 1% - 0.1%, which would be required in the business case of a bank, leads to a provisioning of 1.5Mb–1.8Mb at the bottleneck link. Since we did not want to approximate the call duration distribution, we decided to investigate the bottleneck link capacity with simulation. To get statistically representative results, we ran each simulation of the main business hour 8-20 times with different seeds for the random number generator and took averages of the measured parameters. Varying the capacity of the bottleneck link, we measured the end-to-end delay (maximum and average). To characterize outages on VoIP transmissions, which stall speech, we measured their frequency, duration, and the distribution of the durations. From the outage frequency we learn how often telephone calls are stalled per minute; the outage duration gives the length of such events. We determine the minimal provisioning with the end-to-end delay requirement and the requirements on outage frequency and outage duration as defined in section III. In addition, we study the cumulative distribution function (CDF) of outage durations. With this distribution, we learn whether long outages are frequent or rare. To understand how capacity planning and network performance relate to the traffic dynamics of VoIP, we also measured loss rate and queue fill state. The results are shown in figures 3 to 5 and discussed below.
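As a rough cross-check of the Erlang-B estimate above, the blocking probability can be evaluated with the standard recursion. The sketch below is ours, not part of the study; it assumes an offered load derived from Table II (125 calls of 4 minutes average in the busy hour, i.e. roughly 8.3 Erlang) and one 86.4Kb/s channel per call and direction, so the capacities it prints are only indicative and need not match the 1.5Mb–1.8Mb figure exactly.

    def erlang_b(channels, offered_erlang):
        # Standard Erlang-B recursion: B(0) = 1, B(n) = a*B(n-1) / (n + a*B(n-1)).
        b = 1.0
        for n in range(1, channels + 1):
            b = offered_erlang * b / (n + offered_erlang * b)
        return b

    offered = 125 * 4.0 / 60.0                 # ~8.3 Erlang during the busy hour
    for target in (0.01, 0.001):               # 1% and 0.1% call blocking
        channels = 1
        while erlang_b(channels, offered) > target:
            channels += 1
        print(f"blocking <= {target:.1%}: {channels} channels "
              f"~ {channels * 86.4:.0f} Kb/s per direction")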

Fig. 3. Packet Delay and Outage Frequency for VoIP (Separate Network)

Figure 3 (left side) depicts the maximum and average end-to-end delays of VoIP packets. We find that the average delay slightly decreases with increasing capacity on the bottleneck link until it reaches a constant minimum. This constant minimum, which is essentially given by the physical propagation delay, is at around 12ms and is reached at capacities around 1.4Mb. The maximum delay also slightly decreases with increasing capacity and reaches the value of the propagation delay at 1.8Mb, which means that at this capacity there is no more queuing delay in the system. However, the QoS requirement that the maximum end-to-end packet delay be lower than 50ms is already met at 1.6Mb. This QoS requirement on end-to-end delay ultimately determines the minimal provisioning. Figure 3 (right side) depicts the outage frequency for VoIP on a separate network. We learn that the outage frequency steadily decreases with capacity on the bottleneck link. The curve is convex when seen from above, i.e. additional capacity becomes more and more effective, so things behave nicely. Outages disappear entirely at 1.8Mb. However, the QoS requirement for the outage frequency, which is less than five per minute, is already met at a capacity of 1.5Mb.

Fig. 4. Outage Duration and Outage CDF for VoIP (Separate Network)

Figure 4 (left side) depicts the average and maximum outage durations. The average outage duration stays at a comparably low level below 100ms and decreases with increasing capacity at the bottleneck link before outages disappear at 1.8Mb. The maximum outage duration drops suddenly once more than 1.5Mb of link capacity is provided at the bottleneck link. It is also at this capacity that the QoS requirement for outage duration, namely that it should not exceed 50ms, is met. This link capacity of 1.5Mb seems to be enough to accommodate the traffic, although it takes as much as 1.8Mb to make outages disappear entirely. From this we conclude that all QoS requirements for VoIP delay and loss are met at 1.6Mb, which determines the minimal provisioning, although 1.8Mb is needed to make queuing delay disappear entirely. The delay requirement, which determines the minimal provisioning at 1.6Mb, is just slightly more stringent than the two requirements on outages; the requirements on outages would both be met at a provisioning of 1.5Mb, which is not much less. To better understand whether outages tend to be short or long, we review the outage distribution. From figure 4 we learn that long outages causing audible interruptions in speech transmission are extremely rare. Most outages (roughly 98%) last only 20ms or less.
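For reference, the following sketch shows how such outage statistics can be derived from a per-packet loss trace (one entry per 10ms VoIP packet). It is our illustration of the metric definitions from section III, not the analysis scripts used for the study.

    def outages_ms(loss_trace, packet_ms=10):
        """Group successive packet losses into outages; return durations in ms."""
        durations, run = [], 0
        for lost in loss_trace:
            if lost:
                run += 1
            elif run:
                durations.append(run * packet_ms)
                run = 0
        if run:
            durations.append(run * packet_ms)
        return durations

    def meets_voip_loss_requirements(loss_trace, packet_ms=10):
        durations = outages_ms(loss_trace, packet_ms)
        speech_minutes = len(loss_trace) * packet_ms / 60000.0
        frequency_per_min = len(durations) / speech_minutes if speech_minutes else 0.0
        # Requirements from section III: at most 5 outages per minute of speech,
        # and no outage longer than 50 ms.
        return frequency_per_min <= 5 and (not durations or max(durations) <= 50)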

Fig. 5. Loss Rate and Queue Fill State for VoIP Packets (Separate Network)

To learn about the traffic dynamics, we also plot the packet loss rate as well as the queue fill state (see figure 5). We see that these parameters steadily decrease with increasing capacity. This means that it simply takes a certain capacity to accommodate all of the smooth CBR VoIP traffic. At this capacity of 1.8Mb packet drops are eliminated entirely, which means that there are no more outages. This corresponds nicely with the data on outage frequencies and outage durations (see figure 3 (right side) and figure 4 (left side)). Figure 5 (right side) depicts the average queue fill state, which is already quite small at 1.2Mb. At capacities larger than 1.5Mb it is so small that any queuing delay should disappear. This corresponds nicely with figure 3 on end-to-end packet delay. We summarize the findings for VoIP traffic in this section as follows:
- Increasing capacity makes queuing delay and outages due to losses disappear at 1.8Mb.
- At 1.6Mb the effects of queuing and packet loss are sufficiently small that all QoS requirements on end-to-end delay and outages can be met. At this link capacity telephone calls have reasonably good voice quality.

B. Web Traffic

In this subsection we investigate the web traffic generated by 50 web clients at the branch during the main business hour, given the user/session related parameters of table I. These parameters result in around 5000 downloads of web pages within such an hour, which produce an average of 800Kb of traffic in low loss situations. We measure the response times of these downloads to determine the minimal capacity that needs to be provisioned at the bottleneck link. As defined in section III, we consider the capacity as sufficient if 99% of the downloads terminate within five seconds. To get a deeper understanding, we additionally measure relevant traffic dynamics such as loss rate and queue fill state at the bottleneck link. The results below represent averages over 10-20 simulation runs. Figure 6 depicts the fraction of web pages that can be downloaded within five seconds at given capacities. This response time quantile is measured with a queue length of 20KB at both ends of the bottleneck link. We observe that the five seconds response time quantile increases strongly up to 1.6Mb.
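A minimal sketch (ours, with hypothetical variable names) of how such a curve is turned into a provisioning decision, i.e. how the five seconds response time quantile is computed per capacity and compared against the 99% requirement:

    def five_second_quantile(response_times_s):
        # Percentage of page downloads that finish within five seconds.
        return 100.0 * sum(t <= 5.0 for t in response_times_s) / len(response_times_s)

    def minimal_provisioning_mb(times_by_capacity_mb, requirement_pct=99.0):
        # times_by_capacity_mb: {link capacity in Mb: list of measured response times}
        for capacity in sorted(times_by_capacity_mb):
            if five_second_quantile(times_by_capacity_mb[capacity]) >= requirement_pct:
                return capacity
        return None    # requirement not met within the simulated range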

Fig. 6. Five Seconds Response Time Quantile on separate network

From this link capacity on, the gradient decreases significantly until the curve becomes very flat at 2.2Mb. This means that additional capacity helps less and less to speed up download times for web pages. At 2.4Mb the capacity is sufficient for 99% of the web pages to be downloaded in less than five seconds, which is the requirement for minimal provisioning. The shape of the curve does not look significantly different for a three or ten second response time quantile (not depicted here). But as the curve is very flat in the range which determines the minimal provisioning, the actual choice of the response time threshold has a large impact. However, we continue with five seconds here because this value comes from the business case of the bank. We also investigated whether response times depend on the queue length chosen at the bottleneck link. As we want to integrate web with VoIP on the network, we only looked at queue lengths between 10KB and 400KB; the bounds are given by the bandwidth delay product and the delay requirement for VoIP. In agreement with the results of Christiansen et al. [22], we found that the queue length does not significantly impact download response times. To gain more insight, we additionally measured the traffic dynamics.

Fig. 7. Loss Rates (Bytes Lost) and Queue Fill State (Separate Network)

Figure 7 depicts the loss rate in units of bytes lost and the average queue fill state at the bottleneck link. We learn that at 2.4Mb, when the QoS requirement for web traffic is met, the loss rate is 1.4% and the queue fill state is 12%. As expected, the depicted dynamics differ strongly from those of the smooth VoIP traffic (see figure 5). For web traffic, the loss rate and average queue fill state do not steadily converge to zero; additional link capacity helps less and less to improve the traffic dynamics. There is no 'sufficient link capacity' which empties the queues and makes losses disappear. We presume that this behavior comes from the bursty and multifractal nature of web traffic. We summarize the findings for web traffic in this section as follows:

- The minimal provisioning for a separate network for web traffic is 2.4Mb. This provisioning is determined by the requirement that 99% of the web pages can be downloaded within five seconds. This result is very sensitive to the response time threshold chosen, as the curve is flat in this range.
- The choice of queue length does not significantly influence the result.
- Unlike for VoIP, the loss rate and average queue fill state for web traffic do not converge to zero at some 'sufficient link capacity'. Additional link capacity helps less and less to improve download response times and traffic dynamics. This is presumably due to the bursty and multifractal nature of web traffic.

V. INTEGRATED NETWORKS – ONE OR TWO SERVICE CLASSES?

Fig. 8. Maximal end-to-end Delay of VoIP packets (Integrated Services Network)

Fig. 9. Outage Frequency and Duration for VoIP (Integrated Services Network)

In this section we investigate the minimal provisioning of integrated services networks which carry VoIP and web traffic. We assume that all web traffic has the same stringent response time requirement that 99% of the downloads terminate within five seconds. This integrated services network can have either one service class or two. We realize the two service class network with DiffServ. To protect VoIP from bursty web traffic, we segregate VoIP packets, marked with an EF codepoint, from web traffic packets, which we mark as BE. We do no traffic conditioning at the ingress since we control the traffic load via the number of sources. At the bottleneck link, EF and BE packets are buffered in two separate queues and forwarded according to the configured service rates. All queues are FIFO and 20KB long. To determine the minimal provisioning for both the one service class and the two class DiffServ network, we measure the quality of service for web and VoIP.
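A minimal sketch of this bottleneck discipline, assuming a deficit-style byte-credit weighted round robin over the two per-codepoint FIFO queues (our illustration; the exact scheduler of the ns-2 DiffServ package differs in detail, and section VII discusses a packet-count variant):

    from collections import deque

    class TwoClassWRR:
        """EF and BE FIFO queues at the bottleneck, served by weighted round robin."""
        def __init__(self, ef_weight=0.4, be_weight=0.6, queue_limit_bytes=20000):
            self.queues  = {"EF": deque(), "BE": deque()}   # packet sizes in bytes
            self.backlog = {"EF": 0, "BE": 0}
            self.weights = {"EF": ef_weight, "BE": be_weight}
            self.credit  = {"EF": 0.0, "BE": 0.0}
            self.limit = queue_limit_bytes                  # 20 KB per queue
            self.round_bytes = 1500.0                       # credit handed out per round

        def enqueue(self, codepoint, packet_bytes):
            if self.backlog[codepoint] + packet_bytes > self.limit:
                return False                                # tail drop
            self.queues[codepoint].append(packet_bytes)
            self.backlog[codepoint] += packet_bytes
            return True

        def serve_one_round(self):
            # Each backlogged class earns credit in proportion to its weight and
            # may send packets while the credit covers the head-of-line packet.
            sent = []
            for cp, queue in self.queues.items():
                if not queue:
                    self.credit[cp] = 0.0
                    continue
                self.credit[cp] += self.weights[cp] * self.round_bytes
                while queue and queue[0] <= self.credit[cp]:
                    pkt = queue.popleft()
                    self.backlog[cp] -= pkt
                    self.credit[cp] -= pkt
                    sent.append((cp, pkt))
            return sent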

Fig. 10. Five Seconds Response Time Quantile (Integrated Services Network)

Fig. 11. Fraction of used capacity for VoIP versus service rate parameter for VoIP

As explained in section III, we measure the response time for web traffic and the end-to-end delay, outage frequency, and outage duration for VoIP. We directly compare the results for a one service class network with those for a two class DiffServ capable one. With that we want to investigate the savings in terms of link capacity that DiffServ brings as it protects VoIP traffic from bursty web traffic. In addition, we check how the DiffServ configuration impacts the results. The user/session parameters for VoIP traffic are as in table II and for web traffic as in table I. We first present the results for VoIP, delay followed by outages.

From figure 8 we learn that the maximum delay for VoIP in a one service class network is significantly higher than in a two service class DiffServ network. This is because DiffServ is capable of protecting VoIP from bursty web traffic by enforcing a minimum service rate for the VoIP in EF. Figure 8 nicely shows how DiffServ keeps the maximum VoIP delay as small as 12ms over the entire range of investigated capacities from 3.5Mb to 4.5Mb. As 12ms is close to the physical propagation delay, this means that DiffServ keeps the delay more or less constant for VoIP. This is different in a one service class network, where we need to provision more and more capacity to get the delay down to an acceptable maximum. The maximum packet delay starts at 58ms at 3.5Mb and decreases linearly with increasing bandwidth down to 45ms at 4.5Mb. The QoS requirement of a maximum delay of 50ms is met at 4.2Mb. This delay requirement will finally determine the minimal provisioning. The average delay for VoIP in a one service class network is more or less constant at around 15ms (not depicted). This is an indication that bursts of web traffic that significantly impact VoIP performance are rare events.

To measure the impact of loss on network performance for VoIP, we depict outage frequency and duration at various capacities in figure 9. From the left side of the figure we learn that DiffServ has significantly fewer outages than a one service class network. DiffServ can eliminate outages at sufficient capacities; with our configuration this capacity is 3.9Mb. In a one service class network, outages slowly decrease with increasing capacity. In the studied interval of capacities, outages start at 5.5 per minute at 3.5Mb and reduce to 2.8 per minute at 4.5Mb. The maximum outage duration is analyzed in figure 9. We learn that a DiffServ network has considerably smaller outage durations than a one service class network. DiffServ's outage durations are less than 20ms over the entire range of capacities studied, which means that just 1 or 2 subsequent packets are lost. For a one service class network, the maximum outage duration stays around 30 - 40ms in the studied interval and does not decrease with increasing link capacity. This is an indication that our web traffic is so bursty that a little more capacity simply does not help. However, the QoS requirement on maximum outage duration, which is 50ms, is a problem neither for the DiffServ network nor for the one service class network.

To measure the network performance for web traffic at various capacities, we measure the response time. From figure 10 we learn that a one service class network does much better than a DiffServ capable one. This is because in the one service class network the web traffic may make use of all available capacity on the link during bursts. In DiffServ this is not so: here web traffic cannot use link capacity assigned to VoIP. In a one service class network the fraction of web pages that can be downloaded within five seconds exceeds 99% for all capacities studied. For the two class DiffServ network we need to provision more and more capacity to finally meet the QoS requirement at 4.3Mb. We again see that more and more capacity for web traffic helps less and less.

When we look at all QoS requirements for VoIP and web together, we learn that the delay requirement for VoIP determines the capacity provisioning in a one service class network. For DiffServ we have the possibility to assign service rates to web and VoIP packets at the bottleneck link. The figures discussed were obtained with a service rate of 40% for VoIP and 60% for web. Assigning a higher service rate to VoIP does not considerably alter the measurements, as unused service rate for VoIP goes to the web traffic which we put in BE. To test this, we plotted the capacity share of VoIP versus the service rate share in figure 11. Assigning a lower service rate to VoIP means throttling VoIP to give more room to web traffic. Lowering this rate, we found that we can only shift which requirement limits the minimal provisioning; we could not find any configuration which allows for a smaller provisioning. In the set of measurements depicted above, the response time for web traffic determines the minimal provisioning of 4.3Mb. Assigning less share to VoIP eventually results in the delay requirement for VoIP determining the minimal provisioning. The bad news is that this does not lower the minimal provisioning.

We summarize our results for provisioning an integrated services network which supports web and VoIP traffic as follows:
- A two class DiffServ network does well for VoIP if configured with a sufficient service rate share. It then keeps the end-to-end delay constantly low and eliminates outages at reasonably large capacities (i.e. greater than 3.9Mb). With a DiffServ network, the performance for VoIP is won at the cost of performance for web traffic. Depending on the configuration, the end-to-end delay requirement for VoIP or the response time requirement for web determines the minimal provisioning. We found this minimal provisioning to be 4.3Mb.
- A one service class network does better for web traffic than a DiffServ one. This is because in a one service class network web traffic may profit from the entire link capacity. In a one service class network, bursty web traffic causes outages and queuing delay for VoIP. However, this can be cured with over-provisioning. The stringent end-to-end delay requirement of 50ms determines the provisioning of 4.2Mb.

To conclude this results section, we review which of the following three cases is likely to require the lowest capacity at the bottleneck link:
1. keep voice and data on separate networks,
2. integrate voice and data services on a one service class network and plan with over-provisioning, or
3. integrate voice and data services on a two class DiffServ network.
Table III shows the results. We find that in our bank scenario the required capacity does not differ significantly among the three cases. Integrating services or service differentiation does not save any capacity. This is because the web traffic in our scenario is not real "best effort traffic" but has stringent response time requirements.

TABLE III
CAPACITY OF THE BOTTLENECK LINK TO ACHIEVE THE DESIRED QOS

Type of Network                Capacity
Separate Networks
  VoIP                         1.6Mb
  Web                          2.4Mb
  Total                        4.0Mb
Integrated Services Network
  One Service Class            4.2Mb
  Two Class DiffServ           4.3Mb

VI. MULTIPLE SERVICE CLASSES

Fig. 12. Five Seconds Response Time Quantiles in three class DiffServ network

In this section we investigate the capacity savings with DiffServ when we introduce a third service class for web traffic without response time requirements. The usual approach is to add "best effort traffic" to the simulation. However, we think that in the business case of a bank it is more meaningful to relax the response time requirement for some fraction of the web traffic. We think this is a valid approach since not all web traffic in a bank's branch is used to perform mission critical transactions. The hope is to find some capacity saving compared to a one service class network or separate networks. In the three service class DiffServ network we marked traffic as follows:
1. as before, we marked VoIP traffic for expedited forwarding (EF);
2. unlike in section V, we marked web traffic with a response time requirement for assured forwarding (AF);
3. the remaining web traffic was marked as best effort (BE).
We started with a fraction of 20% of web clients generating traffic with a response time requirement. From figure 12 we see that DiffServ can nicely differentiate between web traffic with and without response time requirements. The plots of VoIP's end-to-end delay and outages look similar to those of section V and are therefore not depicted. The minimal provisioning for this case is 2.4Mb. This is a considerable saving compared to 4.2Mb in a one service class network, which cannot differentiate services. We then varied the fraction of web clients that produce web traffic with/without response time requirements to check how this impacts the savings. Figure 13 shows that more traffic without a response time requirement brings more savings.

Fig. 13. Capacity Provisioning versus fraction of web traffic without QoS requirement

If all web traffic has no response time requirement, the saving is almost 60%; if all web traffic has our stringent response time requirement, the saving is zero. The correlation is almost linear.

VII. DISCUSSION

Our goal is to answer, for the following four cases, which one is likely to require the lowest capacity at the bottleneck WAN link:
1. Keep voice and web services on two separate IP networks.
2. Integrate voice and web traffic in a one service class network and plan with over-provisioning.
3. Integrate voice and web traffic in a two class DiffServ network.
4. Integrate voice and web traffic in a three class DiffServ network which additionally differentiates between web traffic with and without a response time requirement.
In this preliminary study, we focused on a practical bank scenario and found that the required capacity on the bottleneck link is not much different for the first three cases. Case four comes from the fact that not all web traffic in a bank carries mission critical transactions. Relaxing the response time requirement for some of the web traffic leads to up to 60% of capacity savings, which is a nice result. However, there are a number of subtle issues with it.

First is the impact of configuring the service rate parameters. The service rate parameters must be used to balance the QoS for VoIP and for web traffic with a response time requirement. Therefore DiffServ needs to be accurately configured to reach the minimal provisioning. Unlike in the two class DiffServ case, excess capacity goes to the BE service class, which carries the web traffic without a response time requirement. Although it is quite easy to guess a reasonable configuration for our simulation, this is far from trivial in more complex practical settings.

Second is the impact of the scheduler's algorithm and implementation on performance. The DiffServ package [9] implements a weighted round robin scheduler and three queues. The queues are for packets which are marked with EF, AF, and BE codepoints. As packets with the same codepoint may have different sizes, this scheduler assigns credits that determine how many bytes of each queue can be served per round. The problem is that bursty web traffic in the AF class may acquire many credits which can then be cashed in all in a single round. This causes large queuing delays and even outages in VoIP. We found that with such a scheduler it is impossible to meet the minimal provisioning. We therefore slightly modified it to a WRR based on packet counts to produce the results of section VI. Although we know that this is not the optimum for performance, we believe that we are quite close. This is first because we can give minimal guarantees for delay and capacity share with this scheduler, and second because we can check convergence to the end points with zero or 100% of web traffic with a response time requirement. However, we plan to clarify this issue by integrating WFQ into our simulations.

The third issue is that it is not realistic to assume a business scenario which includes web traffic that is totally free of any response time requirement. Even for "best effort" web traffic, users expect some performance (e.g. that 90% of downloads terminate in five seconds). Reviewing our measurements from this direction, the maximal saving that DiffServ brings for VoIP and such "best effort" web traffic in the described Intranet drops to values around 40%.

Finally, we would like to stress that in addition to the results reported here, we completed a simulation framework (ns modifications, scripts, etc.). This package is publicly available and will enable the research community, network service providers, banks, and other large enterprises to study the use of DiffServ in their own scenarios.

VIII. SUMMARY AND FURTHER WORK

In Intranets, WAN link capacities determine network costs [1]. We therefore investigated the cost savings in WAN link capacity when integrating voice and web services on a single network. We studied this integration in the business case of a bank and simulated with one, two, and three service classes. The third service class serves to differentiate web traffic with and without a response time requirement. We found that neither over-provisioning a one service class network nor deploying a two service class DiffServ network saved any link capacity compared to separate networks. We think that this result comes from the stringent response time requirement on web traffic, which is used for mission critical transactions. Relaxing this QoS requirement for some fraction of the web traffic may save up to 60% of capacity. This approach of relaxing the requirement seems to reflect the business case of a bank's Intranet, which is not open to the public, somewhat better than adding additional traffic. The amount of savings depends on the amount of traffic without a QoS requirement. The savings also depend largely on the configuration of the service rates and on the algorithms and implementation of the schedulers. To be sure that we measure the optimum in our performance experiments, we plan to integrate weighted fair queuing (WFQ) into our simulation. We also consider integrating video conferencing traffic and more complex topologies into our simulations.

IX. ACKNOWLEDGMENTS

The authors would like to thank Sean Murphy for many helpful discussions and for his permission to use his DiffServ additions to ns.

REFERENCES

[1] A. M. Odlyzko, "The internet and other networks: Utilization rates and their implications," Tech. Rep., Feb. 2000.
[2] S. Blake et al., "An architecture for differentiated services," RFC 2475, Internet Request For Comments, Dec. 1998.
[3] D. Clark and J. Wroclawski, "An approach to service allocation in the internet," IETF Draft, July 1997.
[4] K. Nichols, V. Jacobson, and L. Zhang, "A two-bit differentiated services architecture for the internet," Internet Draft, Apr. 1999.
[5] V. Jacobson et al., "An Expedited Forwarding PHB," RFC 2598, Internet Request For Comments, June 1999.
[6] J. Heinanen et al., "Assured Forwarding PHB Group," RFC 2597, Internet Request For Comments, June 1999.
[7] S. Bajaj et al., "Is service priority useful in networks?," in Proceedings of ACM Sigmetrics '98, Madison, Wisconsin, USA, June 1998.
[8] S. McCanne and S. Floyd, "ns-2: Network simulator," http://www-mash.cs.berkeley.edu/ns/.
[9] S. Murphy, "Diffserv package for ns-2," http://www.teltec.dcu.ie/ murphys/ns-work/diffserv/index.html.
[10] R. Fielding et al., "Hypertext transfer protocol — HTTP/1.1," RFC 2616, Internet Request For Comments, June 1999.
[11] B. Mah, "An empirical model of HTTP network traffic," in Proceedings of IEEE Infocom, Kobe, Japan, Apr. 1997, pp. 592-600.
[12] T. Berners-Lee et al., "Hypertext transfer protocol — HTTP/1.0," RFC 1945, Internet Request For Comments, May 1996.
[13] P. Barford et al., "Changes in web client access patterns: Characteristics and caching implications," World Wide Web, Special Issue on Characterization and Performance Evaluation, vol. 2, no. 2, pp. 15-28, 1999.
[14] Siemens, Telephone Traffic Theory: Tables and Charts, 3rd edition, Siemens Aktiengesellschaft, Berlin - München, Germany, 1981.
[15] R. Steinmetz and K. Nahrstedt, Multimedia: Computing, Communications & Applications, Prentice-Hall, 1995.
[16] T. Kostas et al., "Real-time voice over packet-switched networks," IEEE Network, Jan. 1998.
[17] V. Paxson, "End-to-end internet packet dynamics," IEEE/ACM Transactions on Networking, vol. 7, no. 3, pp. 277-292, June 1999.
[18] ITU-T, "PSQM: Perceptual speech quality measure," Recommendation P.861, International Telecommunication Union.
[19] –FIXME–, "PSQM+," Tech. Rep.
[20] –FIXME–, "MNB," Tech. Rep.
[21] –FIXME–, "PAMS," Tech. Rep.
[22] M. Christiansen et al., "Tuning RED for web traffic," in Proceedings of ACM SIGCOMM, Stockholm, Sweden, Aug. 2000.