An Efficient and Accurate Decomposition Method for Open Finite-and

0 downloads 0 Views 225KB Size Report
large, open networks of mixed finite- and infinite-buffer queues with ... stricted, product-form results, such as those for Jackson networks, can be used.
An Efficient and Accurate Decomposition Method for Open Finite- and Infinite-Buffer Queueing Networks Ramin Sadre, Boudewijn R. Haverkort, Alexander Ost Rheinisch-Westf¨ alische Technische Hochschule Department of Computer Science, D–52056 Aachen, Germany (http://www-lvs.informatik.rwth-aachen.de)

Abstract. We address a decomposition approach for the evaluation of large, open networks of mixed finite- and infinite-buffer queues with phase-type distributed service and interarrival times. Such queueing networks can be seen as high-level specifications of continuous-time Markov chains (CTMCs). However, these CTMCs can become very large (in case all buffers are finite) or even infinitely large. To avoid these large state spaces we decompose such queueing networks in separate queues which can be solved efficiently using matrix-geometric or spectral expansion techniques, or specialised Markovian techniques for the solution of a finite system of global balance equations. However, since we allow for job losses and feedbacks to occur, we cannot decompose the overall model in a single pass to models of the individual queues; instead we have to iterate the decomposition procedure, so as to incorporate correctly the changes in the traffic streams. This paper extends our earlier work on open networks of (finite-buffer) PH|PH|1|K queues [10, 11] by replacing a number of approximation steps with exact results. The paper presents the theoretical background as well as some validation results.

1

Introduction

Queueing networks (QNs) have been used widely for the analysis of performance problems in computer and communication systems. For many classes of queueing networks elegant and efficient solution methods exist [12]. In case the QNs under study are open, i.e., when the number of customers is not a priori restricted, product-form results, such as those for Jackson networks, can be used. A disadvantage of these results is that they are only valid under a number of restrictions: the service times need to be of exponential type when combined with FCFS scheduling, the stations have unbounded buffer capacity, and all arrival processes are Poissonian. These restrictions might not always apply in practice. The above restrictions have lead researchers to search for extensions and approximations. A number of papers have appeared focusing on QNs with blocking, mostly in combination with FCFS scheduling and exponential service and interarrival times [18]. We focus on the form of blocking common in communications: customers arriving at a completely filled queue are simply lost.

Queueing network models with either finite customer number or with finite buffers (and hence with customer losses) can be analysed via the numerical solution of the underlying CTMC. This is the approach that has been successfully persued in tools like MARCA (and XMARCA) [20, Section 10.2], MACOM [19] and others (see also the list of tools in [20, Section 10.1.3–5]), although it might lead to CTMCs with very large state spaces. A similar approach (with similar problems) has been followed when using stochastic Petri nets. In this paper we also rely on the mapping of the queueing network model to an underlying CTMC, however, we do use separate CTMCs per queue, thus avoiding the state space explosion problem. Our approach has been motivated by the approximate solution method of large open queueing networks with infinite-buffer stations and FCFS scheduling, as proposed by K¨ uhn [13] and later extended by Whitt [24, 23]. A key issue in their approach is the approximate decomposition of the overall model in separate models for each queue. The decomposition method of Whitt has been refined by various authors. Harrison and Nguyen recently proposed the QNET method [9], based on reflective Brownian motion processes which need to be solved via partial differential equations. Although their method is more exact, an efficient solution only exists for models with just two stations. In their method πNET, they combine the QNET approach with an approximate decomposition. The resulting method is comparable in sophistication and accuracy with Whitt’s method. More recently, Dai et al. [6] proposed an alternative decomposition method, the sequential bottleneck method, in which an open queueing network is decomposed in a set of groups of queues, i.e., not necessarily in individual queues. Their method outperforms QNA and QNET in most cases, but not always; for details see their original paper. It should be noted that these refined decomposition methods address open queueing networks with infinite buffers. In the mid-1990’s we extended Whitt’s approach by both replacing a few of its approximations and by changing the modelling assumptions. In particular, we allowed for the inclusion of finite-buffer queueing stations and incorporated matrix-geometric and general Markovian techniques to analyse the individual queues exactly, instead of the approximations used originally [10, 11, 22]. In the current paper we extend this work in that we remove a few approximate steps in the decomposition procedure. In particular: – we now use exact results for the departure process of PH|PH|1|K queues, as first developed by Bocharov and Naumov [1]; – we make the per-queue analysis more efficient by using dedicated Markovian solution techniques as well as spectral expansion techniques; – we are now able to obtain more detailed information on the loss process, namely the expected “interloss” time and the variance. To the best of our knowledge, the method presented here is currently the “most exact” analytic/numerical approach for analysing large open networks of finitebuffer PH|PH|1|K queues. This paper is further organised as follows. In Section 2 we introduce the class of queueing network models we address as well as the measures we want to

obtain. Then, in Section 3 we discuss the form and size of the CTMCs underlying such queueing network models. Since such CTMCs can become too large to analyse, we present a fixpoint decomposition technique in Section 4. Details of the per-queue analysis required in this decomposition approach, both for finiteand infinite-buffer queues, are presented in Section 5. Tool support is briefly addressed in Section 6, as is a validation of the approach in Section 7. The paper is concluded in Section 8.

2 2.1

The model class and aimed-at performance measures Model class

We address open queueing networks consisting of nn service stations (nodes) interconnected by edges. Customers arrive from one of the ns external sources, go their way through the network and finally leave the queueing network at one or more exit points. Figure 3 shows a simple open queueing network which consists of nn = 3 queueing stations, ns = 1 external source (marked S) and a single exit point (marked E). During its sojourn in the queueing network, a customer visits one or more nodes and requests for service there. The queueing network may contain feedback connections. The routing of customers through the network is specified using routing probabilities ri,j (i, j = 1, · · · , nn), summarised in the nn × nn matrix R. Each node i has its own characteristics about the time it needs to serve a customer, specified via the mean service time E[Si ] = 1/µi and the squared co2 efficient of variation Cs,i . When the service of a customer at a node has finished, the customer leaves the node and either goes to the next node or leaves the queueing network. At the nodes, customers are served in FCFS order. Nodes may have infinite of finite buffer capacity for waiting customers. The number of customers that can be “stored” at a node, including the one in service, is denoted li . At nodes with finite buffer capacity, arriving customers may be lost in case all the buffer slots have already been occupied by other customers. The stream of lost customers at a node either directly leaves the queueing network, or it is routed to another node. The latter case allows us to explicitly model effects due to buffer overflows, such as they appear in communication systems with retransmission facilities. A traffic stream is specified by the first two moments of the interarrival times of its customers. When a traffic stream enters a node, its characteristics might change due to the service characteristics of the node, or due to the limited buffer capacity of the node. In a similar way, traffic stream characteristics change when they are superpositioned or split. In Figure 1 we illustrate these three operations. In Table 1 we give an overview of the involved model parameters. For ease of notation, we indicate the external source and sink of customers as node 0. Notice that we do not further address the so-called customer multiplication factor γi in this paper. It offers an approximate way to model customer combination and

λj,i

λa,i

λi,j ′

λd,i

i 1

3

2

Fig. 1. Operations on traffic streams: (1) merge, (2) serve, (3) split nn ns λ0,i 2 C0,i E[Si ] = 1/µi 2 CS,i γi li rP i,j 1 − j ri,j

the the the the the the the the the the

number of nodes in the network number of sources in the network external arrival rate at node i squared coefficient of variation of the external stream arriving at node i mean service time at node i squared coefficient of variation of the service time at node i customer multiplication factor at node i queueing capacity of node i routing probability from node i to node j (summarised in matrix R) (regular) departure probability from node i Table 1. Overview of the model parameters

customer splitting at a queueing station. In the sequel of this paper, all values γi are set to 1. 2.2

Performance measures

We will compute steady-state performance measures for the queueing network. Three classes of performance measures are computed: 1. For each node i we compute the following traffic-related measures: – the rate and squared coefficient of variation of the overall traffic arriving 2 at node i: λa,i and Ca,i ; – the rate and squared coefficient of variation of the traffic departing from 2 node i: λd,i and Cd,i ; – the rate and squared coefficient of variation of the traffic leaving the 2 network from node i: λnetd,i and Cnetd,i ; – the blocking ratio blocki at node i: blocki = – the overall offered load at node i: ρˆi =

λa,i −λd,i ; λa,i

λa,i µi ; λ

. Notice that – the server utilisation (or accepted load) at node i: ρi = µd,i i due to losses at finite-capacity nodes ρˆi can be larger than ρi . 2. For each node i we compute the following performance measures: – the expectation and variance of the waiting time Wi ; – the expectation and variance of the queue length Ni . 3. Finally, we derive the following network-wide performance measures:

– the total rate of external arrivals; – the total rate of traffic that leaves the network, excluding the traffic leaving the network due to losses; – the expectation and variance of the total number of customers in the network; – the expectation and variance of the total sojourn time of an average customer in the network.

3 3.1

CTMC representation of the queueing network Phase-type representation of service and arrival processes

PH-distributions form a special class of probability distributions that are defined via a finite continuous-time Markov chain consisting of a finite number m of transient states and one absorbing state. A PH-distribution is represented by the tuple (T, α), where α is an initial probability vector of length m, and where the m × m matrix T is part of the following generator matrix [16]:   T T0 Q= . 0 0 The matrix T contains the transition rates between the m transient states of the CTMC and the column vector T 0 contains the transition rate of each transient state to the absorbing state. The initial probability vector of the Markov chain is given as (α, αm+1 ). The i-th moment of the probability distribution of the time until absorption in state m + 1 is given as E[X i ] = (−1)i i! · αT−i e,

for i = 0, 1, . . .

(1)

In the queueing network model description, we specify interarrival and service times by the first two moments (E[X] and C 2 ). This allows us some freedom to select an appropriate PH-distribution to represent these. To leave the required computations as compact as possible, we aim at selecting the PH-distribution with the smallest number of states m, given the first two moments. We therefore have to distinguish two cases:   – In case C 2 ≤ 1, we use a hypoexponential distribution with m = C12 phases and initial probability vector α = (1, 0, · · · , 0) [22]. The matrix T is then given as   −λ0 λ0   −λ1 λ1     .. .. T=  , . .    −λm−2 λm−2  −λm−1 where λi = m/E[X], for 0 ≤ i < m − 2 and where q 2m(1 + 12 m(mC 2 − 1) mλm−1 λm−1 = and λm−2 = . E[X](m + 2 − m2 C 2 ) 2λm−1 E[X] − m

For small C 2 , PH-type distributions with a large number of transient states m will be obtained. To limit the computational requirements in the analysis 1 . This approximation corprocess we do not allow C 2 to be smaller than 10 responds to an Erlang-10 distribution and produces generally good results, also as approximation for deterministic services. – In case C 2 > 1, we take an hyperexponential distribution with m = 2 phases. Such a distribution has three free parameters (the choice probability p between the two possible phases and the rates µ1 and µ2 of the two phases). Fitting on the first two moments leaves us one degree of freedom. We resolve this by assuming so-called “balanced means”, meaning that the ratios p/µ1 and (1 − p)/µ2 should be equal. This then gives us that α = (p, 1 − p) and ! r 2p − E[X] 0 1 1 C2 − 1 with r = + . T= 2 2 C2 + 1 0 − 2(1−p) E[X] 3.2

The underlying CTMC of a queueing network

If we describe the external sources and the service times at the nodes of an open queueing network by PH-type distributions, we can construct a CTMC that represents the stochastic behaviour of the complete queueing network. Let ns be the number of external sources with PH-type arrival-time distribution (α(i) , Ti ) with ma,i transient states for arrival process i. Correspondingly, let nn be the number of nodes in the network with PH-type service-time distribution (β (i) , Si ) with ms,i transient states for service process i. We now distinguish two cases: 1. All queues of the network are finite. In this case the network can be represented by a finite CTMC with N oS states where nn ns Y Y ms,i li . ma,i · NoS = i=1

i=1

Obviously, even for small networks very large values for NoS will arise. 2. The network contains u nodes with unlimited queue length. The CTMC is an infinitely large u-dimensional quasi-birth-death process with a repetitive structure whose size N oSR depends on the sources and the service units: ns nn Y Y NoSR = ms,i li∗ , ma,i · i=1

i=1

where li∗ = li in case node i has finite capacity, and li∗ = 1 otherwise.

Since we are interested in steady-state performance measures of the queueing network, we have to compute the steady-state probability vector of the CTMC. In both the above cases the CTMC will be far too large for standard analysis methods. Hence, our goal is to reduce the CTMC. We achieve this by splitting the CTMC into nearly-independent smaller parts, one per node, which we analyse separately. This is achieved in the next section.

4

The decomposition approach: a fixed-point iteration

Considering the fact that the CTMC is a representation of a queueing network, there is an obvious way to partition the CTMC into a number of smaller CTMCs: if we suppose that the nodes of the network work fairly independently, we can construct one CTMC for each node and analyse it in isolation. Of course, there are dependencies between the nodes: the performance as perceived at a node depends on its input traffic stream which, in general, depends on the output of other nodes. Thus, our first goal is to compute the inter-node traffic streams. Since we do not know the traffic stream characteristics leaving the nodes, we can not compute the traffic stream characteristics for all the incoming streams in a single pass. Hence, to resolve the recursive dependencies, we apply a fixed-point iteration technique, as outlined in the following algorithm: 1 procedure AnalyseNetwork 2 λa := λ0 3 C 2a := C 20 4 λold := (0, . . . , 0) 5 C 2old := (0, . . . , 0) 6 while difference(λa , λold ) > ε or difference(C 2a , C 2old ) > ε do 7 λold := λa 8 C 2old := C 2a 9 foreach node i do 10 analyse: compute the output traffic of node i (2) 11 split: split the output traffic of node i (3) 12 merge: superposition the output traffic to input traffic (1) 13 (this step computes the new values for λa and C 2a ) 14 end 15 end 16 compute node performance results 17 compute network-wide results 18 end The algorithm fundamentally consists of three phases: lines 2 through 15 compute the traffic characteristics of the flows through the network, line 16 computes performance results for individual nodes, and line 17 computes network-wide performance results. In the following three sections we explain the operations performed in each of these phases. 4.1

Internal flows phase

The algorithm iterates on the arrival rate λa and the squared coefficient of variation C 2a of the incoming traffic1 . At the beginning of the iteration these 1

λa stands for a vector whose ith element is the arrival rate of incoming traffic to node i. This notation is analogously applied to all other node parameters.

values are set to the characteristics of the external sources (lines 2 and 3). The main part of the algorithm (lines 9 through 14) applies the following steps to each node i: it superpositions the traffic streams that flows into node i to one total input stream (line 12; operation “1” in Figure 1), then it computes the output 2 traffic (λd,i , Cd,i ) of the node (line 10; operation “2” in Figure 1 which will be explained in detail in Section 5), and finally the output is split into sub-streams for each outgoing connection (line 11; operation “3” in Figure 1). Note that the actual algorithm performs the superpositioning as last step since there is no output traffic at the beginning of the iterative procedure. The above three steps return intermediate values for λa and C 2a (actually in line 13) which are used as starting point for the next iteration. The iteration stops when the difference between two consecutive intermediate results is smaller than the desired precision. The difference is calculated by the function “difference” (line 6), e.g., by summing up the absolute differences between the components of the two vectors. Below, we will explain how traffic streams are split (line 11) and merged (line 12). The node analysis (line 10) will be elaborated on in Section 5. Splitting traffic streams When splitting traffic streams we assume that they are generated by renewal processes. In this case the traffic stream that flows from node i to node j is characterised by [24]: 2 2 + (1 − ri,j ) = ri,j · γi · Cd,i λi,j = λd,i · γi · ri,j and Ci,j

Analogous to the routing matrix R we can define a routing vector D for the traffic that leaves the network from the nodes (excluding the traffic that leaves the network due to losses): D = 1 − R1 . Thus, we obtain 2 2 λnetd,i = λd,i · γi · Di and Cnetd,i = Di · γi · Cd,i + (1 − Di ) .

Merging traffic streams The total arrival stream that enters node i is the sum of the traffic streams flowing from outside the network and from the other nodes into node i. Its PH-distribution can be constructed by combining the PHdistributions of the sub-streams. This would allow us to compute the moments of the inter-arrival time by using (1). However, the number of states of the resulting PH-distribution is given by the sum of the sizes of the constituting PHdistributions, which would result in very large PH-distributions when merging many traffic streams; therefore we do not pursue this approach and use Whitts hybrid superposition equation instead. The arrival rate is given exactly by nn X λj,i , λa,i = λ0,i + j=1

and the second moment of the inter-arrival time is approximated as [24, Eq. (33)]: ! n X 2 2 2 pi,j Ci,j , Ca,j = 1 − wj · 1 − p0,j C0,j − i=1

where pi,j = λi,j /λa,j is the proportion of arrivals to j that come from i and p0,j = λ0,i /λj is the proportion of arrivals to j that come from outside the network. The weight function wj is given by [24, Eq. (34–35)]:  −1 −1 Pn 2 wj = 1 + 4 (1 − λa,j /µj ) (vj − 1) . It should where vj = p20,j + i=1 p2i,j be noted that the expressions for these weights have been derived empirically. 2 Node analysis In this step we compute λd,i and Cd,i for each node i. We explain the employed algorithms in Section 5.1 for finite queues, and in Section 5.2 for infinite queues.

4.2

Node performance results

We describe the node performance phase in Section 5.1 for finite queues, and in Section 5.2 for infinite queues. The result of this phase is, for each node i, the expectation and the variance of both the queue length Ni and the waiting time Wi . 4.3

Network-wide results

The following overall network performance measures can be computed: – The total rate of external arrivals: λ0,total =

nn X

λ0,i .

i=1

– The total rate of traffic leaving the network excluding the traffic leaving the network due to losses: nn X λnetd,i . λnetd,total = i=1

– The expectation and the variance of the total number of customers in the network: nn nn X X Var [N ]i . E [N ]i and Var [N ]total = E [N ]total = i=1

i=1

Additionally, we are interested how an aggregate customer experiences the network. In the following we will state the results of Whitt [24]. The expected number of visits to node i for a customer that successfully traverses the network λa,i . (i.e., does not get lost) is given by E[Vi ] = λ0,enter Thus, the mean of the time T that a customer spends in node i is given by i   1 E [Ti ] = E [Vi ] µi + E [Wi ] , which leads to the expected total sojourn time of a customer in the network:   nn X 1 E [Vi ] E [T ] = + E [Wi ] . µi i=1

The results for the variance of Vi , Ti and T can be found in [24, Eq. (80)–(84)].

5 5.1

Analysis of PH|PH|1|K and PH|PH|1 queues Analysis of PH|PH|1|K queues

We start with a discussion of the CTMC underlying a node and its solution. We then discuss the output traffic of such a node and finally present how the per-node performance measures are to be computed. The underlying CTMC The analysis is based on the idea that the arrival and the service processes can be represented by PH-distributions where entering the absorbing state stands for the arrival of a customer respectively the departure of a served customer. Let (α, A) be the arrival PH-distribution with l states and (β, B) the service PH-distribution with m states. Then we can describe the behaviour of a node with queueing capacity k by a Quasi-Birth-Death (QBD) process with k + 1 levels where level 0 consists of l states and where levels 1 through k consists of l · m states. The i-th level represents the state of the system when it contains i customers. A step from level i to level i + 1 (i < k) stands for an arrival and a step from level i to level i − 1 (i > 0) stands for a departure. The l · m respectively l states of a level describe the current state of the arrival and of the service processes (level 0 contains only l states because the queue is empty and the service process has not yet started). This leads to the following generator matrix of the Markov chain:   A A0 α ⊗ β 0 0  I ⊗ B A ⊗ I + I ⊗ B A α⊗I   0 0   0 I⊗B β A⊗I+I⊗B A α⊗I Q=  , (2)   . . . . . .   . . . I ⊗ B0β

(A + A0 α) ⊗ I + I ⊗ B

where A0 = −A · 1, B 0 = −B · 1 and A ⊗ B is the tensor product of the matrices A and B. The steady-state solution v of the Markov chain with generator Q can be obtained by solving the global balance equation v · Q = 0 and v · 1 = 1 .

The vector v is of size l + k · l · m. In the following we write v 0 for the vector (v0 , . . . , vl−1 ) which contains the steady-state probabilities of level 0 and we write v i for the vector (vl+(i−1)·l·m , . . . , vl+i·l·m−1 ) which contains the steadystate probabilities of level i = 1, . . . , k. There are several methods available to solve the above global balance equations [20]. Since the system of equations is finite we can use direct solution methods, such as LU-decomposition or Gaussian elimination method. However, they both require the explicit representation of the entire generator matrix which

may be a problem since the matrix consists of (l + l · m · k)2 elements. To avoid these large storage demands, we apply an iterative solution technique, based on the Gauss-Seidel method, but adapted to the special structure of the generator matrix. We only store the non-zero (tri-) diagonal blocks of the matrix, thereby significantly reducing the memory requirements: the non-zero blocks of level 0 (and its transitions to/from level 1) contain l2 (2m + 1) elements, the non-zero blocks of levels 1 through K − 1 (and their transitions to/from the next level) contain 3l2 m2 each, and level K contains l2 m2 elements. Since levels 1 through K − 1 exactly have the same structure, we need to store them only once. This means that the memory consumption to store the generator matrix does not depend on the queueing capacity k. Of course, the complexity of the solution still depends on the queueing capacity, i.e., the number of non-zeroes (nz) that the Gauss-Seidel method has to process per iteration is nz = l2 (2m + 1) + 3l2 m2 (k − 1) + l2 m2 ∼ 3kl2 m2 , i.e., we observe linear dependence on the buffer size (k) and quadratic dependence on the size of the interarrival and service time representations. Instead of solving the global balance equations directly, we can also exploit the repetitive structure of the CTMC by using matrix-geometric methods or the spectral expansion method, both adapted for the finite-buffer case (see e.g., [2, 8, 21], [5] or [4, 3]). This will become beneficial whenever k is very large. Until we have performed experiments to find the most efficient technique depending on the size of k, we stick to the use of the adapted Gauss-Seidel iteration scheme. The output traffic The steady-state solution vector v now allows us to compute the departing traffic stream, i.e., the first two moments of the interdeparture time distribution of the node. First, we compute the probability vector v a,k that an arriving customer encounters a full buffer [1, Eq. (2.2)]: v a,k =

1 v (A0 ⊗ I) , λa k

where λa stands for the arrival rate to the node and k stands for the queueing capacity of the node. This leads us to the blocking probability π, i.e., the probability that an arriving customer encounters a full queue and is lost: π = v a,k · 1 . The departure rate of served customers is then given by λd = λa (1 − π) . We can use Bocharov’s results [1] to compute the second moment of the interdeparture time distribution of served customers as well: if the queue is not empty after a departure took place, the time to the next departure is distributed according to the service time. Otherwise, it is distributed like the PH-distribution

(v d,0 , A) where v d,0 are the probabilities of leaving an empty queue at departure instants: 1 v (I ⊗ B0 ) . v d,0 = λd 1 Then, the total variance of the interdeparture time distribution can be derived as the sum of the variance of the service time distribution and the variance of the PH-distribution (v d,0 , A): 2 σd2 = σs2 + σd,0

which finally leads to the squared coefficient of variation c2d = λ2d σd2 . The “blocking behaviour” of a finite-buffer node can be evaluated as follows. The rate of lost customers is simply given by λloss = λa π , where π is the probability that an arriving customer encounters a full queue. We can analyse the variance of the process of blocked customers of a finite-buffer node by a transformation of the CTMC describing the usual queueing behaviour of the node in a PH-distribution (γ, Γ ) whose absorbing state represents the loss of a customer. The CTMC of this PH-distribution is similar to the CTMC for the queueing process (Eq. (2)) but differs at one point: it will enter the absorbing state when an arrival occurs toa full queue. This leads us to the PH-distribution  Γ Γ0 where with generator matrix 0 0  A A0 α ⊗ β  I ⊗ B0 A ⊗ I + I ⊗ B  A0 α ⊗ I   0 0  0  β A ⊗ I + I ⊗ B A α ⊗ I I ⊗ B Γ=  .   . . . .. .. ..   0 I⊗B β A⊗I+I⊗B 

This matrix differs from the one in Eq. 2 in the lower right corner: an arrival to a full queue pushes the corresponding process into the absorbing state. This means that the restart of the arrival process (represented by the term A0 α) is no longer part of the transition rates between the transient states of the distribution. We still need the initial probability vector γ to compute the variance of the PH-distribution. The absorbing state of the PH-distribution represents the loss of a customer due to a full queue. The distribution “immediately restarts” after entering the absorbing state, i.e., the initial probability vector depends on the state of the system at absorption time. We know about absorption time that 1. the queue contains k customers, 2. the arrival process restarts with initial probability vector α, and 3. the state of the service process when an arriving customer is lost is distributed as v a,k .

The first item implies that the subvectors γ 0 , . . . , γ k−1 of γ are set to 0. Thus, we arrive at the following initial probability vector γ: γ=

1 v a,k · e

0 · · · 0 α ⊗ v a,k



,

where v 1 ·e is a renormalisation factor. Now that we have completely specified a,k the departure stream of lost customers as a PH-distribution, we can compute the second moment of this stream as 2γΓ−2 1 . We avoid the explicit inversion of Γ by solving the linear system of equations x(2) Γ2 = 2γ , and summing the elements of the vector x(2) [12]. Computing node performance The steady state probability vector v immediately gives us the first and the second moment of the queue length distribution: E [N ] =

k X

i · v i 1 and Var [N ] =

i=1

k X

2

i2 · v i 1 − E [N ]

.

(3)

i=1

The first and second moment of the waiting time distribution are given by [1]: 1 (E [N ] − 1 + v 0 1) , λd   2  2 Var [W ] = − E [W ] b1 q2 − q 1 1 ⊗ B−1 1 λd E [W ] =

where b1 = −βB−1 1 is the service rate. The components of the vector q 1 = Pk of the queue length distribution as a funci=2 (i − 1)v i give the first momentP k 1 tion of the system state and q2 = i=3 2 (i − 1)(i − 2)v i 1 is the second noncentral moment of the queue length distribution. Unlike E [N ] and Var [N ], q 1 and q2 only refer to the customers in the queue (not those in service). 5.2

Analysis of PH|PH|1 queues

In case the buffer is infinitely large, a similar type of regular CTMC as in the finite-buffer case is obtained; the only difference is the fact that the generator matrix has repeating columns ad infinitum. The steady-state probability vector can then be computed with the spectral expansion method [3, 15] or with matrix geometric methods [16, 12, 14]. We will not discuss these methods here, but only note that we currently use the spectral expansion method as it seems to be most efficient [17]. Since infinite queues produce no losses, we have  λa , if λa < µ, λd = error “overloading”, otherwise.

The variance of the output stream is calculated using the same approach as in 2 the case of finite-buffer queues, hence v d,0 = λ1d v 1 (I ⊗ B 0 ) and σd2 = σs2 + σd,0 2 where σd,0 is the variance of the PH-distribution with representation (v d,0 , A). Similar to the case of finite-buffer queues, Bocharov’s result to compute the queue length distribution and the waiting time distribution can be used here, thereby setting λd = λa and k = ∞. When using the spectral expansion or matrix geometric method, the equations in (3) are transformed into geometric series which lead to closed-form expressions for E [N ] and Var [N ].

6

Tool support

We have developed an analysis tool called FiFiQueues in C++ that implements the algorithms described in this paper. The tool consists of a command line based program that accepts the model specifications in the form of text files. When started it proceeds through five phases: The first phase is the input phase: the model description is read from the input file. The second, third and fourth phase respectively correspond to the three computation phases described in Section 4 and 5. The last phase is the output phase.

Fig. 2. Main window of the GUI

To ease the model specification, we have also developed a graphical user interface in Java for the analysis tool. Figure 2 shows its main window. FiFiQueues, including examples and documentation, is available at http://www-lvs.informatik.rwth-aachen.de/tools/fifiqs/fifiqs.html

7

Validation

In the following sections we evaluate the accuracy of the new decomposition and analysis algorithms. We do so by a comparison with simulation results; the simulations are performed with a discrete event-based simulation package (also developed in our group) for the simulation of queueing networks. Due to space constraints we present only a few validation results. 7.1

Single queues

We address a single queueing node with one external source. Obviously, for such a simple model only a few parts of the described algorithms are actually needed. In fact, the analysis consists of only two steps: 1. the transformation of the arrival and the service processes to a PH-distribution, 2. the analysis of the queueing station. Step 2 produces exact results provided that step 1 is correct, i.e., the original arrival and service processes can be represented by PH-distributions and the mapping algorithm (Section 3.1) chooses the right PH-distribution. Example 1: Single queue We consider a single finite queue with queueing capacity 20 and Erlang-2 distributed service time with mean 1. The arrival process is assumed to be hyperexponentially distributed with C 2 = 2 and rate 0.8. The algorithm fits it to a 2-phase hyperexponential distribution as described in Section 3.1. We compare the results of the analysis with the results of the corresponding simulation. The simulation uses the same distributions as the analysis. The results are shown in Table 2. The numbers in parentheses next to the simulation results represent the half-width of 95% confidence intervals as percentage of the simulation average. The numbers in the third column represent the relative errors2 of the analysis results relative to the simulation averages. The first two rows show the results for the departure traffic of the queue. The middle two rows show the results for the stream of blocked customers. The last two rows show the expected queue length and the expected waiting time. As can be seen, the relative errors of the analytic results are very small or are comparable to the half-width confidence intervals of the simulation runs. This is valid not only for the first moment, but also for the second moment of the inter-departure/loss time. 7.2

Queueing networks

In the analysis of general queueing networks, our decomposition approach employs three approximations: 2

The error of x relative to y is defined as (x − y)/y

Analysis λd 0.796 Cd2 1.08 Loss traffic λloss 0.00367 2 9.44 Closs Node performance E [W ] 4.43 E [N ] 4.32 Output traffic

Simulation 0.795 (0.279 %) 1.08 (0.513 %) 0.00360 (2.03 %) 9.23 (2.97 %) 4.42 (0.136 %) 4.32 (0.139 %)

Relative Error 0.126 % 0.0 % 1.94 % 2.28 % 0.226 % 0.0 %

Table 2. Results for the single queue network (Example 1)

1. the transformation of the arrival and service processes into PH-distributions, 2. the superpositioning of incoming traffic streams, 3. the splitting of outgoing traffic streams. We have already addressed the first issue in the previous example. Regarding the second issue, Whitt showed in [23] that his superpositioning approximation generally produces good results. Finally, the splitting of renewal processes is an exact operation and results in a new renewal process. However, it is only under special conditions that the process to be split is a renewal process. In the following two examples we will see that the above approximations have only a minor impact on the accuracy of the overall results of the decomposition method. Example 2: Queueing network with feedback We address three queueing nodes in series, with a feedback from the last to the first queue, as shown in Figure 3. The external Poisson source has rate 1.3 and the service times are Erlang-5 distributed with rate 1.5; the queueing capacity is 10 at all queues. The feedback probability is 25%. Note that even for this small network the underlying CTMC has (1 + 5 · 10)3 = 132651 states.

0.25

S

1

2

E

3 0.75

Fig. 3. 3-node queueing network with feedback (Example 2)

The results are shown in Table 3. The first two rows show the characteristics of the traffic departing the queueing network from node 3. The middle three rows show the customer loss rate at each node, and the last three rows show the expected queue length at each node. The analysis results for the offered load (respectively the accepted load) of node 1 through 3 are: 1.106 (0.978), 0.978 (0.965), and 0.965 (0.956). As can be observed, the decomposition method works

Output traffic λnetd,3 2 Cnetd,3 Loss traffic λloss,1 λloss,2 λloss,3 Queue length E[N1 ] E[N2 ] E[N3 ]

Analysis 1.076 0.4121 0.1916 0.01992 0.01286 6.473 4.432 3.959

Simulation 1.077 (0.0247 %) 0.4125 (0.0299 %) 0.1912 (0.1920 %) 0.01935 (1.362 %) 0.01128 (3.340 %) 6.471 (0.130 %) 4.455 (0.177 %) 3.948 (0.525 %)

Relative Error -0.0926 % -0.0970 % 0.2092 % 2.946 % 14.01 % 0.0309 % -0.516 % 0.279 %

Table 3. Results for the network with feedback (Example 2)

λa,1

very well even for server loads close to 1. Although the relative error of λloss,3 is quite large, its absolute error relative to λnetd,3 is tolerable. Figure 4 shows the incoming traffic to node 1 as a function of the number of iterations in the fixed-point procedure. As can be observed, the fixed-point is reached after a very small number of iterations. This behaviour has been typical for all queueing networks we have analysed so far.

1.7 1.65 1.6 1.55 1.5 1.45 1.4 1.35 1.3 1.25 0

1

2

3

4

5

Number of iterations Fig. 4. Incoming traffic to node 1 as a function of the number of iterations in the fixed-point procedure (Example 2)

Example 3: K¨ uhn’s nine-node network As a larger queueing network we evaluated a modified version of K¨ uhn’s nine-node network [13], as shown in Figure 5. The external arrival rate to nodes 1–3 equals 0.5 and C02 = 0.5. The service rate at each node is 1.0 (except for node 5 where µ5 = 0.5), and the

Cs2 = 0.5. All nodes are finite with queueing capacity of 25. The size of the underlying CTMC equals 23 · (1 + 25 · 2)9 ≈ 1.86 × 1016 .

1 5 0.3

2

0.7 0.1

4

7

9

0.3

0.2

0.6

0.1

6

0.8

0.7

0.3

8

0.7

0.2

3

Fig. 5. K¨ uhn’s nine-node network (Example 3)

This network has also been examined in [22, 11]. Table 4 shows the simulation results stated there, as well as our results for the departure rate at each node. The last column shows the errors of the analysis results relative to the simulation. Node Analysis Simulation Relative Error 1 0.500 0.500 0.000 % 2 0.500 0.500 0.000 % 0.500 0.501 -0.200 % 3 4 0.764 0.765 -0.131 % 5 0.229 0.228 0.437 % 6 0.570 0.571 -0.175 % 0.690 0.692 -0.290 % 7 8 0.557 0.557 0.000 % 9 0.881 0.881 0.000 % Table 4. Results for the departure rates in K¨ uhn’s nine-node network (Example 3)

8

Summary, conclusions and further work

In this paper we have presented a decomposition method for large open queueing networks with phase-type distributed service and interarrival times and mixed finite- and infinite-buffer queues. In comparison to our earlier work in this direction, the new method incorporates exact methods where in the past approximations were used and provides us with more detailed information on the losses of

customers at finite-buffer stations. In comparison with decomposition methods known from the literature, our method is the only one allowing for finite buffers and, hence, for customer losses. Experiments have shown that our method is both very accurate and converges in a small number of steps. For the near future we are planning to remove even more approximation steps in the decomposition procedure, e.g., regarding the superpositioning of traffic streams. We will also increase the modelling flexibility, e.g., by allowing for multiple customer classes, priority scheduling and Markovian arrival processes. In [7] an extension has been presented that allows to analyse multicast networks. The supporting tool will be extended accordingly; we will also integrate the discrete-event simulation algorithms in it. Next to these methodological developments, we will continue to use our method for the study of capacity planning problems in communication systems. Acknowledgements Ramin Sadre has been supported by the Ministry for Education, Science and Research of the Government of Nord-Rhein Westfalen, Germany, in the context of the “Schnelle Netze” (high-speed networks) project (107 003 98). Alexander Ost has been supported by the German Research Foundation (DFG) in the context of the “Graduiertenkolleg Informatik & Technik”; he is now with Ericsson Eurolab, Herzogenrath.

References 1. P.P. Bocharov. Analysis of the queue length and the output flow in single server with finite waiting room and phase type distributions. Problems of Control and Information Theory, 16(3):211–222, 1987. 2. P.P. Bocharov and V. Naumov. Matrix-geometric stationary distribution for the PH|PH|1|r queue. Elektronische Informationsverarbeitung und Kybernetik, 22(4):179–186, 1986. 3. R. Chakka. Performance and Reliability Modelling of Computing Systems Using Spectral Expansion. PhD thesis, University of Newcastle upon Tyne, 1995. 4. R. Chakka and I. Mitrani. Spectral expansion solution for a finite capacity multiserver system in a Markovian environment. In D.D. Kouvatsos, editor, Proceedings of the 3rd International Workshop on Queueing Networks with Finite Capacity, pages 6.1–6.9, 1995. 5. S. Chakravarthy and M.F. Neuts. Algorithms for the design of finite-capacity service units. Naval Research Logistics, 36:147–165, 1989. 6. J.G. Dai, V. Nguyen, and M.I. Reiman. Sequential bottleneck decomposition: an approximation method for generalised Jackson networks. Operations Research, 42(1):119–136, 1994. 7. G. Schneider, M. Schuba, and B.R. Haverkort. QNA-MC: A performance evaluation tool for communication networks with multicast data streams. In R. Puigjaner, N.N. Savino, and B. Serra, editors, Proceedings of the 10th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, Lecture Notes in Computer Science 1469, pages 63–74. Springer-Verlag, 1998.

8. L. G¨ un and A.M. Makowski. Matrix geometric solutions for finite capacity queues with phase-type distributions. In P.J. Courtois and G. Latouche, editors, Proceedings Performance 1987, pages 269–282, North-Holland, 1988. 9. J.M. Harrison and V. Nguyen. The QNET method for two-moment analysis of open queueing networks. Queueing Systems, 6:1–32, 1990. 10. B.R. Haverkort. Approximate analysis of networks of PH|PH|1|K queues: Theory & tool support. In H. Beilner and F. Bause, editors, Quantitative Evalution of Computing and Communication Systems, Lecture Notes in Computer Science 977, pages 239–253. Springer-Verlag, 1995. 11. B.R. Haverkort. Approximate analysis of networks of PH|PH|1|K queues with customer losses: Test results. Annals of Operations Research, 79:271–291, 1998. 12. B.R. Haverkort. Performance of Computer Communication Systems: A ModelBased Approach. John Wiley & Sons, 1998. 13. P.J. K¨ uhn. Approximate analysis of general queueing networks by decomposition. IEEE Transactions on Communications, 27(1):113–126, 1979. 14. G. Latouche and V. Ramaswami. The P H/P H/1 queue at epochs of queue size change. Queueing Systems, 25:97–114, 1997. 15. I. Mitrani and R. Chakka. Spectral expansion solution of a class of Markov models: Application and comparison with the matrix-geometric method. Performance Evaluation, 23:241–260, 1995. 16. M.F. Neuts. Matrix Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins University Press, 1981. 17. A. Ost and B.R. Haverkort. Steady-state analysis of infinite stochastic petri nets: A comparison between the spectral expansion and the matrix-geometric method. In Proceedings of the 7th International Workshop on Petri Nets and Performance Models, pages 36–45. IEEE Computer Society Press, 1997. 18. H.G. Perros. Queueing Networks with Blocking. Oxford University Press, 1994. 19. M. Sczittnick and B. M¨ uller-Clostermann. MACOM—a tool for the Markovian analysis of communication systems. In R. Puigjaner, editor, Participants Proceedings of the 4th International Conference on Data Communication Systems and Their Performance, pages 456–470, 1990. 20. W.J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1994. 21. D. Wagner, V. Naumov, and U. Krieger. Analysis of a finite-capacity multi-server delay-loss system with a general Markovian arrival process. In A.S. Alfa and S. Chakravarthy, editors, Matrix-Analytic Methods in Stochastic Models. MarcelDekker, 1996. 22. A.J. Weerstra. Using matrix-geometric methods to enhance the QNA method for solving large queueing networks. Master’s thesis, University of Twente, Department of Computer Science, 1994. 23. W. Whitt. Performance of the queueing network analyzer. The Bell System Technical Journal, 62(9):2817–2843, 1983. 24. W. Whitt. The queueing network analyzer. The Bell System Technical Journal, 62(9):2779–2815, 1983.