Predictive and Dynamic Resource Allocation for Enterprise Applications

2010 10th IEEE International Conference on Computer and Information Technology (CIT 2010)

M. Al-Ghamdi, A. P. Chester, S. A. Jarvis
Department of Computer Science, University of Warwick, Coventry, UK
[email protected]

Abstract—Dynamic resource allocation has the potential to provide significant increases in total revenue in enterprise systems through the reallocation of available resources as the demands on hosted applications change over time. This paper investigates the combination of workload prediction algorithms and switching policies: the former aim to forecast the workload associated with Internet services, the latter switch resources between applications according to certain system criteria. An evaluation of two well-known switching policies – the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP) – is conducted in the context of seven workload prediction algorithms. This study uses real-world workload traces consisting of approximately 3.5M requests, and models a multi-tiered, cluster-based, multi-server solution. The results show that a combination of the bottleneck-aware switching policy and workload predictions based on an autoregressive, integrated, moving-average model can improve system revenue by as much as 43%.

Keywords—predictors; dynamic resource allocation; enterprise applications; switching policies

I. INTRODUCTION

As the use of enterprise applications becomes more widespread, issues concerning infrastructure performance and dependability become more significant. It is widely recognised that a slow or unreliable response from an e-business site is one of the main reasons for a customer to seek an alternative [26]. Such issues are mitigated through capacity planning and workload forecasting [27]. However, forecasting is error prone – one need only look at recent examples from the financial markets, climate studies or production and operations management [17] to be aware of these concerns. The observation of past values in order to anticipate future behaviour is the essence of the forecasting process as seen in this paper. Numerous predictors are discussed and the way in which they are applied in the context of dynamic resource allocation is analysed. Our premise is that workload forecasting may assist revenue-generating enterprise systems which already employ methods of dynamic resource allocation; however, as with forecasting in other domains, the predictions may in fact be wrong, and this may result in server reallocation to the detriment of the service.

Dynamic resource allocation systems have been shown to improve revenue in such environments by reallocating servers into a more beneficial configuration; contrast this with static systems, which are periodically unable to deal with significant changes in workload and, as a result, suffer a decrease in revenue. Enterprise systems usually employ a multi-tier architecture, which provides a clear separation of roles between the tiers: each tier can be modified or replaced without affecting the other tiers. Commonly a multi-tier architecture consists of three tiers: a client-facing web tier, which is responsible for receiving requests from clients and sending responses back; an application tier, which hosts the application logic; and a data-persistence tier, which usually comprises a relational database management system (RDBMS). At each tier servers may be clustered to provide high availability and to improve performance.

In this paper, a typical enterprise system is modelled using a multi-class closed queuing network to compute the various performance metrics (such a representation is common, as there is a limit to the number of simultaneous customers logged into the system [21]). An advantage of using an analytical model compared with other approaches is that we can easily capture the different performance metrics, identify potential bottlenecks and, importantly, investigate a wide variety of hypothetical scenarios without running the actual system. One should envisage such a model running alongside a real system, where the model can react to parameter changes while the application is running (e.g. from monitoring tools or system logs) and make dynamic server switching decisions to optimise predefined performance metrics [28]. In order to improve the reliability of the approach, we use the work developed in [5], and used in [1] and [28], where convex polytopes are used for bottleneck identification in multi-class queuing networks. Revenue may also be affected by overloading, where the response time increases significantly as a result; thus an admission control policy is developed and used throughout [1], [28]. The approach used here is quite different from that found in [8], where an algorithmic approach is used to optimise a resource allocation problem in which resources are given in discrete units; it also differs from the graph-theoretic approach for solving a resource allocation optimisation problem used in [24], later extended to the case where there are multiple classes of resources in [25].

Two well-known methods are used for data collection in web analytics [16]. In the first, called server-side data collection, the log files in which all transactions and requests to the web site are stored undergo systematic analysis; in the second, the visitor's Web browser is used to collect data. In this paper we employ the first method, that is, collecting the data directly from the web server. The workload can be characterised at four different levels: the business layer, the session layer, the function layer, and the HTTP-request layer [20]. Here the real-world Internet workload is characterised at the second of these levels, where the sets of requests issued by different users are clustered periodically.

Workload forecasting approaches can be divided into two categories: quantitative and qualitative [22]. The qualitative approach is a subjective process based on information such as expert opinion, historical analogy, and commercial knowledge. The quantitative approach – the estimation of future values of workload parameters from historical data – is the approach to forecasting used in this work. As in previous capacity planning work [4], [10], [22], we generate a workload model from the characterisation of real data. The predictive forecasting is based on past values, using several different (but common) predictors: Last Observation (LO), Simple Average (SA), Sample Moving Average (SMA), Exponential Moving Average (EMA), Low Pass Filter (LPF), and two AutoRegressive Integrated Moving Average (ARIMA) models, AR(1) and AR(2). These forecasting algorithms are combined with two well-known switching policies – the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP). The forecasting and switching work in tandem: after applying the predictor, the system's resources are reallocated with respect to the prediction.

A. Paper Contributions and Structure

The contributions of this paper are as follows:





• We construct a model-based environment in which the combination of workload prediction and dynamic server switching can be explored. A multi-tiered, cluster-based, multi-server solution is modelled, which contains bottleneck identification through the use of convex polytopes and also admission control. A workload model is also constructed from the characterisation of real data;
• After introducing several schemes for workload prediction in this context, the forecast accuracy of these schemes is compared;
• An evaluation of two well-known switching policies – the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP) – is conducted in the context of seven workload prediction algorithms. All fourteen cases are compared with a control system where no switching is applied.

The remainder of this paper is organised as follows: Section II presents related literature and contrasts it with our own work; the modelling of multi-tiered Internet services and the associated revenue function are described in Section III; in Section IV we present the bottleneck identification and admission control mechanisms used to enhance both the overall system's revenue and the realism of the model; the dynamic resource allocation policies applied to our system are found in Section V; Section VI describes the workload used in the experiments and the predictive algorithms; the experimental setup and results can be found in Section VII, and the paper concludes in Section VIII.

II. RELATED WORK

The work in [15] focussed on maximising profits of best-effort requests when combined with requests requiring a specific quality of service (QoS) in a web farm. It is assumed there that arrival rates of requests are static, whilst the arrival rates in our work are dynamic. In [11] the authors attempt to maximise revenue by partitioning servers into logical pools and switching servers at runtime. This paper differs from [11] in that the switching is considered in a multi-tier environment. This paper differs from [29] in the following respects: 1) the experiments are conducted using the same switching policies (the proportional switching policy and the bottleneck-aware switching policy), but with additional model-based workload prediction; 2) the workload used in our work is also based on real-world Internet traces, but is extensive, containing two months' worth of HTTP requests to the NASA Kennedy Space Center web server in Florida [2].

The work in [6] examines the effectiveness of admission control policies in commercial web sites. A simple admission control policy was developed in our previous work [28] and is applied again in this paper. The use of five different predictive algorithms (regression method, linear regression, nonlinear methods, moving average, and exponential smoothing) to enable workload forecasting for Web services is found in [22]. Several of the predictors found here (last observation, sample average, low pass filter, and ARIMA model) were used in [12] to predict the behaviour of data exchange in the Globus Grid middleware MDS. A number of predictive methods (running average, single last observation, and low pass filter) were used in [7] for optimising the choice of indexers made by QM (reducing QM wait time and thus user wait time). To the best of our knowledge this is the first example of a model-based workload prediction and dynamic server switching analysis in the context of real-world HTTP requests.


III. MODELLING OF MULTI-TIERED INTERNET SERVICES AND THE REVENUE FUNCTION

A description of the system model used in this work, together with the revenue function, is presented in this section. The notation used in this paper is summarised in Table I.

Table I
NOTATION USED IN THIS PAPER

Symbol    Description
S_ir      Service time of job class-r at station i
v_ir      Visiting ratio of job class-r at station i
N         Number of service stations in QN
K         Number of jobs in QN
R         Number of job classes in QN
K_ir      Number of class-r jobs at station i
m_i       Number of servers at station i
φ_r       Revenue of each class-r job
π_i       Marginal probability at centre i
T         System response time
D_r       Deadline for class-r jobs
E_r       Exit time for class-r jobs
P_r       Probability that class-r job remains
X_r       Class-r throughput before switching
X'_r      Class-r throughput after switching
U_i       Utilisation at station i
t_s       Server switching time
t_d       Switching decision interval

A. The System Model

[Figure 1: A model of a typical configuration of a cluster-based, multi-tiered Internet Service]

A multi-tiered Internet service can be modelled using a multi-class closed queuing network [23], [30]. The closed queuing network model used in this paper is illustrated in Figure 1. In a multi-class closed queuing network, S_ir represents the service time, defined as the average time spent by a class-r job during a single visit to station i, and v_ir represents the visiting ratio of class-r jobs at station i. The service demand D_ir is defined in [14] as the sum of the service times at a resource over all visits to that resource during the execution of a transaction or request:

D_{ir} = S_{ir} \cdot v_{ir}

The total population of the network (K), the mean system response time T_i(k), the throughput of class-r jobs, the mean queue length K_ir, and the per-class station utilisation U_ir(k) are described in detail in [1], [28], [29].

B. Modelling the Revenue Function

In [19] a session is defined as a sequence of requests of different types made by a single customer during a single visit to a site. Where a client request is met within its deadline, the maximum revenue is obtained; the revenue obtained from requests that are not served within the deadline decreases linearly to zero, at which point the request exits the system. Equation (1) describes the probability function of request execution in the system, denoted P(T_r), as used in our model; here r, D_r, T_r, and E_r represent the request, its deadline, the response time, and the time at which the request is dropped from the system, respectively.

P(T_r) = \begin{cases} 1, & T_r < D_r \\ \frac{E_r - T_r}{E_r - D_r}, & D_r \le T_r \le E_r \\ 0, & T_r > E_r \end{cases}    (1)

With respect to the probability of request execution, the gained and lost revenue are calculated. The revenue lost in pool i, denoted V^i_{loss}, is calculated in equation (2), under the assumption that servers are switched from pool i to pool j; equation (3) is used to calculate the revenue gained in pool j, V^j_{gain}. Note that because the servers are being switched, they cannot be used by either pool i or pool j during the switching process, and the time that the migration takes cannot be neglected. The revenue gained from the switching process is therefore calculated over the switching decision interval t_d, as shown in equation (3), where the switching decision interval is greater than the switching time t_s.

V^{i}_{loss} = \sum_{r=1}^{R} X^{i}_{r}(k^{i})\,\phi^{i}_{r}\,P(T_r)\,t_d \;-\; \sum_{r=1}^{R} X'^{i}_{r}(k^{i})\,\phi^{i}_{r}\,P(T_r)\,t_d    (2)

V^{j}_{gain} = \sum_{r=1}^{R} X'^{j}_{r}(k^{j})\,\phi^{j}_{r}\,P(T_r)\,(t_d - t_s) \;-\; \sum_{r=1}^{R} X^{j}_{r}(k^{j})\,\phi^{j}_{r}\,P(T_r)\,(t_d - t_s)    (3)
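To make equations (1)–(3) concrete, the following is a minimal numerical sketch of the revenue probability and the per-pool revenue lost and gained when servers are switched from pool i to pool j. All parameter values (throughputs, revenue weights, interval lengths) are hypothetical illustrations, not values taken from the paper.

```python
def revenue_probability(T_r, D_r, E_r):
    """Equation (1): full revenue inside the deadline D_r, decaying linearly
    to zero at the exit time E_r."""
    if T_r < D_r:
        return 1.0
    if T_r <= E_r:
        return (E_r - T_r) / (E_r - D_r)
    return 0.0

def revenue_loss(X_before, X_after, phi, P, t_d):
    """Equation (2): revenue lost in the donating pool i over the decision interval t_d."""
    return sum((xb - xa) * f * p * t_d
               for xb, xa, f, p in zip(X_before, X_after, phi, P))

def revenue_gain(X_before, X_after, phi, P, t_d, t_s):
    """Equation (3): revenue gained in the receiving pool j; the switched servers
    are unusable for the first t_s seconds of the interval."""
    return sum((xa - xb) * f * p * (t_d - t_s)
               for xb, xa, f, p in zip(X_before, X_after, phi, P))

# Hypothetical example with two job classes (gold, silver), a 60s decision
# interval and a 10s switching time.
P = [revenue_probability(0.8, 1.0, 3.0), revenue_probability(2.0, 1.0, 3.0)]
loss_i = revenue_loss([12.0, 20.0], [9.0, 16.0], phi=[1.0, 0.5], P=P, t_d=60.0)
gain_j = revenue_gain([8.0, 14.0], [11.0, 18.0], phi=[1.0, 0.5], P=P, t_d=60.0, t_s=10.0)
print("switch" if gain_j > loss_i else "do not switch")
```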

After calculating the gained and lost revenue using equations (2) and (3), servers may be switched between the pools. In this paper servers are only switched between the same tiers, and only when the revenue gained is greater than the revenue lost.

IV. BOTTLENECK AND ADMISSION CONTROL

A bottleneck in the system may shift between tiers according to changes in the workload mix and the number of jobs in the system [3]. Bottleneck identification should therefore be one of the first steps in any performance study; any system upgrade which does not remove the bottleneck(s) will have no impact on system performance at high loads, see [18]. Our work in [29] uses the convex polytopes approach to identify bottlenecks in two different server pools using two job classes (gold and silver). From those results we conclude that the bottleneck may occur at any tier and may shift between tiers; there is also the possibility that the system enters a state in which more than one tier becomes a bottleneck. This method can compute the set of potential bottlenecks in a network with one thousand servers and fifty customer classes in just a few seconds; it is therefore sufficiently efficient for the purpose of this study.

Overloading can cause a significant increase in the response time of requests, which leads to an obvious degradation in revenue. Admission control is a possible solution to the overloading problem. A simple admission control policy was developed in our previous work [28] and is applied again in this research. This policy works by dropping less valuable requests when the response time exceeds a threshold, thereby maintaining the number of concurrent jobs in the system at an appropriate level.

V. SERVER SWITCHING POLICIES

In a statically allocated system, comprised of many static server pools, a high workload may exceed the capacity of one of the pools, causing a loss in revenue, while lightly loaded pools may be considered wasted resources if their utilisation is low. In other words, a fixed allocation of servers is insufficient for an application when its workload is high, yet represents wasted resources for the remaining applications while their workload is light. Dynamic resource allocation has been shown to provide a significant increase in total revenue through the switching of available resources in accordance with the changes in each application's workload. The policies utilised here are the Proportional Switching Policy (PSP) and the Bottleneck-aware Switching Policy (BSP).

A. Proportional Switching Policy

The proportional switching policy was first presented in [28] and then used in [1], [29]. This policy works by allocating servers at each tier in proportion to the workload, subject to an improvement in revenue; a simple sketch of this allocation rule is given below.
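The following sketch illustrates the proportional allocation rule for a single tier, assuming hypothetical per-application workload forecasts; it is illustrative only and omits the revenue-improvement check that the full policy performs before committing a switch.

```python
def proportional_allocation(total_servers, predicted_workload):
    """Split one tier's servers across application pools in proportion to the
    predicted workload; largest-remainder rounding keeps the total constant."""
    total = sum(predicted_workload)
    shares = [total_servers * w / total for w in predicted_workload]
    alloc = [int(s) for s in shares]
    # hand the servers lost to rounding to the pools with the largest remainders
    for i in sorted(range(len(shares)), key=lambda i: shares[i] - alloc[i], reverse=True):
        if sum(alloc) == total_servers:
            break
        alloc[i] += 1
    return alloc

# e.g. 10 application-tier servers and forecasts of 620 and 180 requests per period
print(proportional_allocation(10, [620, 180]))   # -> [8, 2]  (hypothetical figures)
```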

B. Bottleneck-aware Switching Policy

There are several factors that may affect the system's performance (e.g. the workload mix and the revenue contribution from individual classes of job in different pools). The second algorithm used in this work is the bottleneck-aware switching policy, which accounts for these factors (using the techniques described in Section IV) in order to obtain improved performance. This is a best-effort algorithm [29].

VI. THE WORKLOAD AND PREDICTIVE ALGORITHMS

A. The Workload

The workload is the set of all inputs the system receives from its environment during any given period of time [9], [22]. In this study the workload (see Figure 2) is based on Internet traces containing two months' worth of HTTP requests to the NASA Kennedy Space Center web server in Florida [2]. The trace contains 3,461,612 requests, from 00:00:00 July 1, 1995 to 23:59:59 August 31, 1995. In typical fashion (see also [4], [10], [22]) we characterise this workload to form a workload model, which can then be used as the input to our system model.

[Figure 2: The total requests for both application pools (Pool 1 and Pool 2), per time period, from 00:00:00 July 1, 1995 to 23:59:59 August 31, 1995]
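As a rough sketch of the server-side characterisation step, the HTTP trace can be bucketed into fixed-length periods and the requests per period counted. The file name, the period length and the absence of any pool split below are illustrative assumptions, not details taken from the paper.

```python
import re
from collections import Counter
from datetime import datetime

LOG_TIMESTAMP = re.compile(r'\[(?P<ts>[^\]]+)\]')   # e.g. [01/Jul/1995:00:00:01 -0400]
PERIOD_SECONDS = 3600                                # one bucket per hour (assumed)

def requests_per_period(path):
    """Count requests per fixed-length period from a Common Log Format trace."""
    counts = Counter()
    with open(path, errors="replace") as log:
        for line in log:
            m = LOG_TIMESTAMP.search(line)
            if not m:
                continue
            ts = datetime.strptime(m.group("ts").split()[0], "%d/%b/%Y:%H:%M:%S")
            counts[int(ts.timestamp()) // PERIOD_SECONDS] += 1
    # return counts ordered by period so they can drive the workload model
    return [counts[k] for k in sorted(counts)]

workload = requests_per_period("NASA_access_log_Jul95")   # hypothetical file name
```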

B. Predictive Algorithms

Several predictive algorithms are employed; the notation used in the predictors is summarised in Table II.

1) Last Observation (LO): The forecast is based on the most recent observation, on the basis that the last value is most likely to reflect near-future behaviour [12]:

P_x = V_{x-1}

2) Simple Average (SA): The simple average algorithm is appropriate for short-term forecasting [22], and the accuracy achieved by the technique is usually high when it is applied to nearly stationary data [13]. In this algorithm the predicted value P_x is the mean of all the previous observations:

P_x = \frac{1}{x} \sum_{i=0}^{x-1} V_i

3) Sample Moving Average (SMA): The predicted value P_x is the mean of the past performance values within a sample set of size s, with equal weight given to each value. For this study the sample set size s is 3; other set sizes can easily be tested within the proposed framework:

P_x = \frac{1}{s} \sum_{i=x-s}^{x-1} V_i

4) Exponential Moving Average (EMA): In this predictor, an older value within the sample set is given less importance than a newer value. This is achieved by applying a weighting factor α, which declines exponentially with the age of each value in the set:

P_x = P_{x-1} + \alpha \cdot (V - P_{x-1})

5) Low Pass Filter (LPF): The low pass filter also weights recent data more heavily than older data, with the weight on each observation decreasing exponentially with the number of observations:

P_x = (w \cdot P_{x-1}) + ((1 - w) \cdot V)

Here w is the weighting parameter and takes a value between 0 and 1. If w = 0, the low pass filter is the same as the last observation (LO); if w = 1, the filter never changes. To increase the accuracy of the low pass filter prediction, the weighting parameter w is set to 0.95; see [7] for more details.

6) Autoregressive Integrated Moving Average Model (ARIMA): The autoregressive integrated moving average model is denoted ARIMA(p, d, q), where p indicates the order of the autoregression, d the amount of differencing, and q the order of the moving-average part. The experiments have been conducted using two well-known ARIMA models. The first, denoted AR(1), forecasts the next value from the value in the last time period; the second, in which the forecast is based on the values in the previous two time periods, is termed AR(2). Throughout, P_x, P_{x-1}, V, s, and α represent the predicted value, the previous predicted value, the last performance value, the sample set size, and the weighting factor, respectively.

Table II
NOTATION USED IN THE PREDICTORS

Symbol     Description
P_x        The predicted value
P_{x-1}    The previous predicted value
V          The last performance value
s          The sample set size
α          The weighting factor
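To make the predictors concrete, the following is a small sketch of how they might be implemented. The sample set size s = 3 and the low-pass weight w = 0.95 follow the values given above; the EMA weight α and the least-squares AR(2) forecaster are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def last_observation(v):                       # LO: P_x = V_{x-1}
    return v[-1]

def simple_average(v):                         # SA: mean of all previous observations
    return sum(v) / len(v)

def sample_moving_average(v, s=3):             # SMA: mean of the last s observations
    window = v[-s:]
    return sum(window) / len(window)

def exponential_moving_average(v, alpha=0.5):  # EMA: P_x = P_{x-1} + alpha * (V - P_{x-1})
    p = v[0]
    for value in v[1:]:
        p = p + alpha * (value - p)
    return p

def low_pass_filter(v, w=0.95):                # LPF: P_x = w * P_{x-1} + (1 - w) * V
    p = v[0]
    for value in v[1:]:
        p = w * p + (1 - w) * value
    return p

def ar2(v):
    """A simplified AR(2) stand-in: fit y_t = c + a1*y_{t-1} + a2*y_{t-2}
    by least squares and forecast the next value."""
    y = np.asarray(v, dtype=float)
    if len(y) < 4:
        return y[-1]                           # not enough history; fall back to LO
    X = np.column_stack([np.ones(len(y) - 2), y[1:-1], y[:-2]])
    c, a1, a2 = np.linalg.lstsq(X, y[2:], rcond=None)[0]
    return c + a1 * y[-1] + a2 * y[-2]

history = [620, 583, 540, 610, 655, 700]       # hypothetical requests per period
for name, f in [("LO", last_observation), ("SA", simple_average),
                ("SMA", sample_moving_average), ("EMA", exponential_moving_average),
                ("LPF", low_pass_filter), ("AR(2)", ar2)]:
    print(name, round(f(history), 1))
```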

VII. EXPERIMENTAL SETUP AND RESULTS

Two applications are modelled, running on two logical pools. Each pool is multi-tiered, with each tier comprising a cluster of servers. The service time S_ir, the visiting ratio v_ir and the remaining experimental parameters are based on realistic (i.e. sampled) values, or on those supplied in the supporting literature [28].

Several measures are used to assess forecasting accuracy; this is done by calculating the predicted values from the different predictive algorithms and comparing these with the actual values (derived from the system model). Various accuracy measures have been used in the literature and their properties are well understood. The forecast accuracy measures used here are the Mean Squared Error (MSE), the Mean Absolute Percentage Error (MAPE), the Mean Absolute Deviation (MAD) and the Cumulative sum of Forecast Error (CFE). Table III gives the supporting mathematics for each of these measures; in these equations N, O, and P represent the data sample set size, an observed value and a predicted value, respectively.

Table III
FORECASTING ACCURACY MEASURES

Measure   Equation
MSE       \frac{1}{N} \sum_{i=0}^{N-1} (O_i - P_i)^2
MAPE      \frac{1}{N} \sum_{i=0}^{N-1} \frac{|O_i - P_i|}{O_i} \times 100
MAD       \frac{1}{N} \sum_{i=0}^{N-1} |O_i - P_i|
CFE       \sum_{i=0}^{N-1} (O_i - P_i)
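As a concrete reference, the four measures of Table III can be computed directly from a list of observed and predicted values; this is a minimal sketch rather than the paper's evaluation code, and the example inputs are hypothetical.

```python
def forecast_accuracy(observed, predicted):
    """MSE, MAPE, MAD and CFE as defined in Table III."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    return {
        "MSE":  sum(e * e for e in errors) / n,
        "MAPE": sum(abs(e) / o for e, o in zip(errors, observed)) * 100 / n,
        "MAD":  sum(abs(e) for e in errors) / n,
        "CFE":  sum(errors),
    }

# hypothetical observed workload and one predictor's forecasts
print(forecast_accuracy([620, 583, 540, 610], [600, 590, 555, 596]))
```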

The resulting values from each of the forecast accuracy measures (MSE, MAPE, MAD, and CFE), with respect to the observed performance values, are shown in Table IV. In each case we are looking for a value as close to zero as possible. Thus, with regard to the MSE, SA has the highest value and AR(2) the lowest, which means that under this measure the most accurate predictor is AR(2) and the least accurate is SA. EMA performs well (and SA less well) when MAPE and MAD are applied to determine forecast accuracy, while EMA is least accurate and AR(1) most accurate with CFE. From the results in Table IV it can therefore be seen that the recommended predictors for use in the system are AR(2), EMA, EMA, and AR(1) when the forecast accuracy measures are MSE, MAPE, MAD and CFE respectively.

Table V shows the system's gain in revenue from applying the different predictive algorithms with the two server switching policies. In each case the results show the baseline revenue when no switching policy is applied (NSP) and the case when the switching policy alone (without forecasting) is applied; these provide the indicators against which the new results can be compared. As can be seen from Table V, the predictors applied to the proportional switching policy (PSP) provide better revenue than the original PSP without prediction, with just one exception: when the predictor AR(1) is used, the revenue drops by 0.9%. Nevertheless all the predictors, including AR(1), perform better than the non-switching policy (NSP). The improvement in system revenue is 3.5%, 3.2%, 2.8%, and 2.5% when the predictors SMA, LO, EMA, and AR(2) are applied to the PSP respectively; the improvement with the remaining predictors, LPF, SA, and AR(1), is 2.1%, 1.5%, and -0.9% respectively.

Table V also shows that the revenue improvement from applying the different predictors with BSP is at least 18% over that obtained when the BSP is applied to the system without prediction. The improvement lies between 18.7% and 34.1% when EMA, SMA, LPF, SA, and LO are applied with BSP. In addition, revenue is improved by 38.7% when AR(2) is applied with BSP, and by a further 4.6% (43.3% in total) when AR(1) is applied with BSP. These are significant revenue gains.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper we construct a model-based environment in which the combination of workload prediction and dynamic server switching can be explored. A multi-tiered, cluster-based, multi-server solution is modelled, which contains bottleneck identification through the use of convex polytopes and also admission control. A workload model is also constructed from the characterisation of real data.

We investigate the behaviour of server switching policies in the context of workload predictors. Several schemes for workload prediction are explored and the forecast accuracy of these schemes is compared. An evaluation of two well-known switching policies – the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP) – is conducted in the context of seven workload prediction algorithms. All fourteen cases are compared with a control system where no switching is applied.

It has been found that the SMA and AR(1) predictors are the most effective with PSP and BSP respectively, yielding the highest revenue, while the lowest revenue is achieved when the AR(1) and EMA forecasting strategies are applied with PSP and BSP respectively. Interestingly, the effectiveness of each predictor differs noticeably from one policy to the other, and there is no general case in which improvements in revenue can be guaranteed. We have demonstrated that revenue can be improved by as much as 43% if the right combination of dynamic server switching and workload forecasting is used.

We are in the process of verifying these conclusions on other data sets and plan to extend the system model to capture larger-scale systems. The ultimate aim of this work is to identify a way of automatically selecting the most effective dynamic server-switching and workload-forecasting strategies; this will no doubt depend on the configuration of the system and the nature of the workload being applied.


Table IV
FORECAST ACCURACY, AGAINST FOUR DIFFERENT CRITERIA, FOR THE SEVEN FORECAST ALGORITHMS

Measure                                    LO       SA        SMA      EMA        LPF      AR(1)   AR(2)
Mean Squared Error (MSE)                   3836.4   19661.5   3357.3   4074.8     5753.6   3645    3283.7
Mean Absolute Percentage Error (MAPE)      34.5     84        36.3     23.7       43.8     35.9    36.1
Mean Absolute Deviation (MAD)              39.7     86.3      35.9     34.6       45.4     38.8    36.2
Cumulative Sum of Forecast Errors (CFE)    -327     -1899     -3102    -144233    -80329   56.37   145.1

Table V
REVENUE GAINS FOR SWITCHING POLICY AND FORECASTING COMBINATIONS

PSP + Predictive Algorithm
Policy                      NSP     PSP     LO      SA      SMA     EMA     LPF     AR(1)   AR(2)
Total Revenue               513.8   717.9   740.6   728.6   743     737.9   733.3   711.3   736
Improvement over PSP (%)    -       0       3.2     1.5     3.5     2.8     2.1     -0.9    2.5

BSP + Predictive Algorithm
Policy                      NSP     BSP     LO      SA      SMA     EMA     LPF     AR(1)   AR(2)
Total Revenue               513.8   869.3   1166.2  1105.5  1053.3  1031.6  1060.6  1245.6  1205.3
Improvement over BSP (%)    -       0       34.1    27.2    21.2    18.7    22      43.3    38.7
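For clarity, each improvement figure in Table V is the percentage increase of the combined policy-plus-predictor revenue over the corresponding switching policy alone; for example, for BSP with AR(1):

\text{Improvement over BSP} = \frac{1245.6 - 869.3}{869.3} \times 100 \approx 43.3\%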

REFERENCES

[1] M. Al-Ghamdi, A. P. Chester, and S. A. Jarvis. The Effect of Server Reallocation Time in Dynamic Resource Allocation. In UKPEW 2009, Leeds, UK, July 2009.
[2] M. Arlitt and C. Williamson. Web server workload characterization: the search for invariants. SIGMETRICS Perform. Eval. Rev., 24(1):126–137, 1996.
[3] G. Balbo and G. Serazzi. Asymptotic Analysis of Multiclass Closed Queueing Networks: Multiple Bottlenecks. Performance Evaluation, 30(3):115–152, 1997.
[4] M. Calzarossa and G. Serazzi. Workload characterization: A survey. Proceedings of the IEEE, 81(8):1136–1150, August 1993.
[5] G. Casale and G. Serazzi. Bottlenecks identification in multiclass queueing networks using convex polytopes. In 12th Annual Meeting of the IEEE International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2004.
[6] L. Cherkasova and P. Phaal. Session-based admission control: A mechanism for peak load management of commercial web sites. IEEE Trans. Comput., 51(6):669–685, 2002.
[7] N. Dushay, J. C. French, and C. Lagoze. Predicting indexer performance in a distributed digital library. In Third European Conference on Research and Advanced Technology for Digital Libraries (ECDL99), Paris, France, September 1999.
[8] A. Federgruen and H. Groenevelt. The greedy procedure for resource allocation problems: Necessary and sufficient conditions for optimality. Oper. Res., 34(6):909–918, 1986.
[9] D. Ferrari. Computer Systems Performance Evaluation. Prentice Hall, April 1978.
[10] H. Gnter, G. Kotsis, and K. Gabriele. Workload modeling for parallel processing systems. In 3rd International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 1995.
[11] L. He, J. Xue, and S. Jarvis. Partition-based profit optimisation for multi-class requests in clusters of servers. In ICEBE '07: Proceedings of the IEEE International Conference on e-Business Engineering, pages 131–138, Washington, DC, USA, 2007. IEEE Computer Society.
[12] H. Keung, J. Dyson, S. Jarvis, and G. Nudd. Predicting the performance of Globus monitoring and discovery service (MDS-2) queries. In GRID '03: Proceedings of the 4th International Workshop on Grid Computing, page 176, Washington, DC, USA, 2003. IEEE Computer Society.
[13] H. Letmanyi. Guide on Workload Forecasting, Special Publication 500-123. Computer Science and Technology, National Bureau of Standards, 1985.
[14] M. Litoiu. A performance analysis method for autonomic computing systems. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2(1):3, 2007.
[15] Z. Liu, M. Squillante, and J. Wolf. On maximizing service-level-agreement profits. In EC '01: Proceedings of the 3rd ACM Conference on Electronic Commerce, pages 213–223, New York, NY, USA, 2001. ACM.
[16] A. Mahanti, C. Williamson, and L. Wu. Workload characterization of a large systems conference web server. In Communication Networks and Services Research, Annual Conference on, pages 55–64, 2009.
[17] J. Martinich. Production and Operations Management: An Applied Modern Approach. John Wiley and Sons, 1996.
[18] M. Marzolla and R. Mirandola. Performance prediction of web service workflows. In The Third International Conference on the Quality of Software Architectures (QoSA), 4880:127–144, 2007.
[19] D. Menascé. Using performance models to dynamically control e-business performance. In Proc. 11th GI/ITG Conference on Measuring, Modelling and Evaluation of Computer and Communication Systems, pages 11–14, 2001.
[20] D. Menascé. Workload characterization. IEEE Internet Computing, pages 89–92, Piscataway, NJ, USA, 2003. IEEE Educational Activities Department.
[21] D. Menascé and V. Almeida. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning. Prentice Hall, Upper Saddle River, NJ, May 2000.
[22] D. Menascé and V. Almeida. Capacity Planning for Web Services: Metrics, Models, and Methods. Prentice Hall, Upper Saddle River, NJ, September 2001.
[23] J. Rolia, X. Zhu, M. Arlitt, and A. Andrzejak. Statistical service assurances for applications in utility grid environments. In Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS), pages 247–256, 2002.
[24] A. Tantawi and D. Towsley. Optimal static load balancing in distributed computer systems. J. ACM, 32(2):445–465, 1985.
[25] A. Tantawi, G. Towsley, and J. Wolf. Optimal allocation of multiple class resources in computer systems. SIGMETRICS Perform. Eval. Rev., 16(1):253–260, 1988.
[26] GVU's WWW Surveying Team, Graphics Visualization and Usability Center, College of Computing, Georgia Institute of Technology, Atlanta, GA 30332-0280. http://www.gvu.gatech.edu/user_surveys.
[27] Q. Wang. Workload Characterization and Customer Interaction at E-commerce Web Servers. Master's Thesis, University of Saskatchewan, 2004.
[28] J. Xue, A. Chester, L. He, and S. Jarvis. Dynamic resource allocation in enterprise systems. In ICPADS '08: Proceedings of the 14th IEEE International Conference on Parallel and Distributed Systems, pages 203–212, Washington, DC, USA, 2008. IEEE Computer Society.
[29] W. J. Xue, A. P. Chester, L. He, and S. A. Jarvis. Model-driven server allocation in distributed enterprise systems. In ABIS 2009: Proceedings of the 3rd International Conference on Adaptive Business Information Systems, Leipzig, Germany, March 2009.
[30] J. Y. Zhou and T. Yang. Selective early request termination for busy internet services. In 15th International Conference on World Wide Web, 2006.
