2014 9th Workshop on Workflows in Support of Large-Scale Science

Execution Time Estimation for Workflow Scheduling

Artem M. Chirkin (ITMO University, Saint-Petersburg, Russia; University of Amsterdam, Amsterdam, Netherlands)
A. S. Z. Belloum (University of Amsterdam, P.O. Box 94323, 1090 GH Amsterdam, Netherlands)
Sergey V. Kovalchuk (ITMO University, Birzhevaya line 4, 199034 Saint-Petersburg, Russia)
Marc X. Makkes (TNO Information and Communication Technology, Netherlands)

Abstract - Estimation of the execution time is an important part of the workflow scheduling problem. The aim of this paper is to highlight common problems in estimating the workflow execution time and to propose a solution that takes into account the complexity and the randomness of the workflow components and their runtime. The proposed solution addresses the problems at different levels, from task to workflow, including the error measurement and the theory behind the estimation algorithm. The estimation algorithm can be integrated easily into a wide class of schedulers as a separate module. We use a dual stochastic representation, characteristic/distribution functions, in order to combine tasks' estimates into the overall workflow makespan. Additionally, we propose workflow reductions: operations on a workflow graph that do not decrease the accuracy of the estimates but simplify the graph structure, hence increasing the performance of the algorithm.

I. INTRODUCTION

The workflow execution time estimation is used mainly to support workflow scheduling. In this work we use "execution time", "runtime", and "makespan" as synonyms. In order to obtain the workflow runtime estimate, one needs to combine the execution time estimates of the tasks forming the workflow graph. The quality of scheduling strongly depends on the runtime estimate, because it is usually either the target of the optimization or a part of it.

The runtime estimation problem is not as well developed as the scheduling problem, because several straightforward techniques exist that provide acceptable performance in many scheduling cases. However, these techniques may fail in some cases, especially in a deadline-constrained setting. Consider, for example, a simple approach: take the average over previous executions as the only value representing the runtime. It underestimates the total time of a parallel parameter-sweep workflow, because it estimates the workflow makespan as the maximum of the average task runtimes, whereas in the real world it is likely that at least some of the tasks take more time to execute than usual. Moreover, when estimating the workflow execution time, one should take into account the dependencies between the executed tasks and the heterogeneity of the tasks and the computing resources. One important feature of scientific workflows, from the estimation perspective, is that some components of the runtime are stochastic in their nature (data transfer time, overheads, etc.). These facets are not considered together in modern research, therefore the problem needs further investigation.

Besides scheduling, the runtime estimate can be used to give the expected price of the workflow execution when leasing Cloud computing resources [1]. Given the estimated time bounds, one can use the model described in [2] to provide a user with pre-billing information.

The central idea of this work is to allow a scheduler to use the estimates as random variables instead of keeping them constant, and to provide complete information about them (e.g. estimated distribution functions) at maximum precision. Therefore, we do not compare the estimation performance quantitatively to methods that do not provide a distribution estimate; instead, we compare our results to the only "fully probabilistic" alternative known to us, which is based on combining quantiles of the tasks' makespans. When calculating the workflow execution time, we assume that the task runtime models are adjusted to the distinct computing nodes; this allows us to separate the task-level problem (estimating the runtime on a given machine) from the workflow-level problem (aggregating the tasks' estimates into the overall makespan). The paper is based on a Master's thesis by A. Chirkin [3]; see the thesis for the proofs of the formulae and the implementation details.

The paper is organized as follows: section II overviews related research, section III describes the stochastic model of the execution time, section IV briefly explains the theory of the proposed approach and presents the algorithm step by step, and section V evaluates the performance of the approach.


II. RELATED WORK

Workflow scheduling is known to be a complex problem; it requires considering various aspects of an application's execution (e.g. the resource model, the workflow model, the scheduling criteria, etc.) [4]. Therefore, many authors describing the architecture of their scheduling systems do not focus on the runtime estimation. We separate the estimation from the scheduling so that the results obtained in our research (the workflow makespan estimation) can be used in complex scheduling systems that already exist or are being developed. Additionally, in order to monitor the workflow execution process or reschedule the remaining tasks, one can use the estimation system to calculate the remaining execution time while the workflow is being executed. Some examples of scheduling systems that can use any implementation of the runtime estimation module are CLAVIRE [5], [6], the scheduling framework proposed by Ling Yuan et al. [7], and CLOUDRB by Somasundaram and Govindarajan [8].

TABLE I. USE OF THE RUNTIME ESTIMATION IN SCHEDULING - CLASSIFICATION

| time \ scheduling | task                                     | workflow                                                |
| unspecified       | task ranking and makespans [9], [10]     | architecture [11], [12], [7], [8]                       |
| random            | NP                                       | Chebyshev inequalities [13], composing quantiles [14]   |
| fixed             | parallelizing models [15], [16], [17]    | optimization algorithms [9], [18], [19], [20]           |
| ordinal           | Round-Robin, greedy etc. [21], [20]      | -                                                       |

There are no papers known to us dedicated specifically to the workflow runtime estimation problem. Instead, there is a variety of papers dedicated to scheduling that describe runtime models in the context of their scheduling algorithms. We decided to classify the schedulers by the way they represent the execution time. Table I shows this classification of workflow scheduling algorithms; the rest of the section discusses the classes in detail. Several papers are difficult to classify for two reasons: first, some approaches sit between two classes (e.g. in [9], [15], [16], [17], [10] the execution time is calculated as the mean of a random variable, but the variance of the time is not used); second, some researchers focus more on the architecture of the scheduling software than on the algorithms, so different estimate representations could be used with them [12], [8], [7].

Ordinal time: The first class of schedulers uses task-level scheduling heuristics, which do not take the execution time of the whole workflow into account. Such a scheduler uses a list of tasks ordered by their execution time, and this ordering is the only information used to map the tasks onto the resources. This class is mostly based on various greedy algorithms; two of them are described in [20]. Compared to the workflow-level algorithms they are very fast, but they are shown to be less effective [20]. A number of scheduling heuristics of this class is described in a paper by Gutierrez-Garcia and Sim [21].

Fixed time: The second class of schedulers considers the task runtime as a constant value. On the one hand, this assumption simplifies the estimation of the workflow execution time, because there is no need to take into account the possibility that a task with a longer expected runtime finishes its execution faster than a task with a shorter expected time. In particular, this approach simplifies the calculation of the workflow execution time in the case of parallel branches (Figure 1.b): if the execution time of the branches is constant, then the expected execution time of the whole workflow is equal to the maximum of the branches' execution times. On the other hand, this assumption may lead to significant errors in the estimates and, consequently, to inefficient scheduling. In the case of a large number of parallel tasks, even a small uncertainty in the task execution time makes the expected makespan of the workflow longer. However, in some cases the random part of the program execution time is small enough, so this approach is often used. One example of using fixed time is the work of M. Malawski et al. [22], who introduced a cost optimization model for workflows in IaaS (Infrastructure as a Service) clouds in a deadline-constrained setting.

Other examples of schedulers that do not exploit the stochastic nature of the runtime can be found in [9], [18], [19], [20]. They calculate the workflow makespan, which means that they have to calculate the execution time of parallel branches, but the stochastic and the fixed approaches to estimating it differ. The main problem, again, is that the average of the maximum of multiple random variables is usually larger than the maximum of their averages.

Stochastic time: The last class of schedulers makes assumptions that have the potential to give the most accurate runtime predictions. Knowledge of the expected value and the variance allows measuring the uncertainty of the workflow runtime and putting rough bounds on the time using Chebyshev's inequality (footnote 1). For instance, Afzal et al. [13] used the Vysochanskij-Petunin inequality (footnote 2) to estimate the runtime of a program. If the quantiles or a distribution approximation are provided, one can construct rather good confidence intervals on the execution time, which may be used in a deadline-constrained setting. N. Trebon presented several scheduling policies to support urgent computing [14]. To find out whether some configuration of a workflow allows a given deadline to be met, he used quantiles of the tasks' execution time. For example, it can easily be shown that if two independent concurrent tasks (T2 and T3 in Figure 1.c) finish in 20 and 30 minutes with probabilities 0.9 and 0.95 respectively, then one can set an upper bound on the total execution time (max(T2, T3)) of max(20, 30) = 30 minutes with probability 0.9 * 0.95 = 0.855. If the tasks are connected sequentially (Figure 1.a), the total execution time is 20 + 30 = 50, again with probability 0.9 * 0.95 = 0.855. This approach is quite straightforward and very useful for calculating bounds on the quantiles, but it does not allow evaluating the expected value. Additionally, it is highly inaccurate for a large number n of tasks: say each task has a quantile of level p; then the resulting confidence level for the whole workflow is at most p^n. For example, in Figure 1 workflows (a) and (b) give p^3, workflow (c) gives p^4 (if max(T2, T3) is calculated first) or p^5 (if the tasks are composed sequentially), and workflow (d) gives p^7 or more, depending on the way the steps are composed.
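To make the quantile-composition argument concrete, the following sketch (our illustration, not code from this paper or from [14]) combines per-task quantile bounds for the primitive workflows of Figure 1.a and 1.c; the task times and probabilities are the hypothetical numbers used above.

```python
# Illustration of composing per-task quantile bounds.
# Each task is guaranteed to finish within t minutes with probability p.

def sequential_bound(tasks):
    """Upper bound for tasks executed one after another: times add, confidences multiply."""
    total_time = sum(t for t, _ in tasks)
    confidence = 1.0
    for _, p in tasks:
        confidence *= p
    return total_time, confidence

def parallel_bound(tasks):
    """Upper bound for independent concurrent tasks: take the largest time, multiply confidences."""
    total_time = max(t for t, _ in tasks)
    confidence = 1.0
    for _, p in tasks:
        confidence *= p
    return total_time, confidence

# Two concurrent tasks as in Figure 1.c: 20 min with prob. 0.9 and 30 min with prob. 0.95.
print(parallel_bound([(20, 0.9), (30, 0.95)]))    # (30, 0.855)
# The same tasks connected sequentially as in Figure 1.a.
print(sequential_bound([(20, 0.9), (30, 0.95)]))  # (50, 0.855)
# With many tasks the guaranteed confidence degrades as p**n.
print(sequential_bound([(10, 0.95)] * 20))        # (200, ~0.358)
```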

The main drawback of using random execution time is that one needs much more information about the task than just a mean runtime; in order to compute the quantiles one has to obtain full information about the distribution of a random variable. Some authors do not consider the problem of obtaining such information, assuming it is given as an input of the algorithm. However, in real applications the execution data may be inconsistent with the assumptions of the distribution estimation method: if a program takes several arguments, and these arguments change from run to run, it might be difficult to create a proper regression model. Additionally, even if one has created good estimates of the mean and the standard deviation of a distribution as functions of the program arguments, these two statistics do not catch changes in the higher-order moments such as the asymmetry. As a result, under some circumstances (for some program arguments) the distribution of the runtime may have a larger "tail" than expected from the constructed empirical distribution function, hence the program has a larger chance to miss the deadline.

1 Chebyshev's inequality uses the mean and the variance of a random variable to bound the probability of deviating from the expected value: P(|X - E[X]| >= k * sqrt(Var(X))) <= 1/k^2, k > 0. We also refer to other inequalities that use only the mean and the variance to bound this probability as Chebyshev-like inequalities.
2 The Vysochanskij-Petunin inequality is one of the Chebyshev-like inequalities; it is constrained to unimodal distributions, but improves the bound: P(|X - E[X]| >= k * sqrt(Var(X))) <= 4/(9k^2), k > sqrt(8/3).

Fig. 1. Primitive workflow types: (a) a sequential workflow, where the workflow makespan is the sum of the variables; (b) a parallel workflow, where the makespan is the maximum of the variables; (c) a complex workflow that can be split into parallel/sequential parts; (d) a complex irreducible workflow.

III. EXECUTION TIME MODEL

Section II describes the ways scheduling systems use runtime estimates. Several examples show that the stochastic representation gives the most accurate results [20], [13]. The most significant advantage of the stochastic approach is the ability to measure the error of the estimation in a simple way. Additionally, it can represent the effect of the components that are stochastic in their nature, e.g. the data transfer time or the execution of randomized algorithms. However, this approach has its own problems, specific to the time representation. The first problem is the creation of the task execution time models: in order to perform the regression analysis efficiently one needs a sufficient dataset, and estimating the full probability distribution as a function of the task input is usually a complicated task. The second problem is related to combining random variables (tasks' execution times) into a workflow makespan, because of the complex dependencies in the workflow graph.

In its simplest form the proposed solution is straightforward: create the task runtime models, collect the data and fit the models to it, and combine the task models into the workflow runtime estimate. Assuming this approach, we use the following features of the problem model:

• The workflow is represented as a directed acyclic graph (DAG) (Figure 1).
• The nodes of the workflow are the tasks and are represented by random variables (execution times) Ti > 0; the random variables are mutually independent; we assume the variables Ti to be continuous on the experimental basis.
• The workflow execution time (runtime, makespan) is the time needed to execute all tasks in the workflow. An equivalent definition is the maximum over the times of all paths throughout the workflow.
• A path throughout the workflow graph is a sequence of dependent nodes which starts at any starting node (a node that has no predecessors) and finishes at any of the last nodes (nodes that are not followed by any other nodes). The number of paths increases with each fork in the graph (Figure 1).

A. Task runtime

Besides the calculation time of the programs, the workflow execution time includes the additional costs of the data transfer, the resource allocation, and others. Therefore, a single task processing can be considered as the call of a computing service, and its execution time can be represented in the following way:

T = T_R + T_Q + T_D + T_E + T_O    (1)

Here, T_R is the resource preparation time, T_Q the queuing time, T_D the data transfer time, T_E the program execution time, and T_O the system overhead time (analyzing the task structure, selecting the resource, etc.). For a detailed description of the component model (1) see [6]. At the learning stage, each component is modeled as a stochastic function of the environment parameters (e.g. machine characteristics, workload, data size, etc.); at this stage one may apply any known techniques to create good models. When the scheduler asks for an estimate, it provides all the parameters (the unknown parameters are estimated as well), and the components turn into random variables. If all components are independent (at least conditionally on the environment parameters), one can easily find the mean and the variance of T as the sum of the components. Here and later in the paper, for simplicity, we assume all components are included in the overall task execution time T, which is a complex function at the learning stage and a random variable at the estimation stage (after the parameters are substituted).

The major problem of the stochastic approach is related to the way one represents the execution time when fitting it to the observed data. We use a regression model for this purpose:

T = µ(X, β) + σ(X, β) ξ(X, β)    (2)

Here X is the vector of task execution arguments in an arbitrary argument space; X is controlled by an application executor, but can be treated as a random variable from the regression analysis point of view. T ∈ R is the dependent variable (i.e. execution time); β are the unknown regression parameters, µ = E[T|X], σ = sqrt(Var(T|X)); ξ(X, β) is a random variable. The regression problem consists in estimating the unknown parameters β, thus estimating the mean, the variance, and the conditional distribution of the runtime. Some authors used only the mean µ (e.g. [16], [20]) and as a consequence could not create confidence intervals. Others computed the mean and the variance, but did not estimate ξ(X, β) (e.g. [13]); they used Chebyshev-like inequalities, which are based on the knowledge of the mean and the variance only. Trebon [14] used the full information about the distribution, but assumed it to be obtained by an external module. As we explained in section II (Stochastic time), it is seemingly not possible to estimate the conditional distribution of ξ for an arbitrary input X, therefore one needs to simplify the problem. As an example, one may assume that ξ and X are independent. However, in the general case ξ depends on X: this means that the distribution of T not only shifts or scales with X, but also changes its shape.


Consequently, for some values of X the time T may have a significantly larger right "tail" than for others, and this effect should be taken into account when modelling the task runtime.

B. Obtaining task runtime

A task runtime model always needs to be consistent with the execution data coming from the task executions. In order to achieve this we use an evaluation/validation approach that allows models to be constructed and updated over time.

1) Evaluation: The evaluation procedure creates a model of a task runtime, i.e. it basically solves the regression problem (2); thus, the output of this procedure is a tuple of the mean and the standard deviation functions µ(X) and σ(X). The basis for the evaluation is the execution data from the provenance (an arguments-runtime dataset). If a parametric model of the task execution time is known, we adjust the parameters using the Maximum Likelihood Estimation (MLE) method (footnote 3). If no prior information about the structure of the time dependency on the arguments is known, we use the Random Forest machine learning method (footnote 4) to create the models based entirely on the provided data.

3 The MLE is a basic statistical method to estimate parameters of a distribution; it consists of creating a likelihood function (which measures "how likely" it is that the data come from a given distribution) and maximizing it in the parameter space.
4 The Random Forest is a non-parametric machine learning method that creates functions out of a sample of argument-target tuples by combining decision trees [23].
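A minimal sketch of the non-parametric branch of the evaluation step is given below, assuming scikit-learn is available; the feature layout and the residual-based estimate of σ(X) are our assumptions, not the authors' implementation.

```python
# Sketch: fit the mean and the spread of a task runtime as functions of the task arguments X.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def evaluate_task_model(X, t):
    """X: (m, d) array of task arguments, t: (m,) array of observed runtimes."""
    mu_model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, t)
    # Model the spread via absolute residuals of the mean model (one simple choice).
    residuals = np.abs(t - mu_model.predict(X))
    sigma_model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, residuals)
    mu = lambda x: mu_model.predict(np.atleast_2d(x))
    sigma = lambda x: np.maximum(sigma_model.predict(np.atleast_2d(x)), 1e-9)
    # Normalized sample xi = (t - mu(X)) / sigma(X), reused later for the CDF/CF estimates.
    xi = (t - mu_model.predict(X)) / np.maximum(sigma_model.predict(X), 1e-9)
    return mu, sigma, xi
```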

2) Validation: Once the execution time models µ(X) and σ(X) for a task are created, they must be validated from time to time: there might be changes in the environment (or the program represented by the task may be updated) that make an earlier created model inconsistent with the new data. The validation procedure solves this problem by checking the task models at a pre-defined time interval or each time new execution data comes from the provenance (e.g. a logging system adds a record to a database with the task executions). The model is validated by performing statistical hypothesis tests on the normalized execution time data. We use simple heuristics (e.g. a fixed number of points, a fixed time interval) to select the active sample; using this sample the tests' statistics are calculated and compared to the current values. Various tests are able to reject the null hypothesis (i.e. the assumption that the model is still valid) with respect to different alternatives. We use a set of tests to check whether the mean, the variance, or the shape of the sample has changed.
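As an illustration of the validation step, the sketch below runs two of the many possible tests on the normalized sample ξ = (T - µ(X)) / σ(X); the particular tests and the significance level are our assumptions.

```python
# Sketch: flag a task model as outdated if the normalized runtimes drift from the reference sample.
import numpy as np
from scipy import stats

def model_still_valid(xi_reference, xi_recent, alpha=0.05):
    """xi_*: normalized runtimes (t - mu(X)) / sigma(X) for old and new executions."""
    # Has the location (mean) changed?  Welch's t-test.
    _, p_mean = stats.ttest_ind(xi_reference, xi_recent, equal_var=False)
    # Has the shape of the distribution changed?  Two-sample Kolmogorov-Smirnov test.
    _, p_shape = stats.ks_2samp(xi_reference, xi_recent)
    return p_mean > alpha and p_shape > alpha

rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, size=150)
print(model_still_valid(ref, rng.normal(0.0, 1.0, size=40)))  # likely True
print(model_still_valid(ref, rng.normal(1.5, 1.0, size=40)))  # likely False
```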

IV. ESTIMATING WORKFLOW RUNTIME

In this section we provide the theoretical basis for the method and the algorithms. In order to proceed with the calculations, we use the following notation. Denote by s the total number of nodes in the graph, and by T_k, k ∈ 1..s, the independent random variables representing the execution time of each node; one can safely assume they are non-negative, continuous, and that at least their first two moments are finite, E[T_k^l] < ∞, l ∈ {1, 2}. F_k is the cumulative distribution function (CDF) of the variable T_k. Let r be the total number of different paths throughout the workflow, and s^j the number of steps in path j ∈ 1..r. Then k_i^j, i ∈ 1..s^j, j ∈ 1..r, with k_i^j < k_{i+1}^j, is the index of the i-th node on the j-th path of the workflow; each path contains s^j <= s nodes. Our agreement on indexing is to use superscripts for the path numbers and subscripts for the task numbers. F_path^j(t) is the CDF of the variable T_path^j; F_wf(t) is the CDF of the workflow runtime. The characteristic function (CF) is denoted by φ(ω) and uses the same subscripts/superscripts as the CDF F(t). The execution time of a path throughout the workflow is denoted as follows:

T_path^j = \sum_{i=1}^{s^j} T_{k_i^j}.    (3)

The execution time (makespan, runtime) of a workflow is the time required to execute all paths of the workflow in parallel:

T_wf = \max_{j ∈ 1..r} T_path^j = \max_{j ∈ 1..r} \sum_{i=1}^{s^j} T_{k_i^j}.    (4)

Estimating moments: The mean and the standard deviation of a random variable are the easiest characteristics to estimate. However, since the estimation of the workflow makespan requires taking the maximum of random variables, we cannot calculate its moments directly. Since the tasks (as random variables) are assumed to be mutually independent, the moments of a path's runtime can be computed directly:

E[T_path^j] = \sum_{i=1}^{s^j} E[T_{k_i^j}],
Var(T_path^j) = \sum_{i=1}^{s^j} Var(T_{k_i^j}).

Although the moments for a workflow cannot be calculated directly, it is possible to use the simple maximums as initial approximations for the moments, to be used where precision is not mandatory:

E[T_wf] ~ \max_{j ∈ 1..r} E[T_path^j],
Var(T_wf) ~ \max_{j ∈ 1..r} Var(T_path^j).    (5)
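The approximation (5) underestimates the makespan moments because the mean of a maximum exceeds the maximum of the means; a quick Monte Carlo check (our illustration, with arbitrary task parameters) makes the gap visible.

```python
# Monte Carlo check: the mean of the maximum of parallel branches exceeds
# the maximum of the branch means, which is why (5) is only a rough initial guess.
import numpy as np

rng = np.random.default_rng(42)
n_branches, n_samples = 10, 100_000
# Ten parallel branches, each with mean 100 and standard deviation 15 (arbitrary numbers).
branch_times = rng.normal(loc=100.0, scale=15.0, size=(n_samples, n_branches))

max_of_means = branch_times.mean(axis=0).max()   # ~100
mean_of_max = branch_times.max(axis=1).mean()    # ~123 for 10 branches
print(f"max of means: {max_of_means:.1f}, mean of max: {mean_of_max:.1f}")
```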

Estimating distributions: If the CDFs are given for each task, then it is clear how to compute the overall probability: the sum of random variables turns into a convolution of two CDFs, and the maximum of random variables turns into the multiplication of two CDFs. The only problem is to address the correlation of two variables: for example, in Figure 1.d the random variable T1 + T3 partially depends on max(T1, T2) + T4.

Define Z_sum = T1 + T2, where both random variables T1, T2 are positive and have finite mean and variance. Then the CDF is given by the following expression:

F_{Z_sum}(z) = \int_0^z F_{T2}(z - x1 | T1 = x1) dF_{T1}(x1).

Similarly for the maximum, Z_max = max(T1, T2):

F_{Z_max}(z) = \int_0^z F_{T2}(z | T1 = x1) dF_{T1}(x1).

Measuring the dependence (i.e. the joint CDF) of two variables, as well as the integration, are complex procedures, hence there is a need for a workaround. On the one hand, it is known that in the case of independent variables X and Y, the CDF of their maximum can be found simply as the multiplication of their CDFs:

F_{max(X,Y)}(t) = F_X(t) F_Y(t).    (6)

On the other hand, one can use a CF to calculate the sum, because the CF of the sum of independent random variables equals the multiplication of their CFs:

φ_{X+Y}(ω) = φ_X(ω) φ_Y(ω).    (7)

Since the tasks (as random variables) in one path throughout a workflow are considered to be independent (i.e. a path forms a sum of independent random variables), the easiest way to compute the distribution of the runtime of one path is to use the CF. Thus, one can use (6) for the maximum operation and (7) for the sum operation. However, this raises the question of the conversion between the CDF and the CF, which is addressed in section IV-B.
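On a discrete grid, (6) and (7) both reduce to element-wise products; the sketch below (our illustration, with made-up distributions) shows the two operations.

```python
# Element-wise grid operations behind (6) and (7):
# multiply CDF values for a maximum, multiply CF values for a sum of independent variables.
import numpy as np

def max_cdf(cdf_x, cdf_y):
    """CDF of max(X, Y) for independent X, Y sampled on the same t-grid."""
    return cdf_x * cdf_y

def sum_cf(cf_x, cf_y):
    """CF of X + Y for independent X, Y sampled on the same omega-grid."""
    return cf_x * cf_y

# Example: two exponentially distributed tasks on a common time grid.
t = np.linspace(0.0, 20.0, 257)
cdf_a, cdf_b = 1.0 - np.exp(-1.0 * t), 1.0 - np.exp(-0.5 * t)
print(max_cdf(cdf_a, cdf_b)[-1])   # close to 1: both tasks almost surely finished by t = 20
```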

A. Workflow makespan CDF

Schedulers usually have to satisfy a constraint on the maximum workflow makespan (a "deadline") when mapping tasks onto computing nodes. Thus, a common request from the scheduler to an estimator is to get the probability that a workflow in a given configuration executes no later than the defined deadline. We call this the probability of meeting deadline t. Note that this definition coincides with the definition of the CDF: the probability that a random variable is lower than a fixed value t (the argument of the CDF).

Assume the CDFs of all paths in the workflow are already estimated. As mentioned above, the paths as random variables may correlate, therefore it is difficult and computationally expensive to compute the workflow CDF directly. Our approach, instead, proposes to construct a lower bound on the CDF. This allows computing a lower bound on the confidence level of meeting a deadline, which is the most important application of the runtime estimation.

The formula for the probability of executing within a deadline t follows directly from (4) and can be written as follows:

F_wf(t) = P(T_wf <= t) = P( \max_{j ∈ 1..r} \sum_{i=1}^{s^j} T_{k_i^j} <= t ).    (8)

Note that some indices k_i^j overlap along varying paths j ∈ 1..r. The main problem is that, due to the occurrence of the same random variables in different paths, the probability of the maximum cannot be decomposed into the multiplication of the probabilities directly. What one can do instead is to compute the distribution of each path T_path^j separately and then bound the total distribution of T_wf:

P(T_wf <= t) >= \prod_{j=1}^{r} P( \sum_{i=1}^{s^j} T_{k_i^j} <= t ).    (9)

The inequality above allows one to say that the probability of meeting a deadline is higher than the calculated value, i.e. to ensure that the workflow will execute no later than required.
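A sketch of the bound (9) on a grid, assuming each path's CDF has already been obtained (here from synthetic per-path CDF arrays; the helper names are ours):

```python
# Lower bound (9) on the workflow CDF: product over paths of the path CDFs on a common t-grid.
import numpy as np

def workflow_cdf_lower_bound(path_cdfs):
    """path_cdfs: list of 1-D arrays, the CDF of each path on the same t-grid."""
    bound = np.ones_like(path_cdfs[0])
    for cdf in path_cdfs:
        bound *= cdf
    return bound

def probability_of_meeting_deadline(t_grid, path_cdfs, deadline):
    bound = workflow_cdf_lower_bound(path_cdfs)
    # The largest grid point not exceeding the deadline gives a conservative value.
    idx = np.searchsorted(t_grid, deadline, side="right") - 1
    return bound[max(idx, 0)]

t = np.linspace(0.0, 100.0, 101)
paths = [1.0 - np.exp(-t / 20.0), 1.0 - np.exp(-t / 30.0)]
print(probability_of_meeting_deadline(t, paths, deadline=90.0))
```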

B. Transform between CDF and CF

Section IV-A explains how to bound the CDF of a workflow given the CDFs of all paths throughout it. However, it is easier to use the CF than the CDF for the paths, because it does not require the convolution (integration) for the sum of the random variables (tasks). Since the model of a task runtime is non-parametric, the only way to represent the CDF or the CF is to store their values in a vector (a grid, i.e. a set of points). The equations (6) and (7) are linear in terms of the number of operations performed on the grid, O(n), where n is the size of the grid, whereas the convolution requires O(n^2) operations (or possibly O(n log n) with some sophisticated algorithms).

If one stores both the CDF and the CF, then the sum and the maximum operations can be computed in linear time. The CDF and the CF have the important property that they can be converted into one another using the Fourier transform; converting the functions using the Fast Fourier Transform (FFT) requires O(n log n) operations, resulting in an overall complexity of O(n log n). The FFT is required only once per path, which significantly reduces the number of "intensive" O(n log n) operations. As a result, the overall complexity of the method is O(r n log n). In this section we derive the conversion formulae for these two representations of the random variable distribution.

1) Estimating CDF and CF: Assume a random variable X with CDF F_X(t) and CF φ_X(ω). Here the parameter t denotes the time domain, and ω denotes the frequency domain. According to the definitions,

F_X(t) = P(X <= t),
φ_X(ω) = E[e^{iωX}].

These functions can be estimated using an i.i.d. sample X_1..X_m:

φ̂_m(ω) = (1/m) \sum_{j=1}^{m} e^{iωX_j},
F̂_m(t) = (1/m) \sum_{j=1}^{m} 1{X_j <= t}.

Say one would like to calculate the distribution on a grid containing n points. Since the sample is finite, one can define an interval (a, b) according to the sample such that a < min_{j ∈ 1..m} X_j and b > max_{j ∈ 1..m} X_j. Then F̂_m(a) = 0, F̂_m(b) = 1. The parameters a, b, and n completely define the grids that are used to store the values of the CDF and the CF:

t_k = (k/n)(b - a) + a,    k = 0, 1, ..., n - 1,
ω_j = 2πj / (b - a),    j = 0, 1, ..., n/2.    (10)

Note that the ω-grid in (10) is almost half-sized (n/2 + 1 elements instead of n). The CF is a Hermitian function, φ_X(-ω) = \overline{φ_X(ω)} (and φ̂_m(-ω) = \overline{φ̂_m(ω)}), hence one can restore the values of the CF for negative arguments to get a full-sized grid.
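A direct transcription of the estimators and the grids (10), as a sketch with arbitrarily chosen grid parameters and a hypothetical runtime sample:

```python
# Empirical CDF and CF of a runtime sample on the grids defined in (10).
import numpy as np

def make_grids(sample, n=64, margin=0.05):
    a = sample.min() - margin * np.ptp(sample)
    b = sample.max() + margin * np.ptp(sample)
    t = a + (b - a) * np.arange(n) / n                        # t-grid, n points
    omega = 2.0 * np.pi * np.arange(n // 2 + 1) / (b - a)     # omega-grid, n/2 + 1 points
    return t, omega, a, b

def empirical_cdf(sample, t_grid):
    return np.searchsorted(np.sort(sample), t_grid, side="right") / sample.size

def empirical_cf(sample, omega_grid):
    return np.exp(1j * np.outer(omega_grid, sample)).mean(axis=1)

sample = np.random.default_rng(1).weibull(1.5, size=200) * 60.0   # hypothetical runtimes
t, omega, a, b = make_grids(sample)
F_hat, phi_hat = empirical_cdf(sample, t), empirical_cf(sample, omega)
```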

2) Fourier Transform: After calculating the estimates of the CF and the CDF on the grids as shown in section IV-B1, one can transform between the representations using the FFT in the following way:

φ̂*_m(ω_j) = { (i(a - b)ω_j / n) e^{iω_j a} \sum_{k=0}^{n-1} ( F̂_m(t_k) - k/n ) e^{-2πikj/n},  if j ≠ 0;
               1,  if j = 0.    (11)

ψ_j = { (n / (a - b)) ( (a + b)/2 - E[X] ),  if j = 0;
        (i n e^{iω_j a} / ((a - b) ω_j)) φ̂_m(ω_j),  if j ∈ {1 .. n/2 - 1};
        (i n e^{iω_{n-j} a} / ((a - b) ω_{n-j})) \overline{φ̂_m(ω_{n-j})},  if j ∈ {n/2 .. n - 1}.    (12)

F̂*_m(t_k) = (1/n) \sum_{j=0}^{n-1} ψ_j e^{2πijk/n} + k/n.    (13)

The sums in (11) and (13) have exactly the form of the forward and backward Discrete Fourier Transforms (DFTs), respectively. Therefore the FFT, which has complexity O(n log n), can be applied directly to these equations.

Going from the continuous Fourier transform to the discrete transform introduces a bias that is proportional to the approximation error of the finite sum of the corresponding Fourier series. If one assumes that the CDF is an analytic function, the bias decays exponentially with the grid size n (and can be controlled easily by varying n); hence the significant part of the error is the variance of the initial estimates (the statistical estimates of the CF and the CDF are random functions). The overall error of the algorithm is expressed as follows:

F̂*_wf(t) - F_wf(t) = O( r s b*_{n/2} ) + O( \sqrt{s / m*} ) ξ.    (14)

Here b*_{n/2} denotes the asymptotic of the Fourier series error; the convergence may be polynomial or exponential, depending on the type of the CDFs. m* may be considered as the size of the smallest sample among the tasks within a workflow. That is, the variance of the estimate depends on the sample size and the number of nodes. The bias also depends on the number of paths, but it can easily be controlled by varying the grid size; in addition, it depends on the shape of the CDF (and the interval b - a) indirectly via the term b*_{n/2}.

3) Combining shifted t-grids: Using one grid to compute the estimates of all tasks gives a poor quality result. This is clear because the execution time of a workflow is usually far larger than a task's execution time. Indeed, if a workflow consists of ten similar tasks in a sequence, then each of the tasks has an average time ten times smaller than the overall time; but if they all use the same grid for the CDF, then most of the grid points are wasted, filled with ones or zeros. It is easy to show that, in general, the average time grows linearly with the number of nodes, whilst the standard deviation grows as the square root of the number of tasks. Therefore, in order to improve the overall accuracy of the method, we shift the center of the grid by the average execution time of the workflow's component (task or path). We use formulae (5) to obtain the initial rough estimates of the moments; the size of the grid is always the same, proportional to the standard deviation of the workflow makespan. If one fixes the size of the t-interval (a, b) using the standard deviation of the overall workflow makespan and shifts the center of this interval by the average time of the calculated components, then the total error reduces significantly, because the length of the interval in this case is not proportional to the overall time and depends on the variance only. Additionally, in this case the ω-grid always stays the same, so the calculation procedure does not become too complicated.
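The exact conversion formulae (11)-(13) are specific to this paper; as a simpler, well-known stand-in, the sketch below convolves discretized task densities with an FFT to obtain a path distribution, which can then be combined as in (9). This is explicitly not the authors' grid-shifting scheme, just a common FFT-based alternative under our own simplifying assumptions (a common grid starting at zero, negligible mass beyond t_max).

```python
# FFT-based convolution of discretized task densities: a standard way to get the
# distribution of a sum of independent tasks (a path), used here instead of (11)-(13).
import numpy as np

def path_cdf_via_fft(task_samples, t_max, n=1024):
    """task_samples: list of 1-D runtime samples, one per task on the path."""
    edges = np.linspace(0.0, t_max, n + 1)
    pmf = None
    for sample in task_samples:
        hist, _ = np.histogram(sample, bins=edges)
        hist = hist / hist.sum()
        if pmf is None:
            pmf = hist
        else:
            # Zero-padded FFT convolution of the two discretized densities.
            pmf = np.fft.irfft(np.fft.rfft(pmf, 2 * n) * np.fft.rfft(hist, 2 * n), 2 * n)[:n]
    return edges[1:], np.clip(np.cumsum(pmf), 0.0, 1.0)

rng = np.random.default_rng(2)
tasks = [rng.gamma(4.0, 10.0, 300) for _ in range(3)]   # three hypothetical sequential tasks
t_grid, cdf = path_cdf_via_fft(tasks, t_max=400.0)
```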

C. Algorithm

The formulae of section IV dictate the constraints on a possible algorithm for obtaining the workflow runtime estimate. First, one needs to get the estimates of the moments for the tasks and the paths using (5). Using these estimates, one then calculates the CFs on a constructed grid (the ω-grid stays the same during the computations). Finally, using (11), (12) and (13), one converts the CFs into CDFs and combines them on the shifted grids into the overall estimate.

Next we provide the algorithm for estimating the workflow runtime. The input of the algorithm is a workflow graph produced by the scheduler and the normalized samples (which are used to compute the runtime distributions of the workflow's tasks). The task runtime distribution may be given in some other form (e.g. a parametric distribution together with its parameters instead of a data sample); the only requirement is the possibility to estimate the CF or the CDF.

1 - Calculate moments. The mean and the variance of the tasks are taken directly from the task models (2). Using formulae (5) one gets the estimates of the mean µ_wf and the standard deviation σ_wf of the workflow execution time.

2 - Create calculation grids. The t-grids must satisfy several conditions: first, the values of the CDF should be close to zero and one at the sides of a grid; second, the grid interval should not be too large, in order to keep a good accuracy (see section IV-B2 for details); third, all grids should have the same length, so that the ω-grids coincide; last, the t-grids of all variables should be shifted in such a way that if they were continued to infinity they would all coincide (in other words, if the grid step is ∆t, then a shift can only be c∆t, c ∈ Z). Grids that satisfy these conditions may be constructed in the following way:

t_k = ( k - n/2 + ⌊µ/∆t⌋ ) ∆t,    ∆t = 2qσ_wf / (n - 1),    k = 0, .., n - 1,
ω_j = 2πj / (n∆t),    j = -n/2, .., -1, 0, 1, .., n/2 - 1.

Here µ is the mean of the variable for which the grid is computed (task, path, or workflow); σ_wf is the estimated standard deviation of the workflow runtime; q is a constant which affects the size of the grid interval (i.e. it creates a q-sigma interval); n is the size of the grid, which is recommended to be a power of 2 (for faster operation of the FFT).

3 - Calculate the CFs of the paths throughout the workflow. At this step one should use the provided tasks' data to estimate the CFs and sum them up into all possible paths throughout the workflow.

4 - Convert CFs into CDFs. This step consists of applying the FFT formulae (12), (13) to each path throughout the workflow.

5 - Bound the workflow CDF. The final step is to bound the overall execution time CDF. Here one should directly apply (9). If the required value is not the CDF but the CF, the result is then converted back using (11).

The algorithm passes through all paths in a workflow once; the most expensive operation in terms of the grid size n is the FFT, which is performed for each path and in the sub-workflows. Therefore the complexity of the presented algorithm is not more than O((r + s) n log n).
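The paper's algorithm avoids sampling, but a brute-force Monte Carlo baseline is handy for testing an implementation of steps 1-5; the sketch below (our addition, with a hypothetical workflow in the shape of Figure 1.c) resamples task runtimes and evaluates (4) directly.

```python
# Brute-force baseline for checking a makespan estimator: resample task runtimes
# and evaluate T_wf = max over paths of the sum of task times, as in (4).
import numpy as np

def monte_carlo_makespan_cdf(paths, task_samples, t_grid, n_draws=20000, seed=0):
    rng = np.random.default_rng(seed)
    draws = np.empty(n_draws)
    for d in range(n_draws):
        task_time = {k: rng.choice(s) for k, s in task_samples.items()}
        draws[d] = max(sum(task_time[k] for k in path) for path in paths)
    return np.searchsorted(np.sort(draws), t_grid, side="right") / n_draws

# Hypothetical workflow: T1 -> (T2 || T3) -> T4, i.e. two paths sharing T1 and T4.
rng = np.random.default_rng(3)
samples = {k: rng.gamma(4.0, 10.0, 200) for k in (1, 2, 3, 4)}
paths = [(1, 2, 4), (1, 3, 4)]
t = np.linspace(0.0, 400.0, 200)
print(monte_carlo_makespan_cdf(paths, samples, t)[-1])  # should be close to 1
```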

D. Workflow reductions

Before performing the main algorithm one may simplify the workflow graph in order to reduce the complexity of the algorithm. It is clear that the complexity of an estimation algorithm cannot be lower than linear in the number of tasks s; hence we investigate ways to decrease the number of paths r (which in the worst case may be exponential in the number of nodes). Figure 2 presents a workflow which contains parts that can be simplified. One can count the number of paths throughout it; it equals 11. Obviously, some of the paths in this graph are guaranteed to have a lower value (execution time) than others. For example, path 1-4-9-10 is always shorter than 1-4-5-6-9-10, because the former sequence is a subsequence of the latter; hence it does not contribute to the overall runtime distribution, it only wastes extra computing time. Therefore one should modify the graph so that it does not contain such shorter paths. Another possible improvement is to compute the execution time of nodes 7 and 8 as a single random variable, which reduces the number of paths throughout the workflow by 3. The question is how to formalize these simplifications, implement them algorithmically, and prove that they help. The following three types of reductions partially answer these questions (other types of reductions may also be introduced in the future):

• Delete Edge (DE): delete all edges which introduce paths that are fully contained in some other paths ("shorter" paths), e.g. edge 4-9 in Figure 2;
• Reduce Sequence (RS): combine all fully sequential parts into single meta-tasks; if the workflow contains sequences of tasks like 5-6 in Figure 2, these sequences are reduced into single tasks;
• Reduce Fork (RF): combine all parallel parts (fork-joins) which start from the same set of nodes and join in the same set of nodes (i.e. tasks 7, 8, 9 in Figure 2 are reduced into a single task after the deletion of edge 4-9).

Fig. 2. Example of performing reduction on a workflow.

Figure 2 presents an example of performing the workflow reductions on one workflow. The initial workflow in this figure contains r = 11 paths and s = 10 nodes. The final workflow is irreducible with respect to the discussed patterns (DE, RS, RF). As shown, the effect of the reductions is impressive for this particular example: the final number of paths is almost four times lower than the initial one (from 11 to 3). Unfortunately, it is difficult to give an analytical estimate of the performance of the reductions. The number of paths may grow exponentially with the number of nodes, and it is clear that none of the presented reductions alone can decrease the number of paths to a polynomial. However, one cannot claim the same about applying all three reductions together. Moreover, real-world workflows are usually well-structured: they have a polynomial number of paths and are easily reducible [24].
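A small sketch of the Delete Edge rule follows; the DAG encoding is our own toy example in the spirit of Figure 2 (it is not an exact copy of the drawing), and the helper names are hypothetical.

```python
# Delete Edge (DE) reduction: remove an edge (u, v) if some other path from u to v exists,
# since any path using that edge is a subsequence of a longer path and never dominates.

def reachable(adj, src, dst, skip_edge):
    stack, seen = [src], set()
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if (u, v) == skip_edge or v in seen:
                continue
            if v == dst:
                return True
            seen.add(v)
            stack.append(v)
    return False

def delete_edges(adj):
    return {u: [v for v in vs if not reachable(adj, u, v, (u, v))]
            for u, vs in adj.items()}

# Toy DAG: node 9 is reachable from node 4 both directly and via 5 and 6.
wf = {1: [2, 3, 4], 2: [10], 3: [10], 4: [5, 9], 5: [6],
      6: [7, 8, 9], 7: [10], 8: [10], 9: [10]}
print(delete_edges(wf))  # the direct edge 4 -> 9 disappears
```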

V. BENCHMARKS

In section II we showed a way to divide the assumptions on the execution time into several groups. The ordinal representation of the time is completely out of the scope of our work, because it does not allow estimating the execution time. We also do not compare against the fixed time approach, because it does not allow measuring uncertainty; moreover, fixed estimates (the expected value of the time) are calculated as one of the stages of the main algorithm, so in this sense the results should be identical. In this section we compare our method to the other competitive stochastic approach, composing quantiles of the tasks (see [14]). The tests in this section evaluate the accuracy of the estimates by comparing empirical cumulative distribution functions (EDFs). If a workflow is executed m times, each task has m executions in a database. On the one hand, the time of the workflow executions is known, thus one can estimate its EDF (call it the "real EDF"). On the other hand, given the tasks' execution times (which are also kept in the database), one can apply the workflow runtime estimation algorithm (call it the estimated EDF). By comparing the real and the estimated EDFs one can judge the accuracy of the estimation algorithm, because the real EDF represents the "ideal" estimate: from the statistical point of view, it exploits all information available about the random variable.


Fig. 4. Runtime estimation benchmark - real workflow. In Figure (a) the models use whole sample and fit poorly; in Figure (b) the sample is cut off by validation procedure, increasing quality of the fit.

Fig. 3. Runtime estimation benchmark - simulated workflow. Note, the distribution function F (t) may be interpreted as the probability of meeting a deadline t.

B. Case of real workflow one can judge on the accuracy of the estimated EDF (thus, the algorithm’s accuracy), because the real EDF represents the “ideal” estimation (from the statistics’ point of view, it exploits all information about the random variable).

The second test case is taken from the real data. The workflow structure, the models, and the execution logs belong to the urban flood simulation, which is described in [25]. The workflow consists of three sequentially connected tasks (Figure 1.a), which overall execution time depends on four parameters. The data for this example consist 173 launches of the workflow performed during four months; thus, configuration of the environment were changing (i.e. the software and the hardware were slightly changed), and one should expect a bad fit of the models.

A. Case of simulated workflow The first test case is designed in the way to evaluate the performance of the workflow runtime estimation algorithm under the assumption that the task execution time models are flawless. Hence, a workflow in Figure 2 is taken as a test case with fully simulated execution sample. The runtime of all tasks in the workflow does not depend on any parameters and is taken from the Weibull distribution; the shape, the scale and the shift parameters of the distribution are drawn randomly (uniform distribution) for each task; the average time for the tasks is between 50 and 150, the standard deviation is between 1 and 50. As a result, the workflow makespan varies around 610 − 670. Each task has an execution sample of size 200 simulating 200 executions of the whole workflow.

Figure 4 shows the performance of the algorithm on the example described above. The plots represent the workflow runtime EDF (for a fixed set of execution parameters). It turned out that the logs contain 10 launches of the workflow with selected parameter set; the blue line on the plots denotes the EDF (Real EDF) constructed using this reduced sample (the blue line represents the same data in both pictures, but the scale of the x-axes varies). The red line represents the EDF estimation by the workflow runtime estimation algorithm proposed in the thesis. The green line denotes the EDF created by the quantile method (see [14]). It is clear that the estimation in Figure 4.b fits far better than in Figure 4.a: the difference on the right plot is 1−5 seconds, whereas it is 50−80 seconds on the left plot in a band F (t) ≥ 0.9. The crucial effect on the accuracy is caused by the quality of tasks’ runtime models. The data were collected during a long period of time (four months); the configuration of the environment and the applications may have been changed several times - these changes are not caught in the models. As a result, the models based on the whole sample overestimate the uncertainty of the execution time. The estimation in Figure 4.b is obtained using the validation/evaluation techniques described in section III-A. There were two sessions of tests of the workflow separated by a significant pause (three months); the validation tests triggered the estimator to cut off the data sample, so that the models were created using only 100 last observations. Figure 4.b confirms the success of this approach: the final estimation matches the real data accurately.

Figure 3 presents results of the test on a simulated workflow. The left plot shows the real and the estimated (before and after reductions) EDFs. All three lines are close to each other - the estimations are rather good. One also can mention that the estimated EDFs are more smooth than the real EDF. This is caused by the repeating Fourier transforms: the discrete transform catches a finite number of frequencies, so the final distribution function is represented as a finite sum of sines and cosines. The estimation may be closer to the real function if using more grid points; the grids in Figure 3 contain 64 points. Inequality (9) may cause the estimated EDFs to move right on the plot; however, on this example this effect is too small to observe it graphically (green and red lines coincide almost everywhere on the plot). The right plot compares the real and the estimated (with graph reductions) EDFs by the means of comparing the quantiles, and confirms the high quality of the estimation - all the points are close to line x = y. This example shows that the workflow algorithm works accurately if tasks’ models are correct. This means that the bias term in the formula of the algorithm error (14) is wellcontrolled; one can enlarge the grid size, thus making the error to be low enough. However, the example does not show the behavior of the result if tasks’ models are inaccurate.

Despite the problems with the tasks' models, the presented algorithm clearly outperforms the quantile method in both figures, even though the latter is the most precise method among those used in today's schedulers. Even with inaccurate task models, the error of the quantile method is twice as large as the error of the proposed method. Thus we claim that the proposed method is more efficient than the others.

C. Performance notes

The only operation that is strictly constrained in its execution time (i.e. it is called by a scheduler on-line, multiple times) is the estimation of the workflow makespan; other procedures, such as the model evaluation or validation, are performed independently of the environment. Hence, it is important to make the workflow makespan estimation as fast as possible. Table II shows the execution time and the precision of the algorithm for varying vector (calculation grid) sizes. Note that the grid size determines the number of Fourier coefficients used and thus directly affects the approximation error (bias); one might therefore want to use a larger grid to improve the estimate quality. The data were collected for the test case described in section V-A (note that the size of the sample for that case is 200). The results are averaged across 1000 test runs.

TABLE II. PERFORMANCE OF THE ESTIMATION ALGORITHM (benchmarks are collected on an Intel Core i5-2500K CPU @3.30GHz, Ubuntu 14.04); t-r / t+r: runtime of the estimation procedure without / with reductions; err = max_t |F_real(t) - F_estimated(t)|.

| vector size, n | 16    | 32    | 64    | 128   | 256   |
| t-r, ms        | 32    | 45    | 68    | 121   | 200   |
| t+r, ms        | 29    | 39    | 57    | 101   | 142   |
| err            | 0.084 | 0.080 | 0.080 | 0.080 | 0.081 |

One can see that the error does not change much with the grid size; this means that the variance component of the error is larger than the bias introduced by the algorithm. Thus, there is no reason to use a larger grid unless the variance is reduced. In particular, one should not use a grid with more points than the number of observations. The main message of this section is that the execution time is typically lower than 100 ms for the optimal grid size; this allows the scheduler to use the algorithm without a significant increase of the scheduling time: the algorithm does not require much extra time compared to its competitors.

VI. CONCLUSION

The presented research pursues two objectives. On the one hand, it aims at developing and validating an efficient algorithm to estimate the workflow execution time given the estimates of its tasks' execution times. On the other hand, it aims at implementing an estimation system that can be embedded into existing schedulers. The algorithm presented in section IV-C solves the problem of estimating the workflow runtime. The tests in section V show that it gives accurate estimates under the assumption of flawless task runtime models and outperforms the quantile method (which is used, for example, in [14]) on real data. The algorithm exploits such mathematical tools as the DFT and the characteristic/distribution functions and their properties to give an accurate result at a low computational cost. The DFT's error is known to decay exponentially with the size of the calculation grid, which assures the efficiency of the method.

To sum up, the presented solution works on real data and shows better performance than the others: section V-B compares the proposed workflow algorithm to the quantile method and shows that the evaluation-validation technique allows creating acceptable task runtime models. However, the problem of the task runtime models may affect the accuracy of the workflow estimates, as it affects the accuracy of other workflow runtime estimation methods. The technique of workflow reductions has the potential to greatly reduce the computational complexity of the runtime estimation, but requires further analysis and formalization.

However, several ideas can still be introduced into the algorithm that may improve the quality of the estimates.

First, one can try to make the EDF parametric in order to improve the accuracy and reduce the number of calculations. For example, in many cases the EDF of the execution data fits the Weibull distribution very well. One significant advantage of this approach is the ability to estimate the quantiles for high probabilities (i.e. extreme values). This is important if, in a deadline-constrained setting, a scheduler requests confidence levels like 99.5% of meeting a deadline, but has fewer than one hundred executions in its logs (so the EDF has poor accuracy for the given p-value). However, this approach has its own significant issues. One issue is that the sum and the maximum of random variables should belong to the same distribution family as the variables. It seems possible to generalize the Weibull distribution so that the sum and the maximum of random variables also belong to that family of distributions, but for other distributions it may be impossible. Another issue is that in order to use some family of distributions, one should prove that all random variables (i.e. tasks' runtimes) belong to that family, and find some robustness criteria to check whether a sample satisfies them. One cannot simply say, for instance, that the execution time of any program is a random variable drawn from the Weibull distribution.

Another possible improvement is to use statistical tools such as Extreme Value Theory (EVT) (footnote 5) to obtain good estimates of the high probabilities of meeting a deadline. The problem with this idea is that the presented algorithm loses the required information on the way from the tasks to a workflow, so EVT cannot be applied after the algorithm (to its result). In this case, one has to apply EVT to the tasks and modify the algorithm to keep the obtained information for the workflow estimation. This is difficult, because it requires processing the vector data and the EVT parametric models together. One way to overcome this problem is to use a parametric model for the whole distribution (as in the previous paragraph), but to prove only that the model has the proper asymptotics (p -> 1) instead of claiming that it is the actual distribution of the sample. For example, one can use the Weibull distribution for the tasks' execution time, but use the algorithm only to give estimates of T_wf = F^{-1}(p), where p >= 0.95.

Last, one should consider ways to improve the performance of the task runtime models: the tests on the real data show that errors in the task models cause a significant decrease of accuracy in the workflow estimate. Here one can try parametric distributions, improve the evaluation/validation technique, or include machine performance models into the runtime models in order to use data from a larger set of computers.

5 EVT is a branch of statistics dealing with the extreme deviations from the median of probability distributions.
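As a small illustration of the parametric-EDF idea discussed above (our sketch, not part of the paper's implementation), one can fit a Weibull distribution to a runtime sample and read off a high quantile that an empirical CDF of the same size could not resolve:

```python
# Fit a Weibull distribution to observed runtimes and query an extreme quantile.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
runtimes = 30.0 + rng.weibull(1.8, size=80) * 45.0   # hypothetical task runtimes, seconds

shape, loc, scale = stats.weibull_min.fit(runtimes, floc=runtimes.min() - 1e-6)
deadline_995 = stats.weibull_min.ppf(0.995, shape, loc=loc, scale=scale)
print(f"estimated 99.5% quantile: {deadline_995:.1f} s")
# An 80-point empirical CDF cannot resolve the 0.995 level at all (its resolution is 1/80 = 0.0125).
```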

ACKNOWLEDGMENT

This work was partially supported by the Government of the Russian Federation, Grant 074-U01, project "Big data management for computationally intensive applications" (project #14613), and the Dutch national research program COMMIT.

REFERENCES

[1] C. Weinhardt, A. Anandasivam, B. Blau, N. Borissov, T. Meinl, W. Michalk, and J. Stößer, "Cloud Computing - A Classification, Business Models, and Research Directions," Business & Information Systems Engineering, vol. 1, no. 5, pp. 391-399, Sep. 2009.
[2] B. Sharma, R. K. Thulasiram, P. Thulasiraman, S. K. Garg, and R. Buyya, "Pricing Cloud Compute Commodities: A Novel Financial Economic Model," in 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012). IEEE, May 2012, pp. 451-457.
[3] A. Chirkin, "Execution Time Estimation in the Workflow Scheduling Problem," M.S. thesis, University of Amsterdam, the Netherlands, 2014. http://dare.uva.nl/en/scriptie/496303
[4] M. Wieczorek, A. Hoheisel, and R. Prodan, "Towards a general model of the multi-criteria workflow scheduling on the grid," Future Generation Computer Systems, vol. 25, no. 3, pp. 237-256, Mar. 2009.
[5] K. V. Knyazkov, S. V. Kovalchuk, T. N. Tchurov, S. V. Maryin, and A. V. Boukhanovsky, "CLAVIRE: e-Science infrastructure for data-driven computing," Journal of Computational Science, vol. 3, no. 6, pp. 504-510, 2012.
[6] S. V. Kovalchuk, P. A. Smirnov, S. V. Maryin, T. N. Tchurov, and V. A. Karbovskiy, "Deadline-driven Resource Management within Urgent Computing Cyberinfrastructure," Procedia Computer Science, vol. 18, pp. 2203-2212, 2013.
[7] L. Yuan, K. He, and P. Fan, "A framework for grid workflow scheduling with resource constraints," in 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet). IEEE, Apr. 2011, pp. 962-965.
[8] T. S. Somasundaram and K. Govindarajan, "CLOUDRB: A framework for scheduling and managing High-Performance Computing (HPC) applications in science cloud," Future Generation Computer Systems, vol. 34, pp. 47-65, May 2014.
[9] D. I. G. Amalarethinam and F. K. M. Selvi, "A Minimum Makespan Grid Workflow Scheduling algorithm," in 2012 International Conference on Computer Communication and Informatics. IEEE, Jan. 2012, pp. 1-6.
[10] D. Kyriazis, K. Tserpes, A. Menychtas, A. Litke, and T. Varvarigou, "An innovative workflow mapping mechanism for Grids in the frame of Quality of Service," Future Generation Computer Systems, vol. 24, no. 6, pp. 498-511, Jun. 2008.
[11] G. Falzon and M. Li, "Evaluating Heuristics for Grid Workflow Scheduling," in 2009 Fifth International Conference on Natural Computation, vol. 4. IEEE, 2009, pp. 227-231.
[12] E. Juhnke, T. Dornemann, D. Bock, and B. Freisleben, "Multi-objective Scheduling of BPEL Workflows in Geographically Distributed Clouds," in 2011 IEEE 4th International Conference on Cloud Computing. IEEE, Jul. 2011, pp. 412-419.
[13] A. Afzal, J. Darlington, and A. McGough, "Stochastic Workflow Scheduling with QoS Guarantees in Grid Computing Environments," in 2006 Fifth International Conference on Grid and Cooperative Computing (GCC'06). IEEE, 2006, pp. 185-194.
[14] N. Trebon and I. Foster, "Enabling urgent computing within the existing distributed computing infrastructure," Ph.D. dissertation, University of Chicago, Jan. 2011.
[15] S. Ichikawa and S. Takagi, "Estimating the Optimal Configuration of a Multi-Core Cluster: A Preliminary Study," in 2009 International Conference on Complex, Intelligent and Software Intensive Systems. IEEE, Mar. 2009, pp. 1245-1251.
[16] S. Ichikawa, S. Takahashi, and Y. Kawai, "Optimizing process allocation of parallel programs for heterogeneous clusters," Concurrency and Computation: Practice and Experience, vol. 21, no. 4, pp. 475-507, Mar. 2009.
[17] Y. Kishimoto and S. Ichikawa, "Optimizing the configuration of a heterogeneous cluster with multiprocessing and execution-time estimation," Parallel Computing, vol. 31, no. 7, pp. 691-710, 2005.
[18] L. F. Bittencourt and E. R. M. Madeira, "HCOC: a cost optimization algorithm for workflow scheduling in hybrid clouds," Journal of Internet Services and Applications, vol. 2, no. 3, pp. 207-227, Aug. 2011.
[19] W.-N. Chen and J. Zhang, "An Ant Colony Optimization Approach to a Grid Workflow Scheduling Problem With Various QoS Requirements," IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 39, no. 1, pp. 29-43, Jan. 2009.
[20] V. Korkhov, "Hierarchical Resource Management in Grid Computing," Ph.D. dissertation, Universiteit van Amsterdam, 2009.
[21] J. O. Gutierrez-Garcia and K. M. Sim, "A Family of Heuristics for Agent-Based Cloud Bag-of-Tasks Scheduling," in 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery. IEEE, Oct. 2011, pp. 416-423.
[22] M. Malawski, K. Figiela, M. Bubak, E. Deelman, and J. Nabrzyski, "Cost optimization of execution of multi-level deadline-constrained scientific workflows on clouds," in Parallel Processing and Applied Mathematics, ser. Lecture Notes in Computer Science, R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Waniewski, Eds. Springer Berlin Heidelberg, 2014, pp. 251-260. http://dx.doi.org/10.1007/978-3-642-55224-3_24
[23] L. Breiman, "Random forests," Department of Statistics, UC Berkeley, Tech. Rep. 567, 1999.
[24] "Pegasus workflow management system," http://pegasus.isi.edu/, accessed: 2014-05-19.
[25] V. V. Krzhizhanovskaya, N. Melnikova, A. Chirkin, S. V. Ivanov, A. Boukhanovsky, and P. M. Sloot, "Distributed simulation of city inundation by coupled surface and subsurface porous flow for urban flood decision support system," Procedia Computer Science, vol. 18, pp. 1046-1056, 2013.