OPTIMAL SCHEDULING ON PARALLEL PROCESSORS UNDER PRECEDENCE CONSTRAINTS AND GENERAL COSTS

Zhen Liu
INRIA, Centre Sophia Antipolis, 2004 Route des Lucioles, B.P. 93, 06902 Sophia-Antipolis, FRANCE

Rhonda Righter
Department of Decision and Information Sciences, Santa Clara University, Santa Clara, CA 95053, USA

October 3, 1995, revised June 15, 1996

Abstract

We consider preemptive and nonpreemptive scheduling of partially ordered tasks on parallel processors, where the precedence relations have an interval order, an in-forest, or a uniform out-forest structure. Processing times of tasks are random variables with an ILR (increasing in likelihood ratio) distribution in the nonpreemptive case and an exponential distribution in the preemptive case. We consider a general cost that is a function of time and of the uncompleted tasks and show that the Most Successors policy (MS) stochastically minimizes the cost function when it satisfies certain agreeability conditions. A consequence is that MS stochastically minimizes makespan, weighted flowtime, and weighted number of late jobs.

PARALLEL PROCESSORS, PRECEDENCE CONSTRAINTS, SCHEDULING, MAKESPAN, FLOWTIME, LATENESS.

AMS 1991 SUBJECT CLASSIFICATION: PRIMARY 90B35, SECONDARY 68M20


1 Introduction

We consider the problem of scheduling tasks on parallel processors. The execution of tasks is constrained by precedence relations represented by a task graph, which is a directed acyclic graph. A task can start execution only if all its predecessors in the task graph have completed execution. Processing requirements of tasks are independent and identically distributed (i.i.d.) random variables (r.v.'s). Three types of task graphs are considered: interval order, in-forest, and out-forest. For an interval-ordered task graph each task corresponds to an interval on the real line, and task a precedes task b if all points in the interval corresponding to task a are smaller than all points in the interval corresponding to task b. For an in-forest all tasks have at most one immediate successor, and for out-forests they have at most one immediate predecessor. We impose certain other conditions on the out-forest ordering that will be described later.

Most prior work on scheduling tasks with precedence constraints and random task processing times on parallel processors has assumed that task executions can be preempted (and resumed), that task processing times are exponentially distributed, and that the objective is to minimize makespan. When the task graph has an in-forest structure, the highest level first (HLF) policy has been shown to minimize makespan in various senses when there are two processors by Chandy and Reynolds (1975), Bruno (1985), Pinedo and Weiss (1985), and Kulkarni and Chimento (1992). Papadimitriou and Tsitsiklis (1987) showed that HLF is asymptotically optimal when there is an arbitrary fixed number of processors. Frostig (1988) showed that when task processing times are independent r.v.'s with (possibly different) ILR (increasing likelihood ratio) distributions, preemptive HLF (with ties broken by the task with the smallest hazard rate) stochastically minimizes makespan for in-forest precedences on two processors.
Coffman and Liu (1992) showed that the most successors (MS) policy stochastically minimizes makespan when precedence constraints have a special out-forest structure and processing times are i.i.d. and exponentially distributed. Liu and Sanlaville (1997) considered stochastic scheduling of interval-order task graphs and showed that MS stochastically minimizes makespan for any number of processors. They also extended results for in-forests and out-forests to variable profiles, i.e., they permitted the number of processors to vary over time.

In this paper, we consider both preemptive and nonpreemptive stochastic scheduling. We assume that the task processing requirements have a common ILR distribution when preemption is not permitted and a common exponential distribution when preemption is permitted. When preemption is not permitted we assume that once a task is assigned to a processor its processing time becomes known. Thus, when decisions are made the times at which other processors will become available are known. In the case of nonpreemptive execution, all processors are identical and have speed 1. In the case of preemptive execution, speeds of processors can be different and can vary in time according to some stochastic processes that are independent of task processing requirements. These speeds can be zero, which represents the case in which some processors are unavailable for the execution of this set of tasks during some time intervals.

We will consider a general cost process which is a function of the uncompleted tasks at any time, provided the function satisfies certain agreeability conditions. In particular, this cost function includes makespan, weighted flowtime and lateness, and weighted number of late tasks, with agreeable weights and due dates. We show that whenever a task is to be processed, the enabled task with the most successors should be processed (the MS policy) to stochastically minimize the cost process. In some cases a nonidling MS policy is optimal, where by nonidling we mean that a processor never idles if there is a task that could be done on it. A nonidling MS policy is optimal in the following cases (uniform and r-uniform will be defined later).

preemptive execution, interval-order precedence constraints, exponential task processing times, and an arbitrary number of processors;

preemptive execution, in-forest precedence constraints, exponential task processing times, and two processors;

preemptive execution, uniform (r-uniform) out-forest precedence constraints, exponential task processing times, and two (an arbitrary number of) processors;

nonpreemptive execution, in-forest precedence constraints, ILR task processing times, and two identical processors.

In other nonpreemptive cases, an idling MS policy is optimal. Such policies may idle, but when it is decided to process a task, the task with the most successors is processed.

nonpreemptive execution, interval-order precedence constraints, ILR task processing times, and an arbitrary number of processors;

nonpreemptive execution, uniform (r-uniform) out-forest precedence constraints, ILR task processing times, and two (an arbitrary number of) processors.

Thus, for nonpreemptive scheduling the optimal policy is nonidling only when there are in-forest precedence constraints. Since a nonidling MS policy does not use any information on the times at which busy processors will become available, it will also be optimal when that information is not known.

Note that the HLF policy coincides with MS for in-forests. Our results are thus extensions (for the case of i.i.d. processing requirements) to a general cost function. Our proofs are also simpler than those in earlier work. In what follows, we begin with a description of precedence graphs and the permissible cost functions. Then we show the optimality of the MS policy for nonpreemptive and preemptive scheduling problems. The appendix contains preliminaries on the likelihood ratio ordering. Throughout we will use larger, increasing, etc. in the nonstrict sense.

2 Basic Model

2.1 Precedence Constraints

We assume precedence constraints are described by a directed acyclic task graph $G = (V, E)$, where $V$ is the set of $n$ vertices, or tasks, and $E$ is the set of edges, or precedence constraints. Let $s(i) = \{j : (i, j) \in E\}$ and $p(i) = \{j : (j, i) \in E\}$ be the sets of immediate successors and predecessors, respectively, of a task $i$ in $V$, and let

$$S(i) = s(i) \cup \Big( \bigcup_{j \in s(i)} S(j) \Big)$$

be the set of all successors (including non-immediate successors) of task $i$. Thus, a task is in $S(i)$ if and only if it cannot be started until task $i$ has finished processing. We say a task is enabled if all of its predecessors have completed processing. A task cannot begin processing until it is enabled. Particular attention will be paid to the following three classes of task graphs.

Interval order: Each vertex $i$ corresponds to an interval $b_i$ in the real line such that $(i, j) \in E$ if and only if $x \in b_i$ and $y \in b_j$ imply $x < y$.

In-forest: Each vertex has at most one immediate successor: $|s(i)| \le 1$, $i \in V$.

Out-forest: Each vertex has at most one immediate predecessor: $|p(i)| \le 1$, $i \in V$.

2.2 Cost Functions

The cost rate at time $t$ is $g_t(U_t)$, where $U_t$ is the set of tasks present (uncompleted though possibly in process) at $t$, so $U_0 = V$. Let $U^i = U \cup \{i\}$. Consider the following conditions on $g_t$ and the task graph $G$. The first condition orders the tasks according to the MS policy, the second ensures that adding tasks increases the cost, and the third is an agreeability condition that says that the marginal cost of a task is larger if it has more successors.

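(A1) $|S(i)| \ge |S(j)|$ $\forall\, i, j \in V$, $i \le j$.

(A2) $g_t(U^i) \ge g_t(U)$ $\forall\, U \subseteq V$, $i \notin U$, and $\forall t$.

(A3) $g_t(U^i) - g_t(U) \ge g_t(U^j) - g_t(U)$ (i.e., $g_t(U^i) \ge g_t(U^j)$) $\forall\, U \subseteq V$, $i, j \notin U$, $i < j$, and $\forall t$.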

Note that a special case is when $g_t(U^i) - g_t(U) = c_i$ $\forall\, U \subseteq V$, $i \notin U$, and $\forall t$, with $c_i \ge c_j$ $\forall\, i, j \in V$, $i \le j$, so that our objective is weighted flowtime with agreeable weights. Another special case is when $g_t(U^i) - g_t(U) = c_i I\{d_i < t\}$ $\forall\, U \subseteq V$, $i \notin U$, with $c_i \ge c_j$ and $d_i \le d_j$ $\forall\, i, j \in V$, $i \le j$, where $d_i$ is the deadline for task $i$ and $I\{\cdot\}$ is the indicator function. For the makespan case $g(U) = 1$ if $U$ is not empty (and $0$ otherwise), so conditions (A2) and (A3) hold trivially.

In the sequel when we refer to the MS policy, we will mean the MS/GC (most successors and greatest cost) policy. That is, the enabled task with the greatest marginal cost among the tasks with the most successors is processed first on the fastest processor. Under assumptions (A1-A3), this is equivalent to processing the task with the smallest index first. For in-forest precedence constraints we will need the following additional condition, which guarantees that tasks with the same number of successors (i.e., tasks at the same level) have the same marginal costs.

(A4) If $|S(i)| = |S(j)|$ then $g_t(U^i) - g_t(U) = g_t(U^j) - g_t(U)$ (i.e., $g_t(U^i) = g_t(U^j)$) $\forall\, U \subseteq V$ and $\forall t$.
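Conditions (A1)-(A3) can be checked by brute force on a small instance. The following is a minimal sketch, assuming tasks are listed in index order and the cost rate is supplied as a function `cost(t, U)`; the function names and the example weights are illustrative and not part of the paper.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def satisfies_a1(num_successors, tasks):
    """(A1): tasks, listed in index order, have nonincreasing successor counts."""
    return all(num_successors[a] >= num_successors[b]
               for a, b in zip(tasks, tasks[1:]))


def satisfies_a2_a3(cost, tasks, times):
    """Brute-force check of (A2) and (A3) over all subsets U and sample times."""
    for t in times:
        for U in subsets(tasks):
            outside = [i for i in tasks if i not in U]
            # (A2): adding a task never lowers the cost rate.
            if any(cost(t, U | {i}) < cost(t, U) for i in outside):
                return False
            # (A3): the lower-indexed task has the larger marginal cost.
            # combinations() keeps list order, so i comes before j.
            for i, j in combinations(outside, 2):
                if cost(t, U | {i}) < cost(t, U | {j}):
                    return False
    return True


# Hypothetical example: weighted flowtime with agreeable weights, and makespan.
tasks = [1, 2, 3]
weights = {1: 3.0, 2: 2.0, 3: 1.0}


def flowtime_rate(t, U):
    return sum(weights[i] for i in U)


def makespan_rate(t, U):
    return 1.0 if U else 0.0
```

With these definitions, `satisfies_a2_a3(flowtime_rate, tasks, [0.0, 1.0])` and `satisfies_a2_a3(makespan_rate, tasks, [0.0])` both hold, whereas reversing the weights so that higher-indexed tasks cost more violates (A3).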

3 Preemptive Scheduling

In this section, we consider the preemptive scheduling problem. We will assume that speeds of processors can be different and can vary in time according to some stochastic processes that are independent of task processing requirements, the policy, and the state of the system. The speed could be 0, which would correspond to the processor being unavailable or having failed. We further assume that processors do not change speeds infinitely often during any finite time interval. Task processing times are independent and have exponential distributions. When a task is assigned to a processor, its remaining processing time is exponentially distributed with rate proportional to the speed of the processor at that time instant. The scheduling policy is allowed to preempt tasks and to idle processors when there are enabled tasks. It is also permitted to be randomized, that is, to make different decisions at an arbitrary point in time according to some probabilities. An MS policy processes, on the fastest processor, the task with the greatest marginal cost among all those enabled tasks that have the most successors.

There is an equivalent definition of interval-order graphs due to Papadimitriou and Yannakakis (1979): for all $i, j \in V$, either $S(i) \subseteq S(j)$ or $S(j) \subseteq S(i)$. Such graphs include certain series-parallel graphs. We can relabel tasks according to the MS policy, so that $S(i) \supseteq S(j)$ $\forall\, i, j \in V$ with $i \le j$. This will imply condition (A1) for our cost function.
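The successor sets $S(i)$ and the nesting characterization of interval orders lend themselves to a short computational check. The sketch below assumes the task graph is given as a list of edges `(i, j)` meaning that task `i` precedes task `j`; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations


def all_successors(edges, tasks):
    """Return S(i): all (not necessarily immediate) successors of each task."""
    immediate = {i: set() for i in tasks}
    for i, j in edges:
        immediate[i].add(j)
    memo = {}

    def visit(i):
        # S(i) = s(i) united with S(j) for every immediate successor j.
        if i not in memo:
            result = set(immediate[i])
            for j in immediate[i]:
                result |= visit(j)
            memo[i] = result
        return memo[i]

    return {i: visit(i) for i in tasks}


def is_interval_order(succ):
    """Nesting test: for every pair, one successor set contains the other."""
    return all(succ[i] <= succ[j] or succ[j] <= succ[i]
               for i, j in combinations(succ, 2))


def ms_order(succ):
    """Relabelling used by MS: tasks sorted by decreasing number of successors."""
    return sorted(succ, key=lambda i: -len(succ[i]))
```

For instance, the edges `a -> c`, `a -> d`, `b -> d` give nested successor sets, so the graph is an interval order, while the two disjoint chains `a -> c` and `b -> d` fail the test.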

Theorem 3.1 For any interval-order task graph and an arbitrary number of processors, the nonidling MS policy stochastically minimizes $\{g_t(U_t)\}_{t=0}^{\infty}$ among all preemptive policies, where $g_t(\cdot)$ satisfies conditions (A1-A3).

Proof. Because of the exponential processing times, decision points occur only when the state of the system changes, that is, when a processor changes speed or a task completes processing. We will assume that the number of times processors change speeds is bounded, say by $B$, in the whole time interval. The general case ($B = \infty$) can be obtained by a limiting argument. Our proof is by induction on the number of decision points, $T$, where we assume that the problem stops and no further costs are incurred after time $T$. The statement is trivially true for $T = 0$. Assume that it is true for $T - 1$ and consider $T$. Let $\pi$ be an arbitrary policy and let the time of the first decision point be time 0. It is easy to see by induction that $\pi$ can be improved upon if it is not an MS policy after time 0, so let us assume, without loss of generality, that it does agree with MS after time 0, but disagrees with MS at time 0. We will construct a new policy $\hat\pi$ that agrees with MS at time 0 and has stochastically smaller cost than $\pi$. Then, by induction, there is an MS policy with stochastically smaller cost than $\hat\pi$.

Case I: First suppose that at time 0 task $i$ is the enabled task with the most successors and that $\pi$ processes task $j$ and does not process task $i$ at time 0 (so $S(i) \supseteq S(j)$). (We discuss the case when $\pi$ idles or is randomized at time 0 later.) Let $\hat\pi$ process task $i$ instead of task $j$ at time 0, and let it otherwise agree with $\pi$ at time 0. Let us couple the processes of processor speed changes and the processing times on each processor so that they are the same under both policies.

Case I(a): If the event at the next decision point is not the completion of task $j$ (resp. $i$) under policy $\pi$ (resp. $\hat\pi$), then the states will be the same under both policies, and letting $\hat\pi$ agree with $\pi$ from that point on, the result follows by induction.

Case I(b): If the event at the next decision point, at time $s$ say, is the completion of task $j$ (resp. $i$) under policy $\pi$ (resp. $\hat\pi$), we let $\hat\pi$ agree with $\pi$ from that point on except that the roles of tasks $i$ and $j$ are interchanged, that is, $\hat\pi$ treats task $j$ in the same way as $\pi$ treats task $i$. Then because of our coupling of processing times and the machine profile, the states under the two policies will be the same at all times except for tasks $i$ and $j$, and task $j$ under policy $\hat\pi$ will complete at the same time as task $i$ under policy $\pi$, at time $u$ say. Recall that $S(i) \supseteq S(j)$, so that during the time interval $(s, u)$ no successor of $i$ nor of $j$ is processed under $\pi$, since task $i$ has not yet finished under $\pi$. Hence the policy $\hat\pi$ thus constructed is feasible in the sense that the precedence constraints are satisfied, since it will not be processing successors of $j$ while it is processing task $j$. With this construction, the set of uncompleted tasks, and hence the cost, will be the same under both policies before time $s$ and after time $u$. From condition (A3), at all times $t$ between times $s$ and $u$,

$$g_t(U_t) = g_t(W_t \cup \{i\}) \ge g_t(W_t \cup \{j\}) = g_t(\widehat{W}_t \cup \{j\}) = g_t(\widehat{U}_t),$$

where $U_t$ (resp. $\widehat{U}_t$) is the set of uncompleted tasks, and $W_t$ (resp. $\widehat{W}_t$) is the set of uncompleted tasks excluding tasks $i$ and $j$, at time $t$ under policy $\pi$ (resp. $\hat\pi$).

Case II: Assume now that both tasks $i$ and $j$ (with $S(i) \supseteq S(j)$) are processed under $\pi$ at time 0, but $i$ is assigned to a processor of speed $\mu_1$ and $j$ to a processor of speed $\mu_2$ with $\mu_1 < \mu_2$. Let $\hat\pi$ interchange the assignments of tasks $i$ and $j$ and otherwise agree with $\pi$ at time 0. If the event at the next decision point is not the completion of one of the tasks $i$ and $j$, then the states will be the same under both policies, and letting $\hat\pi$ agree with $\pi$ from that point on, the result follows by induction. If the event at the next decision point, at time $s$ say, is the completion of one of these two tasks, then the states under the two policies will be the same at all times before $s$. At time $s$, we couple the task that completes as follows. With probability $\mu_1/(\mu_1 + \mu_2)$ we let task $j$ complete at time $s$ under both policies (case II(a)). With probability $\mu_1/(\mu_1 + \mu_2)$ we let task $i$ complete at time $s$ under both policies (case II(b)). Finally, with probability $(\mu_2 - \mu_1)/(\mu_1 + \mu_2)$ we let task $j$ (resp. $i$) complete at time $s$ under policy $\pi$ (resp. $\hat\pi$) (case II(c)). In the first two cases the systems are identical. In the third case, case II(c), the argument proceeds as in case I(b), by letting $\hat\pi$ agree with $\pi$ from time $s$ on except that the roles of tasks $i$ and $j$ are interchanged. As in case I(b), the set of uncompleted tasks, and hence the cost, will be the same under both policies before time $s$ and after time $u$, and the cost between times $s$ and $u$ will be greater under $\pi$ than under $\hat\pi$.

Case III: If $\pi$ idles at time 0 it is easy to construct a policy $\hat\pi$ that does not idle at time 0 and that has stochastically smaller cost, using condition (A2).

Case IV: If $\pi$ is randomized at time 0, so that it makes different decisions with given probabilities, we can construct a randomized $\hat\pi$ that is coupled to the decisions of $\pi$ such that for each possible decision of $\pi$, $\hat\pi$ has stochastically smaller cost. □
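The coupling used in Case II can be sanity-checked numerically: the three coupled outcomes should sum to one and reproduce the correct completion probabilities under each policy separately. The sketch below assumes integer speeds and calls the slower and faster speeds `mu1` and `mu2`; these names, like the function names, are illustrative assumptions rather than notation taken from the paper.

```python
from fractions import Fraction


def case_ii_coupling(mu1, mu2):
    """Joint law of (task finishing first under pi, task finishing first under pi-hat).

    Under pi, task i runs at speed mu1 and task j at speed mu2 > mu1;
    pi-hat swaps the two assignments. Speeds are assumed to be integers.
    """
    assert 0 < mu1 < mu2
    total = mu1 + mu2
    return {
        ("j", "j"): Fraction(mu1, total),        # case II(a)
        ("i", "i"): Fraction(mu1, total),        # case II(b)
        ("j", "i"): Fraction(mu2 - mu1, total),  # case II(c)
    }


def marginal(joint, which):
    """Marginal law of the finishing task under pi (which=0) or pi-hat (which=1)."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[which]] = out.get(outcome[which], 0) + p
    return out


joint = case_ii_coupling(1, 3)
under_pi = marginal(joint, 0)       # task j sits on the faster processor
under_pi_hat = marginal(joint, 1)   # task i sits on the faster processor
```

In this example the marginals are 3/4 for the task on the faster processor and 1/4 for the other one, which is what two competing exponential clocks with rates in the ratio 1:3 give.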

The proof provided above is simpler than that of Liu and Sanlaville (1997), who established the optimality of the MS policy for the minimization of makespan.

Consider now in-forests. In this case all tasks have at most one immediate successor, and we say that a task is at level $l$ if it has $l$ (not necessarily immediate) successors. Thus the MS policy corresponds to HLF (highest level first), where by highest we mean largest.

Theorem 3.2 Let there be two processors. For any in-forest task graph, a nonidling MS policy stochastically minimizes $\{g(U_t)\}_{t=0}^{\infty}$ among all preemptive policies if the cost function satisfies conditions (A1-A4).

This theorem can be shown using the ideas of the proofs of theorem 3.1 above and theorem 4.1 in the next section. The basic outline of the proof is as in the proof of theorem 3.1, with the only difference being that in cases I(b) and II(c) it will now be possible for $\pi$ to process a successor of task $j$ during the interval $(s, u)$. Thus, $\hat\pi$ may not be able to agree with $\pi$ during that interval. In the proof of theorem 4.1 we show how to construct $\hat\pi$ during $(s, u)$, when this is the case, so that the cost is less under $\hat\pi$ than under $\pi$.

For out-forest task graphs, a vertex is called a root if it has no predecessors, and a vertex together with all its successors is called a subtree. We consider two special classes of such graphs, uniform and r-uniform out-forests, introduced in Coffman and Liu (1992). Let $T_1$ and $T_2$ be two out-trees. The out-tree $T_2$ is said to embed the out-tree $T_1$ if $T_1$ is isomorphic to a subgraph of $T_2$. That is, there is an injective embedding function $f$ from $T_1$ into $T_2$ such that $\forall\, u, v \in T_1$, if $v$ is a successor of $u$ then $f(v)$ is a successor of $f(u)$. If $T_2$ embeds $T_1$ and if $f(r_1) = r_2$, where $r_i$ is the root of $T_i$, then we say $f$ is a root-embedding function. An out-forest is uniform (r-uniform) if all its subtrees can be ordered by the embedding (root-embedding) function.

Theorem 3.3 For any uniform (resp. r-uniform) out-forest task graph, a nonidling MS policy stochastically minimizes $\{g(U_t)\}_{t=0}^{\infty}$ among all preemptive policies if the cost function satisfies conditions (A1-A3) and if there are two (resp. an arbitrary number of) processors.

The above theorem can be shown using the arguments of theorem 3.1 and of Coffman and Liu (1992). The detailed proof is left to the interested reader.

4 Nonpreemptive Scheduling

Now we consider nonpreemptive scheduling. We suppose in this section that as soon as a task is assigned to a processor, we learn its processing time. That is, the next time we make a decision about task assignment to an available processor, we know when the other processor will become available. For the case of in-forest task graphs the optimal policy will not depend on this information and so will still be optimal even when the information is not available. For out-forest and interval-order task graphs the optimal policy will use the information on when processors will become available.

First consider in-forest task graphs. We assume that there are two identical processors which have the same constant speed. Following Chandy and Reynolds (1975) we say that two sets of tasks satisfy $U \succeq \tilde U$ ($\tilde U$ is flatter than $U$) if $\tilde U$ has smaller size than $U$ and has at most as many tasks as $U$ at or above level $k$ for all $k$. Note that under assumptions (A2-A4) if $U \succeq \tilde U$ then $g(U) \ge g(\tilde U)$. We will show that the nonidling MS policy is optimal in this case. Then, since the nonidling MS policy does not use the information on time to processor availability, it will still be optimal even when that information is unknown.
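The flatter relation can be stated compactly in code. The following minimal sketch assumes each task's level (its number of successors) is given in a dictionary `level`, and it reads "smaller" and "at most" in the nonstrict sense used throughout the paper; the function names are illustrative.

```python
def count_at_or_above(levels, k):
    """Number of tasks whose level is at least k."""
    return sum(1 for lv in levels if lv >= k)


def is_flatter(u_tilde, u, level):
    """True if u_tilde is flatter than u, given level[task] for every task."""
    lt = [level[i] for i in u_tilde]
    lu = [level[i] for i in u]
    # Size condition: u_tilde has no more tasks than u.
    if len(lt) > len(lu):
        return False
    # Level condition: no more tasks at or above level k, for every k.
    top = max(lt + lu, default=0)
    return all(count_at_or_above(lt, k) <= count_at_or_above(lu, k)
               for k in range(top + 1))
```

As a usage example with hypothetical levels `{"i": 3, "j": 1, "a": 0}`, the set `{"j", "a"}` is flatter than `{"i", "a"}` but not the other way round, which mirrors the comparison between the two policies while tasks `i` and `j` are interchanged.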

Theorem 4.1 Assume there are two identical processors. For any in-forest task graph, a nonidling MS policy stochastically minimizes $\{g(U_t)\}_{t=0}^{\infty}$ among nonpreemptive policies if the cost function satisfies conditions (A1-A4) and if processing times are i.i.d. and ILR.
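To make the policy in the theorem concrete, here is a minimal simulation sketch of nonpreemptive, nonidling MS scheduling on an in-forest. It assumes processing times have already been sampled and are passed in as a dictionary, and that the in-forest is encoded by mapping each task to its single immediate successor (or `None`); the function name and data layout are illustrative and not taken from the paper.

```python
import heapq


def nonidling_ms_makespan(successor, proc_time, num_procs=2):
    """Nonpreemptive, nonidling MS list scheduling on an in-forest.

    successor[i] is the single immediate successor of task i, or None.
    proc_time[i] is the (already sampled) processing time of task i.
    Returns (makespan, completion_times).
    """
    def level(i):
        # In an in-forest, the number of successors is the path length to the root.
        n = 0
        while successor[i] is not None:
            i = successor[i]
            n += 1
        return n

    pending_preds = {i: 0 for i in successor}
    for i, s in successor.items():
        if s is not None:
            pending_preds[s] += 1

    # Enabled tasks, highest level first; str(i) only breaks ties.
    enabled = [(-level(i), str(i), i) for i in successor if pending_preds[i] == 0]
    heapq.heapify(enabled)
    running = []  # heap of (finish_time, tie_break, task)
    now, done = 0.0, {}
    while enabled or running:
        # Nonidling: start enabled tasks while a processor is free.
        while enabled and len(running) < num_procs:
            _, _, i = heapq.heappop(enabled)
            heapq.heappush(running, (now + proc_time[i], str(i), i))
        now, _, i = heapq.heappop(running)
        done[i] = now
        s = successor[i]
        if s is not None:
            pending_preds[s] -= 1
            if pending_preds[s] == 0:
                heapq.heappush(enabled, (-level(s), str(s), s))
    return now, done


# Hypothetical in-tree: a and b precede c; c and d precede e; unit processing times.
tree = {"a": "c", "b": "c", "c": "e", "d": "e", "e": None}
makespan, completion = nonidling_ms_makespan(tree, {t: 1.0 for t in tree})
```

On this example the two level-2 tasks start first and the makespan is 3.0, which matches the length of the longest chain. The sketch never looks at the remaining processing times of busy processors, in line with the remark that the nonidling MS policy does not need that information.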

Proof. For this proof when we refer to an MS policy we mean a nonidling MS policy.

Let $s(i) = s_1(i)$ be the immediate successor of task $i$, let $s_2(i) = s(s(i))$, etc. For time $t$ let $U_t$ (resp. $\widehat U_t$, $U_t^{MS}$) be the set of uncompleted tasks for policy $\pi$ (resp. $\hat\pi$, MS).

Our proof is by induction on the number of tasks, $n$. We will prove that for an arbitrary policy $\pi$, we can couple the sets of uncompleted tasks under $\pi$ and under MS for all time $t$ so that $U_t \succeq U_t^{MS}$ with probability 1. The result is trivially true for 1 task. Assume that it holds for $n - 1$ and consider $n$. Let $\pi$ be a policy that disagrees with MS for the first assignment. It is easy to see by induction that $\pi$ can be improved upon if it does not agree with MS after the first assignment, so let us assume that $\pi$ agrees with MS after the first assignment. We will construct a new policy $\hat\pi$ that agrees with MS for the first assignment and such that $U_t \succeq \widehat U_t$ with probability 1. Then by induction $\widehat U_t \succeq U_t^{MS}$ with probability 1, so the result will then follow. If $\pi$ is randomized, we can construct a policy $\hat\pi$ for each of its decisions, so let us consider just one of its decisions.

Suppose that at time 0 task $i$ is the enabled task with the most successors (is at the highest level), and that $\pi$ processes task $j \ne i$ instead of task $i$ at time 0, on processor P1 say, and processes task $i$ at time $d > 0$, on processor P2 say. Processors P1 and P2 may be the same. The case in which $\pi$ idles will be analyzed later. Without loss of generality because of condition (A4), assume that $|S(i)| > |S(j)|$. Let $\hat\pi$ process task $i$ at time 0 on processor P1 and task $j$ at time $d$ on processor P2 and let it otherwise agree with $\pi$ whenever possible.

We couple the processing times of the tasks other than $i$ and $j$ so that the $k$th task to be assigned under $\pi$ has the same processing time as the $k$th task to be assigned under $\hat\pi$. Let $T(l)$ (resp. $\hat T(l)$) be the completion time of task $l$ under policy $\pi$ (resp. $\hat\pi$). We couple $(T(i), T(j))$ with $(\hat T(i), \hat T(j))$ as follows. Let us generate a processing time, $X$, which is the potential processing time of the task assigned at time 0 for both policies. Let us also generate $d$, given $X$, which will be the same for both policies. Refer to figure 1 for the Gantt chart showing the coupling described below, where $S$ is the system under $\pi$ and $S'$ is the system under $\hat\pi$.

Case I: If $X \le d$, we let $X$ be the actual processing time of the task assigned to processor P1 at time 0 for both policies. We also couple the processing time of the task assigned at time $d$ so that it is the same under both policies. Note that in this case $\hat T(i) = T(j) \le T(i) = \hat T(j)$.

Case II: If $X > d$, let $R = X - d$ be the remaining processing time of the task assigned at time 0, let $Y$ be the processing time of the task assigned at time $d$ under policy $\pi$, and let $\hat R$ and $\hat Y$ be similarly defined for $\hat\pi$. From corollary 6.3 we can couple $(R, Y)$ with $(\hat R, \hat Y)$ such that $\min(R, Y) = \min(\hat R, \hat Y)$, $\max(R, Y) = \max(\hat R, \hat Y)$, and $\hat R \le Y$. Indeed, as shown in the comments after corollary 6.3, either $R = \hat Y$ and $Y = \hat R$, or $R = \hat R \le Y = \hat Y$. In the first scenario, in which the realizations of the random variables are interchanged, $T(i) = \hat T(i)$ and $T(j) = \hat T(j)$ (subcases (a) and (b) in figure 1). In the second scenario, in which $R = \hat R \le Y = \hat Y$, the task assigned at time 0 will complete before the task assigned at time $d$ under both policies, i.e., $\hat T(i) = T(j) \le T(i) = \hat T(j)$ (subcase (c) in figure 1).

Thus, combining cases I and II, we have coupled our systems so that task completions occur at the same times and either the same task completes at the same time in both systems (so the interchange has no effect), or the task that is started first completes first. That is, either $T(i) = \hat T(i)$ and $T(j) = \hat T(j)$ ($X > d$, $R = \hat Y$, $\hat R = Y$; cases II(a) and (b)) or $\hat T(i) = T(j) \le T(i) = \hat T(j)$ (($X \le d$) or ($X > d$, $R = \hat R \le Y = \hat Y$); cases I and II(c)). If $T(i) = \hat T(i)$ and $T(j) = \hat T(j)$ then letting $\hat\pi$ agree with MS (and $\pi$), the sets of uncompleted tasks will be identical at all times.

If $s := \hat T(i) = T(j) \le T(i) = \hat T(j) =: u$ (case II(c)) then, except for the interchanged tasks $i$ and $j$, before time $s$ we let $\hat\pi$ agree with $\pi$, and between times $s$ and $u$ we let $\hat\pi$ agree with $\pi$ if possible (in the sense that precedence constraints are satisfied). We define $\hat\pi$ below for the times at which it cannot agree with $\pi$. Our definition of $\hat\pi$ therefore assumes that $\hat\pi$ can distinguish between cases II(b) and II(c) in figure 1, since in case II(b) $\hat\pi$ simply follows the MS policy. We can think of $\hat\pi$ as a randomized policy as follows. If either case II(b) or II(c) occurs, that is, when task $i$ completes task $j$ is still being processed under $\hat\pi$, policy $\hat\pi$ randomly decides, according to a Bernoulli variable, $B$, whether to follow the MS policy (case II(b), $B = 1$), or to follow the policy described below for case II(c) ($B = 0$). We couple the value of $B$ with the case that occurs, so that $B = 1$ if case II(b) occurs and $B = 0$ if case II(c) occurs. The only effect of this coupling on $\hat\pi$ is that now knowing $B$ will give $\hat\pi$ information on the remaining processing time of job $j$, but that does not matter since we assume that $\hat\pi$ already knows the remaining processing time of job $j$. Also $\hat\pi$ as described below may idle, but by induction there will then be a nonidling MS policy that has stochastically smaller cost than $\hat\pi$.

Since $\pi$ agrees with MS after the first assignment, if $\hat\pi$ knows it is in case II(c), it will know what $\pi$ will do. In case II(c), if $\hat\pi$ can agree with $\pi$ between times $s$ and $u$ it will, and at time $u$ the systems under both policies will be in the same state. Before time $s$ and after time $u$ the sets of uncompleted tasks under both policies will be the same. At all times $t$ between times $s$ and $u$,

$$U_t = W_t \cup \{i\} \succeq W_t \cup \{j\} = \widehat{W}_t \cup \{j\} = \widehat{U}_t,$$

where $W_t$ (resp. $\widehat{W}_t$) is the set of uncompleted tasks excluding tasks $i$ and $j$ at time $t$ under policy $\pi$ (resp. $\hat\pi$).

Suppose that due to precedence constraints, $\hat\pi$ cannot agree with $\pi$ at some time between times $s$ and $u$ in case II(c). We will construct $\hat\pi$ so that $\pi$ and $\hat\pi$ have task completions at the same times, and therefore have the same number of uncompleted tasks, and so that the set of uncompleted tasks is flatter under $\hat\pi$ than under $\pi$. Let $r$ be the first time at which $\hat\pi$ cannot agree with $\pi$. It must be that at time $r$, $\pi$ begins processing the immediate successor to task $j$, $s_1(j)$, on processor P1 say. Since $\pi$ agrees with MS after time 0, task $s_1(j)$ must be the enabled task with the most successors (at the highest level) under $\pi$.
In particular, task $i$ must be in progress under $\pi$ on the other processor, call it P2, and all tasks at the level of task $i$ or higher must have completed (because there are at most two processors and one is available to process task $s_1(j)$). Since $\hat\pi$ has agreed with $\pi$ until time $r$ except for the interchange of tasks $i$ and $j$, and since task $i$ has completed under $\hat\pi$, the immediate successor to task $i$, $s_1(i)$, must be enabled under $\hat\pi$. We let $\hat\pi$ process task $s_1(i)$ on P1 while $\pi$ is processing task $s_1(j)$, so the processing time of task $s_1(i)$ under $\hat\pi$ equals the processing time of task $s_1(j)$ under $\pi$. Refer to figure 2. Recall that $s_1(i)$ has more successors than $s_1(j)$, i.e., $s_1(i) < s_1(j)$, because $i$ has more successors than $j$. If $\hat\pi$ cannot agree with $\pi$ again between times $r$ and $u$ then it must be when $\pi$ processes $s_2(j)$, in which case $\hat\pi$ will process $s_2(i)$, and $s_2(i) < s_2(j)$. We can continue this argument until time $u$. Thus, between times $s$ and $u$, $U_t \succeq \widehat U_t$.

Suppose that $\pi$ (resp. $\hat\pi$) assigns $k$ successors of task $j$ (resp. $i$) on processor P1 between times $s$ and $u$. After time $u$ we let $\hat\pi$ process task $s_l(j)$ whenever $\pi$ processes task $s_l(i)$, $l = 1, \ldots, k$, and otherwise agree with $\pi$. Recall that $T(l)$ (resp. $\hat T(l)$) is the completion time of task $l$ under policy $\pi$ (resp. $\hat\pi$), and except for the first $k$ successors of $i$ and $j$, $T(l) = \hat T(l)$. If $T(s_k(j)) = \hat T(s_k(i)) \le u$ then $T(s_l(j)) = \hat T(s_l(i)) \le T(s_l(i)) = \hat T(s_l(j))$ for $l = 1, \ldots, k$, and so for $t \ge s$, $U_t \succeq \widehat U_t$ and we are done. If task $s_k(j)$ (resp. $s_k(i)$) is still in progress on processor P1 at time $u$ under $\pi$ (resp. $\hat\pi$), then, as long as P1 remains busy with task $s_k(j)$ (resp. $s_k(i)$), $\pi$ (resp. $\hat\pi$) will assign tasks $s_1(i), s_2(i), \ldots, s_k(i)$ (resp. $s_1(j), s_2(j), \ldots, s_k(j)$), in that order, to processor P2. Refer again to figure 2. If $s_k(j)$ (resp. $s_k(i)$) completes on P1 before task $s_k(i)$ (resp. $s_k(j)$) is assigned to P2 under policy $\pi$ (resp. $\hat\pi$), then we again have $U_t \succeq \widehat U_t$ for $t \ge u$. Otherwise, suppose at time $v$ policy $\pi$ (resp. $\hat\pi$) assigns task $s_k(i)$ (resp. $s_k(j)$) to P2 and task $s_k(j)$ (resp. $s_k(i)$) has been in progress since time $u$ on P1 and has not yet completed. Then we can do the same coupling construction that we did for the processing times of tasks $i$ and $j$ under policies $\pi$ and $\hat\pi$ in case II, using lemma 6.2, so that either $T(s_k(i)) = \hat T(s_k(i))$ and $T(s_k(j)) = \hat T(s_k(j))$, or $\hat T(s_k(i)) = T(s_k(j)) \le T(s_k(i)) = \hat T(s_k(j))$. (The latter case is shown in figure 2.) In both cases we again have $U_t \succeq \widehat U_t$ for $t \ge u$.

Now we show that the optimal policy is nonidling. Suppose a processor, say processor 1, is idle and available at time 0, but that an MS policy $\pi$ does not assign task $i$ until some time $d > 0$. If processor 2 is unavailable or processing a task with fewer successors than $i$ at time 0 then task $i$ will be the first task to be assigned since $\pi$ is an MS policy. If processor 2 is processing a task with more successors than $i$ at time 0 then when that task completes it will enable at most one task with more successors than $i$ because of the in-forest structure of precedences.
This new enabled task, and similarly any successors that are assigned before time $d$, can therefore be assigned to processor 2, and we may assume, without loss of generality, that processor 1 remains idle until time $d$, at which time $\pi$ assigns task $i$ to processor 1. Let $\hat\pi$ assign task $i$ to processor 1 at time 0, and let it agree with $\pi$ for assignments to processor 2 until time $d$. Let $X$ be the processing time of task $i$ under both policies, and let $\hat\pi$ idle processor 1 from time $X$ to time $d + X$, and let it otherwise agree with $\pi$. Then at time $d + X$ the systems under both policies will be in the same state, and before that time $\hat\pi$ will have smaller cost by assumption (A2). By the induction hypothesis there will be a nonidling MS policy with smaller cost than policy $\hat\pi$, so we are done. □

When we have out-forests or interval-order task graphs and preemption is not permitted it may be desirable to idle. For example, consider the interval-order task graph of figure 3. If task 1 is being processed and another processor is available, we may prefer to wait until task 1 has finished and then assign tasks 2 and 3 to processors, rather than assigning task 6 now. This is especially true if the other processor will become available soon. Using the basic approach of the proof above, we can show for out-forests and interval-order task graphs that whenever it is decided not to idle, a task should be assigned according to the MS policy, assuming that we know the processing time of a task as soon as it is assigned to a processor. That is, an idling MS policy is optimal. However, this policy, and in particular the decision on whether or not to idle at a certain time, will depend on the time at which other processors become available. Therefore, unlike the in-forest case above, we cannot conclude that an idling MS policy is still optimal when we do not have information on when processors will become available.

Theorem 4.2 For any interval-order task graph and an arbitrary number of identical processors, and for any uniform (resp. r-uniform) out-forest task graph and two (resp. an arbitrary number of) processors, an idling MS policy stochastically minimizes $\{g(U_t)\}_{t=0}^{\infty}$ among nonpreemptive policies if the cost function satisfies conditions (A1-A3), processing times are i.i.d. and ILR, and the times at which processors will next become available are known.

Proof. The basic outline of the proof is as in the proof of theorem 4.1, with $\pi$ and $\hat{\pi}$ as defined there. Again, because of the ILR processing times, we can couple the processing times of the interchanged tasks $i$ and $j$ so that either the interchange has no effect or the task that is started first completes first. If the interchange has no effect then both policies will be in the same state and we are done. If the task that is started first completes first, we can argue as we did in case I(b) in the proof of theorem 3.1 to get the result for interval-order task graphs. The result for out-forest task graphs follows along the lines of the proof of Coffman and Liu (1992). □

5 Acknowledgement

The presentation of our results has benefitted greatly from helpful feedback from the referee.

6 Appendix: Likelihood Ratio Ordering and ILR random variables

We say that the r.v. $X$ is larger than the r.v. $Y$ in the stochastic ordering sense, $X \ge_{st} Y$, if $F(t) \le G(t)$ for all $t$, where $F(t)$ and $G(t)$ are the cumulative distribution functions of $X$ and $Y$, respectively. We say that $X$ is larger than $Y$ in the likelihood ratio ordering, $X \ge_{lr} Y$, if $f(t)/g(t)$ is increasing in $t$, where $f(t)$ and $g(t)$ are the densities or probability mass functions of $X$ and $Y$ respectively. We have the following results. For lemma 6.1, see e.g., Ross (1983), and for lemma 6.2 and corollary 6.3 see Righter (1994).
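For distributions on a common finite support, both orderings can be checked directly from the definitions. The sketch below is an illustration only; the function names `st_larger` and `lr_larger` and the representation of a probability mass function as a list indexed by the support points 0, 1, ..., n-1 are assumptions made here. The likelihood ratio test is written in cross-multiplied form, f(s) g(t) <= f(t) g(s) for s < t, so that zero probabilities cause no division problems.

```python
from fractions import Fraction
from typing import Sequence


def st_larger(f: Sequence[Fraction], g: Sequence[Fraction]) -> bool:
    """True if X >=_st Y, i.e. F(t) <= G(t) at every support point t."""
    cum_f = Fraction(0)
    cum_g = Fraction(0)
    for ft, gt in zip(f, g):
        cum_f += ft
        cum_g += gt
        if cum_f > cum_g:
            return False
    return True


def lr_larger(f: Sequence[Fraction], g: Sequence[Fraction]) -> bool:
    """True if X >=_lr Y, i.e. f(t)/g(t) is nondecreasing in t."""
    n = len(f)
    return all(f[s] * g[t] <= f[t] * g[s]
               for s in range(n) for t in range(s + 1, n))


sixth = Fraction(1, 6)
f = [1 * sixth, 2 * sixth, 3 * sixth]   # pmf of X on {0, 1, 2}
g = [3 * sixth, 2 * sixth, 1 * sixth]   # pmf of Y on {0, 1, 2}
print(st_larger(f, g), lr_larger(f, g))  # True True
```

The likelihood ratio ordering is the stronger of the two: a pair such as f = (1/4, 1/4, 1/2), g = (1/4, 1/2, 1/4) is stochastically ordered, but the ratio f/g is not monotone.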

Lemma 6.1 $X \ge_{st} Y$ if and only if there are $X'$ and $Y'$ defined on the same probability space such that $X$ and $X'$ (resp. $Y$ and $Y'$) have the same distribution and $X' \ge Y'$ almost surely (a.s.).
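One standard way to realize such a pair on a common probability space is to feed a single uniform random number through both inverse distribution functions. The sketch below illustrates this for two exponential distributions; the choice of exponential distributions, the rates 1 and 2, and the helper names `exp_quantile` and `coupled_pair` are assumptions made for the example and are not taken from the paper.

```python
import math
import random


def exp_quantile(u: float, rate: float) -> float:
    """Inverse distribution function of an exponential r.v. with the given rate."""
    return -math.log(1.0 - u) / rate


def coupled_pair(u: float, rate_x: float, rate_y: float):
    """Build (X', Y') from one shared uniform draw u in [0, 1)."""
    return exp_quantile(u, rate_x), exp_quantile(u, rate_y)


# X has rate 1 and Y has rate 2, so X is stochastically larger than Y.
rng = random.Random(0)
pairs = [coupled_pair(rng.random(), rate_x=1.0, rate_y=2.0) for _ in range(1000)]

# Each coordinate has the right marginal distribution, and X' >= Y' on every draw.
print(all(x >= y for x, y in pairs))  # True
```

The ordering holds on every sample path, not just on average, which is what the coupling arguments in the proofs rely on.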

Lemma 6.2 Assume $X$ and $Y$ are independent. $X \ge_{lr} Y$ if and only if $(X \mid \min(X, Y) = m, \max(X, Y) = M) \ge_{st} (Y \mid \min(X, Y) = m, \max(X, Y) = M)$ for all $m \le M$. In other words, given $m = \min(X, Y)$ and $M = \max(X, Y)$, we have that $X \ge_{lr} Y$ if and only if $P\{X = m \mid m, M\} = P\{Y = M \mid m, M\} \le P\{X = M \mid m, M\} = P\{Y = m \mid m, M\}$.

Let $X_t = (X - t \mid X > t)$. We say that $X$ is increasing in likelihood ratio (ILR) if $X_s \ge_{lr} X_t$ for all $s \le t$. Exponentially distributed random variables are, of course, ILR (and DLR). A random variable that is ILR is also IHR (increasing in hazard rate). We have the following corollary, which follows from lemma 6.2.

Corollary 6.3 Assume $X^1$ and $X^2$ are i.i.d. and ILR, and $\hat{X}^1$ and $\hat{X}^2$ are i.i.d. with the same distributions as $X^1$ and $X^2$. Then we can couple $(X_t^1, X^2)$ with $(\hat{X}_t^1, \hat{X}^2)$ so that $\min(X_t^1, X^2) = \min(\hat{X}_t^1, \hat{X}^2)$, $\max(X_t^1, X^2) = \max(\hat{X}_t^1, \hat{X}^2)$, and $\hat{X}_t^1 \le X^2$.

Let $p = P\{X_t^1 = M \mid \min(X_t^1, X^2) = m, \max(X_t^1, X^2) = M\}$. Then our corollary says that given $m = \min(X_t^1, X^2) = \min(\hat{X}_t^1, \hat{X}^2)$ and $M = \max(X_t^1, X^2) = \max(\hat{X}_t^1, \hat{X}^2)$, we can further couple $(X_t^1, X^2)$ with $(\hat{X}_t^1, \hat{X}^2)$ so that one of the following three cases occurs.

$\hat{X}_t^1 = X^2 = m$ and $\hat{X}^2 = X_t^1 = M$, with probability $p$,

$\hat{X}_t^1 = X^2 = M$ and $\hat{X}^2 = X_t^1 = m$, with probability $p$,

$\hat{X}_t^1 = X_t^1 = m$ and $\hat{X}^2 = X^2 = M$, with probability $1 - 2p$.

Thus, our random variables can be coupled so that either $\hat{X}_t^1 = X^2$ and $\hat{X}^2 = X_t^1$, or $\hat{X}_t^1 = X_t^1 \le \hat{X}^2 = X^2$.
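The three-case coupling can be tabulated and checked for a concrete distribution. The sketch below is an illustration rather than part of the original argument. It assumes the ILR density f(x) = x e^{-x} (chosen here only as an example of a log-concave density), integer values of m, M and t with m < M, and hypothetical function names. For independent variables with densities a (for the residual time) and b (for the fresh time), the conditional probability that the residual time is the larger one is a(M) b(m) / (a(M) b(m) + a(m) b(M)); here a(x) is proportional to f(x + t) and b = f, and the exponential factors cancel.

```python
from fractions import Fraction


def swap_probability(m: int, M: int, t: int) -> Fraction:
    """p = P{residual time = M | min = m, max = M} for f(x) = x * exp(-x)."""
    numerator = (M + t) * m
    return Fraction(numerator, numerator + (m + t) * M)


def coupling_table(m: int, M: int, p: Fraction):
    """Rows of (probability, X_t^1, X^2, Xhat_t^1, Xhat^2) for the three cases."""
    return [
        (p, M, m, m, M),          # realizations interchanged
        (p, m, M, M, m),          # realizations interchanged
        (1 - 2 * p, m, M, m, M),  # same realizations in both systems
    ]


p = swap_probability(m=1, M=2, t=1)
rows = coupling_table(1, 2, p)
print(p)  # 3/7

# Both systems give the residual time the value M with probability p,
# and Xhat_t^1 <= X^2 holds in every row.
print(sum(r[0] for r in rows if r[1] == 2) == p)  # True
print(sum(r[0] for r in rows if r[3] == 2) == p)  # True
print(all(r[3] <= r[2] for r in rows))            # True
```

Because p is at most 1/2 for an ILR distribution, the third probability 1 - 2p is nonnegative, so the table is a valid joint distribution.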

References

[1] J. Bruno (1985). On scheduling tasks with exponential service times and in-tree precedence constraints. Acta Informatica. 22: 139-148.

[2] K.M. Chandy and P.F. Reynolds (1975). Scheduling partially ordered tasks with probabilistic execution times. Operating System Review. 9: 169-177.

[3] E.G. Coffman, Jr. and Z. Liu (1992). On the optimal stochastic scheduling of out-forests. Opns. Res. 40: S67-S75.

[4] E. Frostig (1988). A stochastic scheduling problem with intree precedence constraints. Opns. Res. 36: 937-943.

[5] V.G. Kulkarni and P.F. Chimento, Jr. (1992). Optimal scheduling of exponential tasks with in-tree precedence constraints on two parallel processors subject to failure and repair. Opns. Res. 40: S263-271.

[6] Z. Liu and E. Sanlaville (1997). Stochastical scheduling with variable profile and precedence constraints. To appear in SIAM J. on Computing, 26 (1).

[7] C.H. Papadimitriou and J.N. Tsitsiklis (1987). On stochastic scheduling with in-tree precedence constraints. SIAM J. Comp. 16: 1-6.

[8] C.H. Papadimitriou and M. Yannakakis (1979). Scheduling interval-ordered tasks. SIAM J. Comp. 8: 405-409.

[9] M. Pinedo and G. Weiss (1985). Scheduling jobs with exponentially distributed processing times and intree precedence constraints on two parallel machines. Opns. Res. 33: 1381-1388.

14

Abstract We consider preemptive and nonpreemptive scheduling of partially ordered tasks on parallel processors, where the precedence relations have an interval order, an in-forest, or a uniform out-forest structure. Processing times of tasks are random variables with an ILR (increasing in likelihood ratio) distribution in the nonpreemptive case and an exponential distribution in the preemptive case. We consider a general cost that is a function of time and of the uncompleted tasks and show that the Most Successors policy (MS) stochastically minimizes the cost function when it satis es certain agreeability conditions. A consequence is that MS stochastically minimizes makespan, weighted owtime, and weighted number of late jobs. PARALLEL PROCESSORS, PRECEDENCE CONSTRAINTS, SCHEDULING, MAKESPAN, FLOWTIME, LATENESS. AMS 1991 SUBJECT CLASSIFICATION: PRIMARY 90B35, SECONDARY 68M20

0

1 Introduction We consider the problem of scheduling tasks on parallel processors. The execution of tasks are constrained by precedence relations represented by a task graph which is a directed acyclic graph. A task can start execution only if all its predecessors in the task graph have completed execution. Processing requirements of tasks are independent and identically distributed (i.i.d.) random variables (r.v.'s). Three types of task graphs will be under consideration: interval order, in-forest, out-forest. For an interval-ordered task graph each task corresponds to an interval on the real line, and task a precedes task b if all points in the interval corresponding to task a are smaller than all points in the interval corresponding to task b. For an in-forest all tasks have at most one immediate successor, and for out-forests they have at most one immediate predecessor. We impose certain other conditions on the out-forest ordering that will be described later. Most prior work for scheduling tasks with precedence constraints and random task processing times on parallel processors has assumed that task executions can be preempted (and resumed), task processing times exponentially distributed, and the objective is to minimize makespan. When the task graph has an in-forest structure, the highest level rst - HLF policy has been shown to minimize makespan in various senses when there are two processors by Chandy and Reynolds (1975), Bruno (1985), Pinedo and Weiss (1985), Kulkarni and Chimento (1992). Papadimitriou and Tsitsiklis (1987) showed that HLF is asymptotically optimal when there are arbitrarily xed number of processors. Frostig (1988) showed that when task processing times are independent r.v. with (possibly dierent) ILR (increasing likelihood ratio) distributions, preemptive HLF (with ties broken by the task with the smallest hazard rate) stochastically minimizes makespan for in-forest precedences on two processors. 
Coman and Liu (1992) showed that the most successors (MS) policy stochastically minimizes makespan when precedence constraints have a special out-forest structure and processing times are i.i.d. and exponentially distributed. Liu and Sanlaville (1997) considered stochastic scheduling of interval-order task graphs and showed that MS stochastically minimizes makespan for any number of processors. They also extended results for in-forests and out-forests to variable pro les, i.e., they permitted the number of processors to vary over time. In this paper, we consider both preemptive and nonpreemptive stochastic scheduling. We assume that the task processing requirements have a common ILR distribution when preemption is not permitted and a common exponential distribution when preemption is permitted. When preemption is not permitted we assume that once a task is assigned to a processor its processing time becomes known. Thus, when decisions are made the times at which other processors will become available are known. In the case of nonpreemptive execution, all processors are identical and have speed 1. In 1

the case of preemptive execution, speeds of processors can be dierent and can vary in time according to some stochastic processes which are independent of task processing requirements. These speeds can be zero which represents the case that some processors are unavailable for the execution of this set of tasks during some time intervals. We will consider a general cost process which is a function of the uncompleted tasks at any time, provided the function satis es certain agreeability conditions. In particular, this cost function includes makespan, weighted owtime and lateness and weighted number of late tasks, with agreeable weights and due dates. We show that whenever a task is to be processed, the enabled task with the most successors should be processed (the MS policy) to stochastically minimize the cost process. In some cases a nonidling MS policy is optimal, where by nonidling we mean that a processor never idles if there is a task that could be done on it. A nonidling MS policy is optimal in the following cases (uniform and r-uniform will be de ned later).

preemptive execution, interval-order precedence constraints, exponential task processing times, and an arbitrary number of processors;

preemptive execution, in-forest precedence constraints, exponential task processing times, and two processors;

preemptive execution, uniform (r-uniform) out-forest precedence constraints, exponential task processing times, and two (an arbitrary number of) processors;

nonpreemptive execution, in-forest precedence constraints, ILR task processing times, and two identical processors.

In other nonpreemptive cases, an idling MS policy is optimal. Such policies may idle, but when it is decided to process a task, the task with the most successors is processed.

nonpreemptive execution, interval-order precedence constraints, ILR task processing times, and an arbitrary number of processors;

nonpreemptive execution, uniform (r-uniform) out-forest precedence constraints, ILR task processing times, and two (an arbitrary number of) processors.

Thus, for nonpreemptive scheduling the optimal policy is nonidling only when there are in-forest precedence constraints. Since a nonidling MS policy does not use any information on the times at which busy processors will become available, it will also be optimal when that information is not known. 2

Note that the HLF policy coincides with MS for in-forests. Our results are thus extensions (for the case of i.i.d. processing requirements) to a general cost function. Our proofs are also simpler than those in earlier work. In what follows, we begin with a description of precedence graphs and the permissible cost functions. Then we show the optimality of the MS policy for nonpreemptive and preemptive scheduling problems. The appendix contains preliminaries on the likelihood ratio ordering. Throughout we will use larger, increasing, etc. in the nonstrict sense.

2 Basic Model 2.1

Precedence Constraints

We assume precedence constraints are described by a directed acyclic task graph G = (V; E ), where V is the set of n vertices, or tasks, and E is the set of edges, or precedence constraints. Let s(i) = fj : (i; j ) 2 E g and p(i) = fj : (j; i) 2 E g be the set of immediate successors and predecessors respectively of a task i in V , and let

[ [

S (i) = s(i) (

j si 2

S (j ))

( )

be the set of all successors (including non-immediate successors) of task i. Thus, a task is in S (i) if and only if it cannot be started until task i has nished processing, We say a task is enabled if all of its predecessors have completed processing. A task cannot begin processing until it is enabled. Particular attention will be paid to the following three classes of task graphs.

Interval order : Each vertex i corresponds to an interval bi in the real line such that (i; j ) 2 E if and only if x 2 bi and y 2 bj imply x < y. In-forest : Each vertex has at most one immediate successor: js(i)j 1, i 2 V . Out-forest : Each vertex has at most one immediate predecessor: jp(i)j 1, i 2 V . 2.2

Cost Functions

The cost rate at time t is gt (Ut ) where Ut is the set of tasks present (uncompleted though possibly in process) at t, so U0 = V . Let U i = U [ fig. Consider the following conditions on 3

gt and the task graph G. The rst condition orders the tasks according to the MS policy, the second insures that adding tasks increases the cost, and the third is an agreeability condition that says that the marginal cost of a task is larger if it has more successors.

(A1) (A2) (A3)

jS (i)j jS (j )j 8 i; j 2 V; i j: gt (U i ) gt (U ) 8U V; i 2 U; and 8t: gt (U i ) ? gt (U ) gt (U j ) ? gt(U ) (i.e., gt (U i) gt (U j )) 8U V; i; j 2= U; i < j , and 8t:

Note that a special case is when gt (U i ) ? gt (U ) = ci 8U V; i 2= U , and ci cj 8 i; j 2 V; i j , and 8t, so our objective is weighted owtime with agreeable weights. Another special case is when gt (U i ) ? gt (U ) = ci I fdi < tg; 8U V; i 2= U , and ci cj and di dj 8 i; j 2 V; i j , where di is the deadline for task i, and I fg is the indicator function. For the makespan case g(U ) = 1 if U is not empty, so conditions (A2) and (A3) hold trivially. In the sequel when we refer to the MS policy, we will mean the MS/GC (most successors and greatest cost) policy. That is, the enabled task with the greatest marginal cost among the tasks with the most successors is processed rst on the fastest processor. Under assumptions (A1-A3), this is equivalent to processing the task with the smallest index rst. For in-forest precedence constraints we will need the following additional condition which guarantees that tasks with the same number of successors (i.e., they are at the same level) have the same marginal costs.

(A4) If jS (i)j = jS (j )j then gt (U i ) ? gt (U ) = gt (U j ) ? gt (U ) (i.e., gt (U i) = gt (U j )) 8U V and 8t:

3 Preemptive Scheduling In this section, we consider the preemptive scheduling problem. We will assume that speeds of processors can be dierent and can vary in time according to some stochastic processes which are independent of task processing requirements, the policy and the state of the system. The speed could be 0, which would correspond to the processor being unavailable or having failed. We further assume that processors do not change the speeds in nitely often during any nite time interval. Task processing times are independent and have exponential distributions. When a task is assigned to a processor, its remaining processing time is exponentially distributed with rate if the processor has speed at that time instant. The scheduling policy is allowed to preempt tasks and idle processors when there are enabled tasks. It is also permitted to be 4

randomized, that is, to make dierent decisions at an arbitrary point in time according to some probabilities. An MS policy processes the task with the greatest marginal cost among all those enabled tasks that have the most successors on the fastest processor. There is an equivalent de nition of interval-order graphs due to Papadimitriou and Yannakakis (1979): for all i; j 2 V , either S (i) S (j ) or S (j ) S (i). Such graphs include certain series-parallel graphs. We can relabel tasks according to the MS policy, so that for i j , S (i) S (j ) 8 i; j 2 V . This will imply condition (A1) for our cost function.

Theorem 3.1 For any interval-order task graph and any arbitrary number of processors, the nonidling MS policy stochastically minimizes fgt (Ut )gt among all preemptive policies, where gt () satis es conditions (A1-A3). 1

=0

Proof. Because of the exponential processing times, decision points occur only when the state of the system changes, that is, when a processor changes speed or a task completes processing. We will assume that the number of times processors change speeds is bounded, say by B in the whole time interval. The general case (B = 1) can be obtained by a limiting argument. Our proof is by induction on the number of decision points, T , where we assume that the problem stops and no further costs are incurred after time T . The statement trivially true for T = 0. Assume that it is true for T ? 1 and consider T . Let be an arbitrary policy and let the time of the rst decision point be time 0. It is easy to see by induction that can be improved upon if it is not an MS policy after time 0, so let us assume, without loss of generality, that it does agree with MS after time 0, but disagrees with MS at time 0. We will construct a new policy ^ that agrees with MS at time 0 and has stochastically smaller cost than . Then, by induction, there is an MS policy with stochastically smaller cost than ^ . Case I: First suppose that at time 0 task i is the enabled task with the most successors and that processes task j and does not process task i at time 0 (so S (i) S (j )). (We discuss the case when idles or is randomized at time 0 later.) Let ^ process task i instead of task j at time 0, and let it otherwise agree with at time 0. Let us couple the processes of processor speed changes and the processing times on each processor so that they are the same under both policies.

Case I(a): If the event at the next decision point is not the completion of task j (resp. i)

under policy (resp. ^ ), then the states will be the same under both policies, and letting ^ agree with from that point on, the result follows by induction. 5

Case I(b): If the event at the next decision point, at time s say, is the completion of task j

(resp. i) under policy (resp. ^ ), we let ^ agree with from that point on except that ^ treats task i in the same way as treats task j . Then because of our coupling of processing times and the machine pro le, the states under the two policies will be the same at all times except for tasks i and j , and task j under policy ^ will complete at the same time as task i under policy , at time u say. Recall that S (i) S (j ) so that during time interval (s; u) no successor of i nor of j is processed under , since task i has not yet nished under . Hence policy ^ thus constructed is feasible in the sense that the precedence constraints are satis ed, since it will not be processing successors of j while it is processing task j . With this construction, the set of uncompleted tasks, and hence the cost, will be the same under both policies before time s and after time u. From condition (A3), at all times t between times s and u,

ct [ fjg) = gt(Ubt); gt (Ut ) = gt (Wt [ fig) gt (Wt [ fj g) = gt (W

ct) is the set of uncompleted where Ut (resp. Ubt ) is the set of uncompleted tasks, and Wt (resp. W tasks excluding tasks i and j , at time t under policy (resp. ^ ). Case II: Assume now that both tasks i and j (with S (i) S (j )) are processed under at time 0, but i is assigned to a processor of speed 1 and j to a processor of speed 2 with 1 < 2 . Let ^ interchange the assignments of tasks i and j and otherwise agree with at time 0. If the event at the next decision point is not the completion of one of the tasks i and j , then the states will be the same under both policies, and letting ^ agree with from that point on, the result follows by induction. If the event at the next decision point, at time s say, is the completion of one of these two tasks, then the states under the two policies will be the same at all times before s. At time s, we couple the task that completes as follows. With probability 1 =(1 + 2 ) we let task j complete at time s under both policies (case II(a)). With probability 1 =(1 + 2 ) we let task i complete at time s under both policies (case II(b)). Finally, with probability (2 ? 1 )=(1 + 2 ) we let task j (resp. i) complete at time s under policy (resp. ^ ) (case II(c)). In the rst two cases the systems are identical. In the third case, case II(c), the argument proceeds as in case I(b), by letting ^ agree with from time s on except that ^ treats task i in the same way as treats task j . As in case I(b), the set of uncompleted tasks, and hence the cost, will be the same under both policies before time s and after time u, and the cost between times s and u will be greater under than under ^ . Case III: If idles at time 0 it is easy to construct a policy ^ that does not idle at time 0 and that has stochastically smaller cost, using condition (A2). Case IV: If is randomized at time 0, so that it makes dierent decisions with given

probabilities, we can construct a randomized ^ that is coupled to the decisions of such that for each possible decision of , ^ has stochastically smaller cost. 2 6

The proof provided above is simpler than that of Liu and Sanlaville (1997) who established the optimality of MS policy for the minimization of makespan. Consider now in-forests. In this case all tasks have at most one immediate successor, and we say that a task is at level l if it has l (not necessarily immediate) successors. Thus the MS policy corresponds to HLF (highest level rst), where by highest we mean largest.

Theorem 3.2 Let there be two processors. For any in-forest task graph, a nonidling MS policy stochastically minimizes fg(Ut )gt among all preemptive policies if the cost function satis es 1

conditions (A1-A4).

=0

This theorem can be shown using the ideas of the proofs of theorem 3.1 above and theorem 4.1 in the next section. The basic outline of the proof is as in the proof of theorem 3.1, with the only dierence being that in cases I(b) and II(c) it will now be possible for to process a successor of task j during the interval (s; t). Thus, ^ may not be able to agree with during that interval. In the proof of theorem 4.1 we show how to construct ^ during (s; t), when this is the case, so that the cost is less under ^ than under . For out-forest task graphs, a vertex is called a root if it has no predecessors, and a vertex and all its successors is called a subtree. We consider two special classes of such graphs, uniform and r-uniform out-forests, introduced in Coman and Liu (1992). Let T1 and T2 be two out-trees. The out-tree T2 is said to embed the out-tree T1 if T1 is isomorphic to a subgraph of T2 . That is, there is an injective embedding function f from T1 into T2 such that 8u; v 2 T1 , if v is a successor of u then f (v) is a successor of f (u). If T2 embeds T1 and if f (r1 ) = r2 where ri is the root of Ti, then we say f is a root-embedding function. An out-forest is uniform (r-uniform) if all its subtrees can be ordered by the embedding (rootembedding) function.

Theorem 3.3 For any uniform (resp. r-uniform) out-forest task graph, a nonidling MS policy stochastically minimizes fg(Ut )gt among all preemptive policies if the cost function satis es 1

=0

conditions (A1-A3) and if there are two (resp. arbitrary number of) processors.

The above theorem can be shown using the arguments of theorem 3.1 and of Coman and Liu (1992). The detailed proof is left to the interested reader.

4 Nonpreemptive Scheduling Now we consider nonpreemptive scheduling. We suppose in this section that as soon as a task is assigned to a processor, we learn its processing time. That is, the next time we make a 7

decision about task assignment to an available processor, we know when the other processor will become available. For the case of in-forest task graphs the optimal policy will not depend on this information and so will still be optimal even when the information is not available. For out-forest and interval-order task graphs the optimal policy will use the information on when processors will become available. First consider in-forest task graphs. We assume that there are two identical processors which have the same constant speed. Following Chandy and Reynolds (1975) we say that two sets of tasks satisfy U Ue (Ue is atter than U ) if Ue has smaller size than U and has at most as many tasks as U at or above level k for all k. Note that under assumptions (A2-A4) if U Ue then g(U ) g(Ue ). We will show that the nonidling MS policy is optimal in this case. Then, since the nonidling MS policy does not use the information on time to processor availability, it will still be optimal even when that information is unknown.

Theorem 4.1 Assume there are two identical processors. For any in-forest task graph, a nonidling MS policy stochastically minimizes fg(Ut )gt among nonpreemptive policies if the cost 1

=0

function satis es conditions (A1-A4) and if processing times are i.i.d. and ILR.

Proof. For this proof when we refer to an MS policy we mean a nonidling MS policy.

Let s(i) = s1 (i) be the immediate successor to task i, let s2 (i) = s(s(i)), etc. For time t let Ut (resp. Ubt , UtMS ) be the set of uncompleted tasks for policy (resp. ^ , MS).

Our proof is by induction on the number of tasks, n. We will prove that for an arbitrary policy , we can couple the sets of uncompleted tasks under and under MS for all time t so that Ut UtMS with probability 1. The result is trivially true for 1 task. Assume that it holds for n ? 1 and consider n. Let be a policy that disagrees with MS for the rst assignment. It is easy to see by induction that can be improved upon if it does not agree with MS after the rst assignment, so let us assume that agrees with MS after the rst assignment. We will construct a new policy ^ that agrees with MS for the rst assignment and such that Ut Ubt with probability 1. Then by induction Ubt UtMS with probability 1, so the result will then follow. If is randomized, we can construct a policy ^ for each of its decisions, so let us consider just one of its decisions. Suppose that at time 0 task i is the enabled task with the most successors (is at the highest level), and that processes task j i instead of task i at time 0, on processor P1 say, and processes task i at time d > 0, on processor P2 say. Processors P1 and P2 may be the same. The case in which idles will be analyzed later. Without loss of generality because of condition (A4), assume that jS (i)j > jS (j )j. Let ^ process task i at 8

time 0 on processor P1 and task j at time d on processor P2 and let it otherwise agree with whenever possible.

We couple the processing times of the tasks other than i and j so that the kth task to be assigned under has the same processing time as the kth task to be assigned under ^ . Let T (l) (resp. Tb(l)) be the completion time of task l under policy (resp. ^ ). We couple (T (i); T (j )) with (Tb(i); Tb(j )) as follows. Let us generate a processing time, X , which is the potential processing time of the task assigned at time 0 for both policies. Let us also generate d, given X , which will be the same for both policies. Refer to gure 1 for the Gantt chart showing the coupling described below, where S is the system under and S is the system under . Case I: If X d, we let X be the actual processing time of the task assigned to processor P1 at time 0 for both policies. We also couple the processing time of the task assigned at time d so that it is the same under both policies. Note that in this case Tb(i) = T (j ) T (i) = Tb(j ). 0

0

Case II: If X > d, let R = Xd be the remaining processing time of the task assigned at

time 0, let Y be the processing time of the task assigned at time d under policy , and let Rb b Yb ) such and Yb be similarly de ned for ^ . From corollary 6.3 we can couple (R; Y ) with (R; b Yb ), max(R; Y ) = max(R;b Yb ), and Rb Y . Indeed, as shown in the that min(R; Y ) = min(R; comments after corollary 6.3, either R = Yb and Y = Rb or R = Rb Y = Yb . In the rst scenario, in which the realizations of the random variables are interchanged, T (i) = Tb(i) and T (j ) = Tb(j ) (subcases (a) and (b) in gure 1). In the second scenario, in which R = Rb Y = Yb , the task assigned at time 0 will complete before the task assigned at time d under both policies, i.e., Tb(i) = T (j ) T (i) = Tb(j ) (subcase (c) in gure 1). Thus, combining cases I and II, we have coupled our systems so that task completions occur at the same times and either the same task completes at the same time in both systems (so the interchange has no eect), or the task that is started rst completes rst. That is, either T (i) = Tb(i) and T (j ) = Tb(j ) (X > d, R = Yb , Rb = Y ; cases II(a) and (b)) or Tb(i) = T (j ) T (i) = Tb(j ) ((X d) or (X > d, R = Rb Y = Yb ); cases I and II(c)). If T (i) = Tb(i) and T (j ) = Tb(j ) then letting ^ agree with MS (and ), the sets of uncompleted tasks will be identical at all times.

If s := Tb(i) = T (j ) T (i) = Tb(j ) =: u (case II(c)) then, except for the interchanged tasks i and j , before time s we let ^ agree with , and between times s and u we let ^ agree with if possible (in the sense that precedence constraints are satis ed). We de ne ^ when it cannot agree with below. Our de nition of ^ therefore assumes that ^ can distinguish between cases II(b) and II(c) in gure 1, since in case II(b) ^ simply follows the MS policy. We can think of ^ as a randomized policy as follows. If either case II(b) or II(c) occurs, that is, when task 9

i completes task j is still being processed under ^ , policy ^ randomly decides, according to a Bernoulli variable, B , whether to follow the MS policy (case II(b), B = 1), or to follow the policy described below for case II(c) (B = 0). We couple the value of B with the case that occurs, so that B = 1 if case II(b) occurs and B = 0 if case II(c) occurs. The only eect of this coupling on ^ is that now knowing B will give ^ information on the remaining processing time of job j , but that doesn't matter since we assume that ^ already knows the remaining processing time of job j . Also ^ as described below may idle, but by induction there will then be a nonidling MS policy that has stochastically smaller cost than ^ . Since agrees with MS after the rst assignment, if ^ knows it is in case II(c), it will know what will do. In case II(c), if ^ can agree with between times s and u it will, and at time u the systems under both policies will be in the same state. Before time s and after time u the sets of uncompleted tasks under both policies will be the same. At all times t between times s and u, ct [ fjg = Ubt; Ut = Wt [ fig Wt [ fj g = W ct) is the set of uncompleted tasks excluding tasks i and j at time t under where Wt (resp. W policy (resp. ^ ). Suppose that due to precedence constraints, ^ cannot agree with at some time between times s and u in case II(c). We will construct ^ so that and ^ have task completions at the same times, and therefore have the same number of uncompleted tasks, and so that the set of uncompleted tasks is atter under ^ than under . Let r be the rst time at which ^ cannot agree with . It must be that at time r, begins processing the immediate successor to task j , s1(j ), on processor P1 say. Since agrees with MS after time 0, task s1 (j ) must be the enabled task with the most successors (at the highest level) under . 
In particular, task $i$ must be in progress under $\pi$ on the other processor, call it P2, and all tasks at the level of task $i$ or higher must have completed (because there are at most two processors and one is available to process task $s_1(j)$). Since $\hat\pi$ has agreed with $\pi$ until time $r$ except for the interchange of tasks $i$ and $j$, and since task $i$ has completed under $\hat\pi$, the immediate successor to task $i$, $s_1(i)$, must be enabled under $\hat\pi$. We let $\hat\pi$ process task $s_1(i)$ on P1 while $\pi$ is processing task $s_1(j)$, so the processing time of task $s_1(i)$ under $\hat\pi$ equals the processing time of task $s_1(j)$ under $\pi$. Refer to figure 2. Recall that $s_1(i)$ has more successors than $s_1(j)$, i.e., $s_1(i) < s_1(j)$, because $i$ has more successors than $j$. If $\hat\pi$ cannot agree with $\pi$ again between times $r$ and $u$ then it must be when $\pi$ processes $s_2(j)$, in which case $\hat\pi$ will process $s_2(i)$, and $s_2(i) < s_2(j)$. We can continue this argument until time $u$. Thus, between times $s$ and $u$, $\hat U_t$ is flatter than $U_t$.

Suppose that $\pi$ (resp. $\hat\pi$) assigns $k$ successors of task $j$ (resp. $i$) on processor P1 between times $s$ and $u$. After time $u$ we let $\hat\pi$ process task $s_l(j)$ whenever $\pi$ processes task $s_l(i)$, $l = 1, \ldots, k$, and otherwise agree with $\pi$. Recall that $T(l)$ (resp. $\hat T(l)$) is the completion time of task $l$ under policy $\pi$ (resp. $\hat\pi$), and except for the first $k$ successors of $i$ and $j$, $T(l) = \hat T(l)$. If $T(s_k(j)) = \hat T(s_k(i)) \le u$ then $T(s_l(j)) = \hat T(s_l(i)) \le T(s_l(i)) = \hat T(s_l(j))$ for $l = 1, \ldots, k$, and so for $t \ge s$, $\hat U_t$ is flatter than $U_t$ and we are done. If task $s_k(j)$ (resp. $s_k(i)$) is still in progress on processor P1 at time $u$ under $\pi$ (resp. $\hat\pi$), then, as long as P1 remains busy with task $s_k(j)$ (resp. $s_k(i)$), $\pi$ (resp. $\hat\pi$) will assign tasks $s_1(i), s_2(i), \ldots, s_k(i)$ (resp. $s_1(j), s_2(j), \ldots, s_k(j)$), in that order, to processor P2. Refer again to figure 2. If $s_k(j)$ (resp. $s_k(i)$) completes on P1 before task $s_k(i)$ (resp. $s_k(j)$) is assigned to P2 under policy $\pi$ (resp. $\hat\pi$), then we again have that $\hat U_t$ is flatter than $U_t$ for $t \ge u$. Otherwise, suppose at time $v$ policy $\pi$ (resp. $\hat\pi$) assigns task $s_k(i)$ (resp. $s_k(j)$) to P2 and task $s_k(j)$ (resp. $s_k(i)$) has been in progress since time $u$ on P1 and has not yet completed. Then we can do the same coupling construction that we did for the processing times of tasks $i$ and $j$ under policies $\pi$ and $\hat\pi$ in case II, using lemma 6.2, so that either $T(s_k(i)) = \hat T(s_k(i))$ and $T(s_k(j)) = \hat T(s_k(j))$, or $\hat T(s_k(i)) = T(s_k(j)) \le T(s_k(i)) = \hat T(s_k(j))$. (The latter case is shown in figure 2.) In both cases we again have that $\hat U_t$ is flatter than $U_t$ for $t \ge u$.

Now we show that the optimal policy is nonidling. Suppose a processor, say processor 1, is idle and available at time 0, but that an MS policy $\pi$ does not assign task $i$ until some time $d > 0$. If processor 2 is unavailable or processing a task with fewer successors than $i$ at time 0 then task $i$ will be the first task to be assigned since $\pi$ is an MS policy. If processor 2 is processing a task with more successors than $i$ at time 0 then when that task completes it will enable at most one task with more successors than $i$ because of the in-forest structure of precedences.
This new enabled task, and similarly any successors that are assigned before time $d$, can therefore be assigned to processor 2, and we may assume, without loss of generality, that processor 1 remains idle until time $d$, at which time $\pi$ assigns task $i$ to processor 1. Let $\hat\pi$ assign task $i$ to processor 1 at time 0, and let it agree with $\pi$ for assignments to processor 2 until time $d$. Let $X$ be the processing time of task $i$ under both policies, and let $\hat\pi$ idle processor 1 from time $X$ to time $d + X$, and let it otherwise agree with $\pi$. Then at time $d + X$ the systems under both policies will be in the same state, and before that time $\hat\pi$ will have smaller cost by assumption (A2). By the induction hypothesis there will be a nonidling MS policy with smaller cost than policy $\hat\pi$, so we are done. □

When we have out-forests or interval-order task graphs and preemption is not permitted it may be desirable to idle. For example, consider the interval-order task graph of figure 3. If task 1 is being processed and another processor is available, we may prefer to wait until task 1 has finished and then assign tasks 2 and 3 to processors, rather than assigning task 6 now. This is especially true if the other processor will become available soon. Using the basic approach of the proof above, we can show for out-forests and interval-order task graphs that whenever it is decided not to idle, a task should be assigned according to the MS policy, assuming that we know the processing time of a task as soon as it is assigned to a processor. That is, an idling MS policy is optimal. However, this policy, and in particular the decision on whether or not to idle at a certain time, will depend on the time at which other processors become available. Therefore, unlike the out-forest case above, we cannot conclude that an idling MS policy is still optimal when we don't have information on when processors will become available.
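The selection step of the MS rule discussed above (which enabled task to start once it has been decided not to idle) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the `succs` dictionary encoding of the task graph, the integer task labels, and the tie-break by smallest label are all assumptions made for the example. The decision of whether to idle is not modeled.

```python
def all_successors(task, succs):
    """Return the set of all (not only immediate) successors of `task`.

    `succs` is a hypothetical encoding: task -> list of immediate successors.
    """
    seen = set()
    stack = list(succs.get(task, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(succs.get(t, ()))
    return seen


def ms_choice(uncompleted, in_progress, succs):
    """Pick an enabled task with the most successors, or None if none is enabled.

    `uncompleted` includes the tasks in `in_progress`.
    """
    # Any successor of an uncompleted task still has an uncompleted predecessor.
    blocked = set()
    for t in uncompleted:
        blocked |= all_successors(t, succs)
    enabled = sorted(
        t for t in uncompleted if t not in blocked and t not in in_progress
    )
    if not enabled:
        return None
    # max() keeps the first maximal element, so ties go to the smallest label.
    return max(enabled, key=lambda t: len(all_successors(t, succs)))


# Hypothetical out-forest: task 1 precedes 2 and 3, task 2 precedes 4, task 6 is isolated.
example_succs = {1: [2, 3], 2: [4], 3: [], 4: [], 6: []}
choice_while_1_runs = ms_choice({1, 2, 3, 4, 6}, {1}, example_succs)
choice_after_1_done = ms_choice({2, 3, 4, 6}, set(), example_succs)
```

In this made-up graph, only task 6 is enabled while task 1 is running, whereas after task 1 completes the rule prefers task 2, which has one successor. This mirrors the trade-off described in the text between assigning an isolated task now and waiting for tasks with more successors to become enabled.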

Theorem 4.2 For any interval-order task graph and an arbitrary number of identical processors, and for any uniform (resp. r-uniform) out-forest task graph and two (resp. an arbitrary number of) processors, an idling MS policy stochastically minimizes $\{g(U_t)\}_{t=0}^{\infty}$ among nonpreemptive policies if the cost function satisfies conditions (A1-A3) and if processing times are i.i.d. and ILR and the times at which processors will next become available are known.

Proof. The basic outline of the proof is as in the proof of theorem 4.1, with $\pi$ and $\hat\pi$ as defined there. Again, because of the ILR processing times, we can couple the processing times of the interchanged tasks $i$ and $j$ so that either the interchange has no effect or the task that is started first completes first. If the interchange has no effect then both policies will be in the same state and we are done. If the task that is started first completes first, we can argue as we did in case I(b) in the proof of theorem 3.1 to get the result for interval-order task graphs. The result for out-forest task graphs follows along the lines of the proof of Coffman and Liu (1992). □

5 Acknowledgement

The presentation of our results has benefitted greatly from helpful feedback from the referee.

6 Appendix: Likelihood Ratio Ordering and ILR random variables

We say that the r.v. $X$ is larger than the r.v. $Y$ in the stochastic ordering sense, $X \ge_{st} Y$, if $F(t) \le G(t)$ for all $t$, where $F(t)$ and $G(t)$ are the cumulative distribution functions of $X$ and $Y$, respectively. We say that $X$ is larger than $Y$ in the likelihood ratio ordering, $X \ge_{lr} Y$, if $f(t)/g(t)$ is increasing in $t$, where $f(t)$ and $g(t)$ are the densities or probability mass functions of $X$ and $Y$, respectively. We have the following results. For lemma 6.1, see e.g., Ross (1983), and for lemma 6.2 and corollary 6.3 see Righter (1994).

Lemma 6.1 $X \ge_{st} Y$ if and only if there are $X'$ and $Y'$ defined on the same probability space such that $X$ and $X'$ (resp. $Y$ and $Y'$) have the same distribution and that $X' \ge Y'$ almost surely (a.s.).
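One standard way to realize the coupling of lemma 6.1 is to drive both variables with the same uniform draw through their inverse distribution functions. The sketch below does this for two exponential variables; the choice of exponential distributions, the rates, and the function name `coupled_exponentials` are assumptions made only for illustration.

```python
import math
import random


def coupled_exponentials(rate_x, rate_y, n, seed=0):
    """Sample n coupled pairs (x, y) with x ~ Exp(rate_x) and y ~ Exp(rate_y).

    Both coordinates use the same uniform draw (inverse-CDF coupling), so
    when rate_x <= rate_y every pair satisfies x >= y.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()                  # uniform on [0, 1)
        base = -math.log(1.0 - u)         # standard exponential draw, >= 0
        pairs.append((base / rate_x, base / rate_y))
    return pairs


# Exp(1) is stochastically larger than Exp(2); the coupled copies are ordered pathwise.
pairs = coupled_exponentials(1.0, 2.0, 1000)
ordered_everywhere = all(x >= y for x, y in pairs)
```

Each coordinate keeps its own marginal distribution, while the shared uniform makes the ordering hold on every sample path rather than only in distribution.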

Lemma 6.2 Assume $X$ and $Y$ are independent. $X \ge_{lr} Y$ if and only if $(X \mid \min(X, Y) = m, \max(X, Y) = M) \ge_{st} (Y \mid \min(X, Y) = m, \max(X, Y) = M)$ for all $m \le M$.

In other words, given $m = \min(X, Y)$ and $M = \max(X, Y)$, we have that $X \ge_{lr} Y$ if and only if $P\{X = m \mid m, M\} = P\{Y = M \mid m, M\} \le P\{X = M \mid m, M\} = P\{Y = m \mid m, M\}$.

Let $X_t = (X - t \mid X > t)$. We say that $X$ is increasing in likelihood ratio (ILR) if $X_s \ge_{lr} X_t$ for all $s \le t$. Exponentially distributed random variables are, of course, ILR (and DLR). A random variable that is ILR is also IHR (increasing in hazard rate). We have the following corollary, which follows from lemma 6.2.

Corollary 6.3 Assume $X^1$ and $X^2$ are i.i.d. and ILR, and $\hat X^1$ and $\hat X^2$ are i.i.d. with the same distributions as $X^1$ and $X^2$. Then we can couple $(X_t^1, X^2)$ with $(\hat X_t^1, \hat X^2)$ so that $\min(X_t^1, X^2) = \min(\hat X_t^1, \hat X^2)$, $\max(X_t^1, X^2) = \max(\hat X_t^1, \hat X^2)$, and $\hat X_t^1 \le X^2$.
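The conditional probabilities in lemma 6.2 can be written out explicitly. The following worked step is a sketch under the added assumption that $X$ and $Y$ are independent with densities $f$ and $g$ and that $m < M$; it is meant only to show why the likelihood ratio ordering makes the larger variable more likely to be the maximum.

```latex
\[
P\{X = M \mid m, M\} = \frac{f(M)\, g(m)}{f(M)\, g(m) + f(m)\, g(M)},
\qquad
P\{X = m \mid m, M\} = \frac{f(m)\, g(M)}{f(M)\, g(m) + f(m)\, g(M)} .
\]
% X >=_lr Y means f(t)/g(t) is increasing in t, so for m < M:
\[
\frac{f(M)}{g(M)} \ge \frac{f(m)}{g(m)}
\;\Longleftrightarrow\;
f(M)\, g(m) \ge f(m)\, g(M)
\;\Longleftrightarrow\;
P\{X = m \mid m, M\} \le P\{X = M \mid m, M\} .
\]
```

Applied to the pair $(X_t^1, X^2)$, the ILR property gives $X^2 \ge_{lr} X_t^1$, so $X_t^1$ is the maximum with conditional probability at most $1/2$.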

Let $p = P\{X_t^1 = M \mid \min(X_t^1, X^2) = m, \max(X_t^1, X^2) = M\}$. Then our corollary says that given $m = \min(X_t^1, X^2) = \min(\hat X_t^1, \hat X^2)$ and $M = \max(X_t^1, X^2) = \max(\hat X_t^1, \hat X^2)$, we can further couple $(X_t^1, X^2)$ with $(\hat X_t^1, \hat X^2)$ so that one of the following three cases occurs.

$\hat X_t^1 = X^2 = m$ and $\hat X^2 = X_t^1 = M$, with probability $p$,

$\hat X_t^1 = X^2 = M$ and $\hat X^2 = X_t^1 = m$, with probability $p$,

$\hat X_t^1 = X_t^1 = m$ and $\hat X^2 = X^2 = M$, with probability $1 - 2p$.

Thus, our random variables can be coupled so that either $\hat X_t^1 = X^2$ and $\hat X^2 = X_t^1$, or $\hat X_t^1 = X_t^1 \le \hat X^2 = X^2$.
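The three-case coupling above can be sampled directly once $m$, $M$ and $p$ are given. The sketch below is illustrative only: the function name `couple_given_min_max`, the tuple layout, and the numerical values of `m`, `M` and `p` are assumptions, and `p` is treated as a known input rather than computed from a distribution.

```python
import random


def couple_given_min_max(m, M, p, rng):
    """Sample one coupled outcome given the common min m and max M.

    Returns ((x1_t, x2), (xhat1_t, xhat2)), where p is the conditional
    probability that x1_t is the maximum; the construction needs p <= 1/2.
    """
    if not (m <= M and 0.0 <= p <= 0.5):
        raise ValueError("need m <= M and 0 <= p <= 1/2")
    u = rng.random()
    if u < p:
        # Case 1: xhat1_t = x2 = m and xhat2 = x1_t = M.
        return (M, m), (m, M)
    if u < 2.0 * p:
        # Case 2: xhat1_t = x2 = M and xhat2 = x1_t = m.
        return (m, M), (M, m)
    # Case 3: xhat1_t = x1_t = m and xhat2 = x2 = M.
    return (m, M), (m, M)


rng = random.Random(1)
samples = [couple_given_min_max(1.0, 2.5, 0.3, rng) for _ in range(2000)]
```

In every sampled outcome the two pairs share the same minimum and maximum and satisfy $\hat X_t^1 \le X^2$; the pairs are either swapped (cases 1 and 2) or identical (case 3), which is the dichotomy used in the interchange arguments.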

References

[1] J. Bruno (1985). On scheduling tasks with exponential service times and in-tree precedence constraints. Acta Informatica. 22: 139-148.
[2] K.M. Chandy and P.F. Reynolds (1975). Scheduling partially ordered tasks with probabilistic execution times. Operating System Review. 9: 169-177.
[3] E.G. Coffman, Jr. and Z. Liu (1992). On the optimal stochastic scheduling of out-forests. Opns. Res. 40: S67-S75.
[4] E. Frostig (1988). A stochastic scheduling problem with intree precedence constraints. Opns. Res. 36: 937-943.
[5] V.G. Kulkarni and P.F. Chimento, Jr. (1992). Optimal scheduling of exponential tasks with in-tree precedence constraints on two parallel processors subject to failure and repair. Opns. Res. 40: S263-271.
[6] Z. Liu and E. Sanlaville (1997). Stochastical scheduling with variable profile and precedence constraints. To appear in SIAM J. on Computing, 26 (1).
[7] C.H. Papadimitriou and J.N. Tsitsiklis (1987). On stochastic scheduling with in-tree precedence constraints. SIAM J. Comp. 16: 1-6.
[8] C.H. Papadimitriou and M. Yannakakis (1979). Scheduling interval-ordered tasks. SIAM J. Comp. 8: 405-409.
[9] M. Pinedo and G. Weiss (1985). Scheduling jobs with exponentially distributed processing times and intree precedence constraints on two parallel machines. Opns. Res. 33: 1381-1388.
