International Conference on Mathematical and Statistical Modeling in Honor of Enrique Castillo. June 28-30, 2006

Stochastic Online Scheduling

Deniz TÜRSEL ELİİYİ* and Umay UZUNOĞLU KOÇER
Department of Statistics, Dokuz Eylül University

Abstract

Scheduling is a crucial phase of operations control in both manufacturing and service industries. With increased emphasis on time-to-market and on customer satisfaction, efficient scheduling has attracted considerable interest from operations researchers. The main challenge in tackling real-life problems is to develop a mathematical model that reflects the system as closely as possible. Stochastic scheduling and online scheduling have been the two major approaches in the literature for coping with uncertainty. Stochastic scheduling models, in which job processing times are assumed to be random, have been addressed mainly since the 1980s. In online scheduling, which is encountered especially in service systems, the arrival times of future jobs are not known in advance; however, once a job arrives, its processing time is disclosed with certainty. In this study, we focus on a model that generalizes both stochastic scheduling and online scheduling. We assume that the jobs arrive online. Once a job arrives, its expected processing time is revealed, but the actual processing time remains unknown until the job is completed. The generalized problem is important since it can be used to model practical problems. We model the problem, review the related literature, and identify application areas. We also review some solution procedures that can be used to solve this problem.

Key Words: Stochastic Scheduling, Online Scheduling.

* Correspondence to: Deniz Türsel Eliiyi. Department of Statistics. Dokuz Eylül University. Turkey. [email protected]

1. INTRODUCTION

Scheduling deals with the allocation of scarce resources such as machines in a workshop or crew in an organization. Scheduling determines to which machine a part will be routed for processing, which worker will operate a machine that produces a part, and the order in which the parts are to be processed. Scheduling also determines which patient to assign to an operating room, which doctors and nurses are to care for a patient during certain hours of the day, the order in which a doctor is to see patients, and when meals should be delivered or medications dispensed. In other words, scheduling is a decision-making process whose goal is to optimize an objective, and it arises in both manufacturing and service industries. There are many possible objectives, including meeting customer due dates; minimizing job lateness, the completion time of the last task, idle time or work-in-process inventory; and maximizing machine or labor utilization.

Scheduling in a manufacturing environment is difficult because jobs arrive at varying time intervals, require different resources and sequences of operations, and are due at different times. Scheduling in the service industry is also difficult because of the variety of options available and the special requirements of workers. Beyond these difficulties, two major themes make modeling real-world circumstances challenging.

The first is randomness. Production settings in the real world are subject to many sources of randomness, such as machine breakdowns or unexpected releases of high-priority jobs. Processing times that are not known in advance are another source of randomness. Stochastic scheduling problems have been used to deal with this randomness. In stochastic scheduling, the concern is to allocate a limited amount of resources to a set of jobs that need to be serviced. Stochastic scheduling problems occur in a variety of practical situations, such as manufacturing and construction.

The second theme is online decision making. Many decisions in daily life are made online. Particularly in the service business, there are many examples in which the assignment of workers to jobs must be made online. For instance, when a customer of a machine repair center makes a request for repair, the choice of the technician to assign to this customer is an online decision. Such decisions are handled through online scheduling models.

The range of stochastic scheduling problems is very large and diverse. A typical stochastic model involves the scheduling of a set of jobs on identical parallel machines while minimizing a given performance measure. Job processing times are assumed to be random variables that follow some probability distribution. Various solution methodologies could be used for stochastic scheduling problems. A somewhat naive approach is to take the expected processing times and use the algorithms for the deterministic problems. Unfortunately, it is easy to construct counterexamples showing that this approach can produce poor solutions. Stochastic scheduling problems in which both the number of jobs and the number of identical parallel machines are fixed can be expressed as a Markov decision process (MDP) and can therefore be tackled by dynamic programming methods.

The purpose of this paper is to generalize stochastic scheduling and online scheduling models. We assume that jobs arrive online. Once a job arrives, its expected processing time is revealed, but the actual processing time remains unknown until the job is completed. Since many practical situations are both stochastic and online, the generalized problem is of considerable importance. In Section 2, we present a literature review concerning stochastic scheduling and online scheduling. The problem definition is introduced in Section 3. The solution techniques that may be used for this problem are discussed in Section 4. Application areas and concluding remarks constitute Section 5 and Section 6, respectively.

2. LITERATURE REVIEW

In this section, we briefly review the literature on the two frameworks developed to deal with uncertainty about the future in scheduling theory, namely stochastic scheduling and online scheduling.

2.1. Stochastic Scheduling

In stochastic scheduling, the population of jobs is assumed to be known beforehand, whereas the processing times of jobs are random variables. Stochastic scheduling models have been addressed mainly since the 1980s. Rothkopf (1966) shows that for one machine without precedence constraints, an index rule minimizes the expected sum of weighted completion times for arbitrary processing time probability distributions. Möhring, Radermacher and Weiss (1984, 1985) study the analytic properties of various classes of scheduling policies, and determine optimal policies for special cases. Weiss (1990) and Weiss (1992) derive additive performance bounds for the stochastic parallel machine model without release dates, P||E[∑ w_j C_j]. In addition, Möhring et al. (1999) develop approximation algorithms for a variety of stochastic scheduling problems using techniques from combinatorial optimization. The authors examine the power of linear programming based priority policies, and compare their performance to the expected performance of an optimal stochastic scheduling policy. Bertsekas and Castanon (1999) focus on a different class of stochastic scheduling problems, i.e., the quiz problem and its variations. They show how rollout algorithms can be implemented efficiently, and present experimental evidence that the performance of rollout policies is near-optimal. Koole (2000) applies event-based dynamic programming to stochastic scheduling problems. Very recently, Skutella and Uetz (2005) derived approximation policies for stochastic machine scheduling with precedence constraints. However, these papers do not deal with the situation where jobs arrive online.


2.2. Online Scheduling

In online scheduling, no information is available about the jobs that will arrive in the future. However, once a job becomes known, its weight w_j and its actual processing time p_j are revealed. In the literature, competitive ratios are generally used to assess the quality of online algorithms. A ρ-competitive algorithm provides, for any instance, a solution with an objective function value of at most ρ times the value of an optimal offline solution. Examples are the studies by Sleator and Tarjan (1985) and Karlin et al. (1988).

There are two different types of online scheduling models. In the online-time model, the jobs become known upon their release dates r_j, which resembles the arrivals in a queuing system. In the online-list model, all jobs are present in the system at time zero; however, the current job on the list has to be scheduled immediately, before the information on the next job can be seen. Borodin and El-Yaniv (1998) and Pruhs et al. (2004) provide further details on the types of online scheduling problems.

In the online-time model, Anderson and Potts (2004) provide a 2-competitive online algorithm for the single-machine problem 1|r_j|∑ C_j. Lu et al. (2003) also introduced a class of 2-competitive algorithms for the online variant of 1|r_j|∑ C_j. This ratio is the best possible for a deterministic online algorithm, since Hoogeveen and Vestjens (1996) prove a lower bound of 2 on the competitive ratio of any deterministic online algorithm. In the online-list model, Fiat and Woeginger (1999) show that the single-machine problem admits no deterministic or randomized online algorithm with a competitive ratio of log n, where n is the number of jobs. Phillips et al. (1998) presented another online algorithm for 1|r_j|∑ C_j, which converts a preemptive schedule into a nonpreemptive one with an objective function value at most twice that of the preemptive schedule. Since Schrage's shortest remaining processing time (SRPT) rule (1968) works online and produces an optimal preemptive schedule for the single-machine problem, it follows that the algorithm of Phillips, Stein and Wein has competitive ratio 2 as well.

For settings with parallel machines, Vestjens (1997) proves a lower bound of 1.309 on the competitive ratio of any deterministic online algorithm for P|r_j|∑ w_j C_j. The best deterministic algorithm developed for this problem is 2.62-competitive, proposed by Correa and Wagner (2005). The currently best-known randomized algorithm for the problem has an expected competitive ratio of 2 (Schulz and Skutella, 2002b). Schulz and Skutella (2002a) and Goemans et al. (2002) give comprehensive reviews of the development of online algorithms for the preemptive and nonpreemptive single-machine problems, respectively; Hall et al. (1997) do the same for the parallel machine case.
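As a small illustration of the SRPT rule mentioned in this subsection, the following Python sketch simulates preemptive single-machine scheduling with release dates and returns the total completion time. The function name `srpt_total_completion` and the `(release, processing)` tuple format are assumptions made for this illustration only; they are not taken from Schrage (1968) or from the other cited papers.

```python
import heapq

def srpt_total_completion(jobs):
    """Total completion time under SRPT on one machine.

    jobs: list of (release_date, processing_time) pairs with known,
    positive processing times (the deterministic online-time setting).
    """
    jobs = sorted(jobs)          # process arrivals in order of release date
    heap = []                    # remaining processing times of released jobs
    t, i, total = 0.0, 0, 0.0
    n = len(jobs)
    while i < n or heap:
        if not heap:
            # Machine is idle: jump to the next release date.
            t = max(t, jobs[i][0])
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, jobs[i][1])
            i += 1
        # Run the job with the shortest remaining time until it
        # finishes or the next job is released, whichever comes first.
        remaining = heapq.heappop(heap)
        next_release = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_release - t)
        t += run
        if run < remaining:
            heapq.heappush(heap, remaining - run)   # preempted
        else:
            total += t                              # job completed at time t
    return total

example = [(0, 5), (1, 2), (2, 1)]
print(srpt_total_completion(example))  # 15.0
```

In the example, the long job released at time 0 is preempted by the two shorter jobs, which complete at times 3 and 4, and it finally completes at time 8.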

2.3. Stochastic Online Scheduling

Stochastic online scheduling has only recently received attention from researchers. A model that combines the features of stochastic and online scheduling has been considered by Chou et al. (2005). They prove asymptotic optimality of the online WSEPT (weighted shortest expected processing time) rule for the single-machine problem 1|r_j|∑ w_j C_j, assuming that the weights and the processing times can be bounded from above and below by constants. Megow et al. (2005) consider the parallel machine models P||E[∑ w_j C_j] and P|r_j|E[∑ w_j C_j], where the objective is the expected total weighted completion time of the jobs. They propose a simple online scheduling policy for the first model, and prove a performance guarantee that matches the currently best-known performance guarantee for stochastic parallel machine scheduling. They derive an analogous result for the second model.
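The WSEPT index can be illustrated with a short Python sketch. It covers only the simplest setting, a single machine with all jobs available at time zero, where a fixed sequence has expected cost computable by linearity of expectation. The class `Job` and the functions `wsept_order` and `expected_weighted_completion` are hypothetical names introduced here, and the sketch is not the online policy analyzed by Chou et al. (2005) or Megow et al. (2005).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    name: str
    weight: float         # w_j
    expected_time: float  # E[P_j], the only processing-time information used

def wsept_order(jobs):
    """Sequence jobs by non-increasing ratio w_j / E[P_j]."""
    return sorted(jobs, key=lambda j: j.weight / j.expected_time, reverse=True)

def expected_weighted_completion(sequence):
    """E[sum w_j C_j] of a fixed sequence on one machine without idle time.

    By linearity of expectation, E[C_j] is the sum of the expected
    processing times of job j and of all jobs sequenced before it.
    """
    t = 0.0
    total = 0.0
    for job in sequence:
        t += job.expected_time
        total += job.weight * t
    return total

jobs = [Job("a", 1.0, 4.0), Job("b", 3.0, 2.0), Job("c", 2.0, 2.0)]
order = wsept_order(jobs)
print([j.name for j in order])              # ['b', 'c', 'a']
print(expected_weighted_completion(order))  # 22.0
print(expected_weighted_completion(jobs))   # 38.0 (arrival order a, b, c)
```

The ratios are 0.25, 1.5 and 1.0 for jobs a, b and c, so the sequence b, c, a is chosen, and its expected cost of 22 compares with 38 for the sequence a, b, c.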

3. PROBLEM DEFINITION

A set J = {1, ..., n} of jobs with nonnegative weights w_j, j ∈ J, is given. Note, however, that the number of jobs n is not known in advance. The release date r_j of job j denotes the earliest point in time at which job j can be started. There are m identical parallel machines, and each job must be processed on one of the machines in a non-preemptive manner; that is, once a job has started, it can be neither interrupted nor moved to another machine. Each machine can process only one job at a time. The objective is to find a schedule that minimizes the total weighted completion time ∑_{j∈J} w_j C_j, where C_j denotes the completion time of job j.

P_j denotes the random variable representing the processing time of job j. Also, let E[P_j] be the expected processing time of job j, and p_j a particular realization of P_j. The random variables P_j are assumed to be independent. The main characteristic of a stochastic scheduling model is that the processing times of jobs are subject to random fluctuations and become known only upon completion of the jobs. Stochastic scheduling therefore requires the notion of a scheduling policy rather than a simple schedule. Möhring et al. (1984) define such policies. The expected outcome of a given stochastic scheduling policy is compared against the expected outcome of an optimal scheduling policy.

The definition of a stochastic online scheduling policy should extend the traditional definition of stochastic scheduling policies to the setting where jobs arrive online. We assume that the jobs are indexed in chronological order of their release dates, i.e., r_j ≤ r_k for j < k, where j, k ∈ J. When job j arrives at time r_j, its weight and its expected processing time are revealed. There is no knowledge about jobs that might appear in the future, or about their number. The assignments of jobs to machines cannot be revised later. Once the jobs have been assigned to the machines, the schedule on each machine can be modified independently in a nonpreemptive manner. In this model of stochastic online scheduling, the goal is to find a policy that minimizes the expected total weighted completion time of the jobs, E[∑_{j∈J} w_j C_j] (Megow et al., 2005).

A scheduling policy should specify an action at any decision time t. An action consists of a set of jobs that are started at time t (each scheduled to start processing on one of the parallel machines at that time) and a next decision time t_1 > t at which the next action is taken. However, if a new job is released or the processing of a job is completed at a time t_2 earlier than t_1, then the next decision time becomes t_2.

The complete information contained in the partial schedule up to time t can be used by a stochastic online scheduling policy, since that information has already been revealed to the decision maker. The same holds for the information about the unscheduled jobs that have arrived by time t (jobs with r_j ≤ t). However, information not yet revealed at time t cannot be used. This includes knowledge about jobs that will be released in the future and the actual processing times of jobs that have not yet been completed, whether or not they have been started.

An optimal scheduling policy is a policy that minimizes the objective function, defined as the expected total weighted completion time. The optimal policy against which an online policy is compared need not be online; it may use information about jobs that will be released in the future. However, it must be non-anticipatory in the sense that it cannot use information on the actual processing times of jobs that have not yet been completed. Hence, even an optimal scheduling policy may fail to yield an optimal schedule for every realization of the processing times.

For any problem instance I, let S_j^π(I) and C_j^π(I) denote the random variables for the starting and completion times of job j under policy π; we write S_j^π and C_j^π for short. Then the expected performance of a scheduling policy π on instance I is

\[ E[\pi(I)] = \sum_{j \in J} w_j \, E[C_j^{\pi}(I)] \]

This stochastic online scheduling model can be considered a two-phase model. The first phase consists of an online assignment of jobs to machines. The second phase consists of the actual scheduling of the jobs over time, with processing times realized according to their respective distributions.
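The two-phase view can be made concrete with a small Monte Carlo sketch in Python that estimates the expected performance E[π(I)] of one particular policy π. Everything specific in it is an assumption chosen for illustration: the rule that assigns an arriving job to the machine with the least assigned expected load, the use of the WSEPT ratio among released jobs on each machine, the exponential processing times, and the names `Job`, `assign_online`, `simulate_machine` and `estimate_expected_cost`. It is not the policy analyzed by Megow et al. (2005).

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    weight: float   # w_j
    release: float  # r_j
    mean: float     # E[P_j], revealed when the job arrives

def assign_online(jobs, m):
    """Phase 1: fix each job's machine on arrival, using only E[P_j].

    Illustrative rule: pick the machine with the least assigned expected load.
    """
    load = [0.0] * m
    queues = [[] for _ in range(m)]
    for job in sorted(jobs, key=lambda j: j.release):
        i = min(range(m), key=lambda k: load[k])
        load[i] += job.mean
        queues[i].append(job)
    return queues

def simulate_machine(queue, rng):
    """Phase 2 on one machine: non-preemptive, WSEPT among released jobs."""
    pending = list(queue)
    t = 0.0
    cost = 0.0
    while pending:
        released = [j for j in pending if j.release <= t]
        if not released:
            t = min(j.release for j in pending)   # wait for the next arrival
            continue
        job = max(released, key=lambda j: j.weight / j.mean)
        pending.remove(job)
        # The actual processing time is realized only while the job runs
        # (assumed exponential with mean E[P_j]).
        t += rng.expovariate(1.0 / job.mean)
        cost += job.weight * t
    return cost

def estimate_expected_cost(jobs, m, runs=2000, seed=0):
    """Sample average of sum w_j C_j over independent realizations."""
    rng = random.Random(seed)
    queues = assign_online(jobs, m)
    total = 0.0
    for _ in range(runs):
        total += sum(simulate_machine(q, rng) for q in queues)
    return total / runs

jobs = [Job(2.0, 0.0, 3.0), Job(1.0, 0.0, 1.0),
        Job(3.0, 1.0, 2.0), Job(1.0, 2.0, 4.0)]
print(estimate_expected_cost(jobs, m=2))
```

The assignment phase never looks at realized processing times or at jobs that have not yet arrived, so the sketch respects the information restrictions described above; only the simulation loop draws the realizations.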

4. POSSIBLE SOLUTION APPROACHES

Almost all interesting scheduling problems are computationally intractable, so near-optimal or approximate solutions have to be sought. In a stochastic scheduling problem, one solution approach is to replace the processing times by their expectations and to use algorithms for the deterministic problem. However, many examples show that this approach can perform poorly.

When the aim is to characterize models in a way that clarifies their dynamic properties, dynamic programming provides suitable techniques. Dynamic programming is a general approach to problem solving, and the particular equations must be developed to fit each individual situation. Stochastic scheduling problems in which both the number of jobs and the number of identical parallel machines are fixed can be expressed as a Markov decision process (MDP) and can therefore be tackled by dynamic programming methods. In online scheduling, however, applying the Markov approach is not easy, as the number of jobs is not known in advance. The stochastic online scheduling problem can be formulated as a finite-horizon MDP with a finite state space only if the total number of jobs to be scheduled is assumed to be known. If we assume there is only one machine, the state of the system is characterized by the set of jobs still to be processed and the maximum completion time of the jobs completed so far. For each state there is a set of available actions; an action may be defined as the job to be scheduled next in that state.

However, the problem with dynamic programming or MDP formulations is that the state space grows exponentially with the number of jobs. Even in deterministic scheduling problems, the number of states may be too large, which makes the approach exponential in complexity. In stochastic scheduling problems, the uncertainty makes the state space even larger. In this case, a solution may be obtained by a linear programming approximation to dynamic programming. For example, consider a single-machine problem with an additive objective function. The problem can be restructured as a finite-horizon Markov decision problem, or as a stochastic shortest path problem, which can be solved approximately by the approximate linear programming approach.

Among the basic approximate methods, we usually distinguish between constructive methods and local search methods. Constructive algorithms generate solutions from scratch by adding components to an initially empty partial solution until a solution is complete. They are typically the fastest approximate methods, yet they often return solutions of inferior quality compared to local search algorithms. Local search algorithms start from some initial solution and iteratively try to replace the current solution by a better one in an appropriately defined neighborhood of the current solution.

Simple heuristic procedures that are common in simpler scheduling settings can be adapted to the stochastic online scheduling problem. For example, Megow et al. (2005) consider the non-preemptive parallel machine version of the problem with the objective of minimizing the expected total weighted completion time of the jobs. The authors analyze simple online scheduling policies for the model. In particular, they apply a modified version of the WSEPT rule, the best-known constructive scheduling policy in stochastic scheduling. They obtain performance guarantees that match those previously known for stochastic and online parallel machine scheduling.

Due to the complex nature of the problem, metaheuristic approaches could also be used to obtain fast approximate solutions for stochastic online scheduling. Metaheuristics are widely used to solve important practical combinatorial optimization problems. A metaheuristic is an iterative master process that guides and modifies the operations of subordinate heuristics to efficiently produce high-quality solutions (Voß et al., 1999). It may manipulate a complete (or incomplete) single solution or a collection of solutions at each iteration. The subordinate heuristics may be high (or low) level procedures, a simple local search, or just a construction method. Examples of metaheuristics include simulated annealing, tabu search, genetic algorithms, iterated local search, evolutionary algorithms, and ant colony optimization. In short, metaheuristics are high-level strategies for exploring search spaces by using different methods.

It is important to maintain a dynamic balance between diversification and intensification. The term diversification generally refers to the exploration of the search space, whereas the term intensification refers to the exploitation of the accumulated search experience. These terms stem from the tabu search field (Glover and Laguna, 1997). The balance is needed, on the one hand, to quickly identify regions of the search space with high-quality solutions and, on the other hand, to avoid wasting time in regions that have already been explored or that do not contain high-quality solutions.

Metaheuristic approaches are widely used for scheduling problems. However, in stochastic online scheduling, identifying the search space is difficult due to the nature of the problem. Since the jobs are presented to the scheduler one by one, the set of incomplete solutions constitutes a dynamic search space at any stage, i.e., at any time t.
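To illustrate the dynamic programming formulation and the exponential growth of its state space discussed earlier in this section, the following Python sketch solves a small single-machine instance exactly. It is a sketch under restrictive assumptions that are not part of the general model: one machine, no release dates, a job set known in advance, and independent processing times. Because the objective is linear in the completion times, the elapsed time can be factored out of the value function, so the state reduces to the set S of unfinished jobs and the recursion becomes V(S) = min over j in S of E[P_j]·W(S) + V(S \ {j}), where W(S) is the total weight of the jobs in S. This reduction and the name `optimal_expected_cost` are introduced here for illustration.

```python
from functools import lru_cache

def optimal_expected_cost(weights, means):
    """Minimum of E[sum w_j C_j] on one machine, all jobs available at time 0.

    weights[j] is w_j and means[j] is E[P_j]. The state is a bitmask of
    unfinished jobs, so up to 2**n states are evaluated.
    """
    n = len(weights)

    @lru_cache(maxsize=None)
    def value(remaining):
        if remaining == 0:
            return 0.0
        # While job j runs, every unfinished job (including j) is delayed
        # by E[P_j] in expectation, which costs E[P_j] times W(remaining).
        total_w = sum(weights[j] for j in range(n) if remaining >> j & 1)
        return min(
            means[j] * total_w + value(remaining & ~(1 << j))
            for j in range(n)
            if remaining >> j & 1
        )

    best = value((1 << n) - 1)
    return best, value.cache_info().currsize

cost, states = optimal_expected_cost([1.0, 3.0, 2.0], [4.0, 2.0, 2.0])
print(cost, states)  # 22.0 8
```

For three jobs the recursion visits 2^3 = 8 states, and the number doubles with every additional job, which is the exponential growth noted above. On this instance the optimal value coincides with the expected cost of sequencing the jobs by non-increasing w_j / E[P_j].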

5. APPLICATION AREAS

Many examples of stochastic online scheduling can be identified in both manufacturing and service industries. A common issue in the manufacturing industry is uncertainty in the processing times of jobs: in practice, the exact completion time of one or more jobs may not be known with certainty.

As a service example, consider a computer software support center where customers call and make requests for support or repair services. The assignment of technicians to customers must be made online. Service durations in such a system are not known with certainty, although expected service times may be estimated based on previous experience. The service times may follow Exponential or Erlang distributions.

A car repair center is another example from the service industry. Unless there is a reservation system, neither the times at which customers arrive nor the service durations can be known with certainty in this kind of environment. In cases that include reservation systems, the problem becomes interval scheduling, or fixed job scheduling (Eliiyi, 2003).

In hospitals, stochastic online scheduling can save lives and improve patient care. The admission process of a hospital through the emergency service is a good example of this type of problem. The patients' arrival times are not known, and the treatment times are stochastic as well. Numerous other real-life examples exist, owing to the strong representational ability of the model.

6. CONCLUSION AND FUTURE RESEARCH DIRECTIONS

In this study, we consider a model that combines the characteristics of stochastic and online scheduling models, which have mostly been considered separately in the literature. We assume online arrivals of the jobs. The expected processing time of a job is disclosed when it arrives, while the actual processing time is not known until the job is completed. The generalized problem is important since it can be used to model real-life problems. Although the model provides a good representation of many practical systems, the development of exact solution procedures is expected to be cumbersome. A dynamic programming approach can be used to construct an optimal online scheduling policy, but the approach is known to be exponential in complexity. Hence, the development of heuristic approaches to the problem appears to be a promising topic for future research. The application of metaheuristics such as genetic algorithms or tabu search can also be considered as a future research opportunity.

REFERENCES

ANDERSON, E.J. and POTTS, C.N., (2004). On-line scheduling of a single machine to minimize total weighted completion time. Mathematics of Operations Research, 29:686–697.

BERTSEKAS, D. and CASTANON, D., (1999). Rollout algorithms for stochastic scheduling problems. Journal of Heuristics, 5:89–108.

BORODIN, A. and EL-YANIV, R., (1998). Online Computation and Competitive Analysis. Cambridge University Press.

CERNY, V., (1985). A thermodynamical approach to the travelling salesman problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications, 45:41–51.

CHOU, M.C., LIU, H., QUEYRANNE, M. and SIMCHI-LEVI, D., (2005). On the asymptotic optimality of a simple online algorithm for the stochastic single machine weighted completion time problem and its extensions. Operations Research, to appear.

CORREA, J. and WAGNER, M., (2005). LP-based online scheduling: From single to parallel machines, Integer Programming and Combinatorial Optimization. In M. Jünger and V. Kaibel, eds., Lecture Notes in Computer Science, 3509:196–209, Springer.

ELIIYI, D.T., (2003). Operational Fixed Job Scheduling Problem. Ph.D. thesis, Department of Industrial Engineering, Middle East Technical University, Turkey.

FIAT, A. and WOEGINGER, G.J., (1999). On-line scheduling on a single machine: Minimizing the total completion time. Acta Informatica, 36:287–293.

GLOVER, F. and LAGUNA, M., (1997). Tabu Search. Kluwer Academic Publishers.

GOEMANS, M.X., QUEYRANNE, M., SCHULZ, A.S., SKUTELLA, M. and WANG, Y., (2002). Single machine scheduling with release dates. SIAM Journal on Discrete Mathematics, 15:165–192.

HALL, L.A., SCHULZ, A.S., SHMOYS, D.B. and WEIN, J., (1997). Scheduling to minimize average completion time: off-line and on-line approximation algorithms. Mathematics of Operations Research, 22:513–544.

HOOGEVEEN, H. and VESTJENS, A.P.A., (1996). Optimal on-line algorithms for single-machine scheduling, Integer Programming and Combinatorial Optimization. In W.H. Cunningham, S.T. McCormick, and M. Queyranne, eds., Lecture Notes in Computer Science, 1084:404–414, Springer.

KARLIN, A., MANASSE, M., RUDOLPH, L. and SLEATOR, D., (1988). Competitive snoopy paging. Algorithmica, 3:70–119.

KIRKPATRICK, S., GELATT, C.D. and VECCHI, M.P., (1983). Optimization by simulated annealing. Science, 220:671–680.

KOOLE, G., (2000). Stochastic scheduling with event-based dynamic programming. ZOR - Mathematical Methods of Operations Research, 51:249–261.

VAN LAARHOVEN, P.J.M., AARTS, E.H.L. and LENSTRA, J.K., (1992). Job shop scheduling by simulated annealing. Operations Research, 40:113–125.

LU, X., SITTERS, R.A. and STOUGIE, L., (2003). A class of on-line scheduling algorithms to minimize total completion time. Operations Research Letters, 31:232–236.

MEGOW, N., UETZ, M. and VREDEVELD, T., (2005). Stochastic online scheduling on parallel machines. In G. Persiano and R. Solis-Oba, eds., Lecture Notes in Computer Science, 3351:167–180.

MÖHRING, R.H., RADERMACHER, F.J. and WEISS, G., (1984). Stochastic scheduling problems I: General strategies. ZOR - Zeitschrift für Operations Research, 28:193–260.

MÖHRING, R.H., SCHULZ, A.S. and UETZ, M., (1999). Approximation in stochastic scheduling: the power of LP-based priority policies. Journal of the ACM, 46:924–942.

PHILLIPS, C.A., STEIN, C. and WEIN, J., (1998). Minimizing average completion time in the presence of release dates. Mathematical Programming, 82:199–223.

PINEDO, M., (2002). Scheduling: Theory, Algorithms, and Systems. Prentice-Hall.

PRUHS, K., SGALL, J. and TORNG, E., (2004). Online scheduling. In J. Leung, ed., Handbook of Scheduling: Algorithms, Models, and Performance Analysis, CRC Press.

ROTHKOPF, M.H., (1966). Scheduling with random service times. Management Science, 12:703–713.

SCHRAGE, L., (1968). A proof of the optimality of the shortest remaining processing time discipline. Operations Research, 16:687–690.

SCHULZ, A.S. and SKUTELLA, M., (2002a). The power of α-points in preemptive single machine scheduling. Journal of Scheduling, 5:121–133.

SCHULZ, A.S. and SKUTELLA, M., (2002b). Scheduling unrelated machines by randomized rounding. SIAM Journal on Discrete Mathematics, 15:450–469.

SKUTELLA, M. and UETZ, M., (2005). Stochastic machine scheduling with precedence constraints. SIAM Journal on Computing, 34:788–802.

SLEATOR, D. and TARJAN, R., (1985). Amortized efficiency of list update and paging rules. Communications of the ACM, 28:202–208.

VESTJENS, A.P.A., (1997). On-line machine scheduling. Ph.D. thesis, Eindhoven University of Technology, Netherlands.

VOß, S., MARTELLO, S., OSMAN, I.H. and ROUCAIROL, C., eds., (1999). Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization. Kluwer Academic Publishers.

WEISS, G., (1990). Approximation results in parallel machines stochastic scheduling. Annals of Operations Research, 26:195–242.

WEISS, G., (1992). Turnpike optimality of Smith's rule in parallel machines stochastic scheduling. Mathematics of Operations Research, 17:255–270.

Stochastic Online Scheduling Deniz TÜRSEL ELİİYİ* and Umay UZUNOĞLU KOÇER Department of Statistics, Dokuz Eylül University.

Abstract Scheduling is a crucial phase of operations control in both manufacturing and service industries. With increased emphasis on time-to-market as well as superior customer satisfaction, efficient scheduling has gained considerable interest by operations researchers. The main challenge in tackling real life problems is to develop the mathematical model that reflects the system as close as possible. Stochastic scheduling and online scheduling have been the two major approaches in literature to cope with uncertain scenarios. Stochastic scheduling models, in which job processing times are assumed to be stochastic, have been addressed mainly since the 1980’s. In online scheduling, which is encountered especially in service systems, it is assumed that the arrival times of jobs yet to come are not known in advance. However, once a job arrives, its processing time is disclosed with certainty. In this study, we focus on a model that generalizes stochastic scheduling and online scheduling. We assume that the jobs arrive online. Once a job arrives, its expected processing time is revealed, but the actual processing time remains unknown until the job is completed. The generalized problem is important since it may be used to model practical problems. We model the problem, provide related literature, and identify application areas. We also review some solution procedures that can be utilized for the optimal solution of this problem.

Key Words: Stochastic Scheduling, Online Scheduling.

1. INTRODUCTION Scheduling deals with the allocation of scarce resources such as machines in a workshop or crew in an organization. Scheduling determines to which machine a part will be routed for processing, which worker will operate a machine that produces a part, and the order in which the parts are to be processed. Scheduling also determines which patient to assign to an operating room, which doctors and nurses are to care for a patient during certain hours of the day, the order in which a doctor is too see patients, and when meals should be delivered or medications dispensed. In other words, scheduling is a decision making process with the goal of optimizing the objective which may relate to both manufacturing and *

Correspondence to: Deniz Türsel Eliiyi. Department of Statistics. Dokuz Eylül University. Turkey. [email protected]

service industries. There are many possible objectives including meeting customer due dates; minimizing job lateness, the completion time of the last task, idle time or work-in-process inventory; maximizing machine or labor utilization. Scheduling in manufacturing environment is difficult because jobs arrive at varying time intervals, require different resources and sequences of operations, and are due at different times. Scheduling in service industry is also difficult because of the variety of options available and special requirement of workers. Besides these difficulties, when we attempt to model the real world circumstances, two major themes make it challenging. One is the randomness phenomenon. Production setting in the real world is subject to many sources of randomness because of the machine breakdowns or unexpected releases of high-priority jobs. Processing times that are not known in advance is another source of randomness. Stochastic scheduling problems have been used to deal with this randomness. In stochastic scheduling, the concern is to allocate a limited amount of resources to a set of jobs that need to be serviced. Stochastic scheduling problems occur in a diversity of practical situations, such as manufacturing and construction. Many decisions in daily life are made on-line. Particularly in service business, there are many examples in which the assignment of workers to the jobs should be made online. For instance, in a machine repair center when a customer makes a request for repair, which technician is assigned to this customer is an online decision. Hence, the other difficulty in modeling the real world conditions is the concept of online decisions, which is handled through online scheduling models. The context of stochastic scheduling problems is incredibly large and diverse. Stochastic models involve the scheduling of a set of jobs on identical parallel machines while minimizing a given performance measure. 
Job processing times are assumed to be random variables that follow some probability distribution. Various solution methodologies could be used to solve stochastic scheduling problems. A somewhat naive approach is to take the expected processing times and use the algorithms for the deterministic problems. Unfortunately, it is easy to construct counterexamples showing that this approach can produce poor solutions. Stochastic scheduling problems in which both the number of jobs and the number of identical parallel machines are fixed can be expressed as a Markov decision process (MDP) and can therefore be tackled by dynamic programming methods.

The purpose of this paper is to generalize stochastic scheduling and online scheduling models. We assume that jobs arrive online. Once a job arrives, its expected processing time is revealed, but the actual processing time remains unknown until the job is completed. Since many practical situations are both stochastic and online, the generalized problem is important. In Section 2, we present a literature review concerning stochastic scheduling and online scheduling. The problem definition is introduced in Section 3. The solution techniques that may be utilized for this problem are discussed in Section 4. Application areas and concluding remarks constitute Sections 5 and 6, respectively.

2. LITERATURE REVIEW
In this section, we provide a brief review of the literature on the two frameworks developed in scheduling theory to deal with uncertainty about the future, i.e., stochastic scheduling and online scheduling.

2.1. Stochastic Scheduling
In stochastic scheduling, the population of jobs is assumed to be known beforehand, whereas the processing times of jobs are random variables. Stochastic scheduling models have been addressed mainly since the 1980's. Rothkopf (1966) shows that for one machine without precedence constraints, an index rule minimizes the expected sum of weighted completion times for arbitrary processing time probability distributions. Möhring, Radermacher and Weiss (1984, 1985) study the analytic properties of various classes of scheduling policies, and determine optimal policies for special cases. Weiss (1990) and Weiss (1992) derive additive performance bounds for the stochastic parallel machine model without release dates, $P||E[\sum w_j C_j]$. In addition, Möhring et al. (1999) develop approximation algorithms for a variety of stochastic scheduling problems using techniques from combinatorial optimization. The authors examine the power of linear programming based priority policies, and compare their performance to the expected performance of an optimal stochastic scheduling policy. Bertsekas and Castanon (1999) focus on a different class of stochastic scheduling problems, i.e., the quiz problem and its variations. They show how rollout algorithms can be implemented efficiently, and present experimental evidence that the performance of rollout policies is near-optimal. Koole (2000) applies event-based dynamic programming to stochastic scheduling problems. Very recently, Skutella and Uetz (2005) derive approximation policies for stochastic machine scheduling with precedence constraints. However, these papers do not deal with the situation where jobs arrive online.


2.2. Online Scheduling
In online scheduling, no information is available about the jobs that will arrive in the future. However, once a job becomes known, its weight $w_j$ and its actual processing time $p_j$ are revealed. In the literature, competitive ratios are generally used to assess the quality of online algorithms. A $\rho$-competitive algorithm provides, for any instance, a solution with an objective function value of at most $\rho$ times the value of an optimal offline solution. Examples are the studies by Sleator and Tarjan (1985) and Karlin et al. (1988).

There are two different types of online scheduling models. In the online-time model, the jobs become known upon their release dates $r_j$, which resembles the arrivals in a queuing system. In the online-list model, all jobs are present in the system at time zero; however, the current job on the list has to be scheduled immediately, before the information on the next job can be seen. Borodin and El-Yaniv (1998) and Pruhs et al. (2004) provide further details on the types of online scheduling problems.

In the online-time model, Anderson and Potts (2004) provide a 2-competitive online algorithm for the single-machine problem $1|r_j|\sum C_j$. Lu et al. (2003) also introduce a class of 2-competitive algorithms for the online variant of the same problem. This is the best possible result for a deterministic online algorithm, since Hoogeveen and Vestjens (1996) prove a lower bound of 2 on the competitive ratio of any deterministic online algorithm. In the online-list model, Fiat and Woeginger (1999) show that this problem does not allow a deterministic or randomized online algorithm with a competitive ratio of $\log n$, where $n$ is the number of jobs. Phillips et al. (1998) present another online algorithm for $1|r_j|\sum C_j$, which converts a preemptive schedule into a nonpreemptive one with an objective function value of at most twice that of the preemptive schedule. Since Schrage's shortest remaining processing time (SRPT) rule (1968) works online and produces an optimal preemptive schedule for the single-machine problem, it follows that the algorithm of Phillips, Stein and Wein has a competitive ratio of 2 as well.

For settings with parallel machines, Vestjens (1997) proves a lower bound of 1.309 for the competitive ratio of any deterministic online algorithm for $P|r_j|\sum w_j C_j$. The best deterministic algorithm developed for this problem is 2.62-competitive, proposed by Correa and Wagner (2005). The currently best-known randomized algorithm for the problem has an expected competitive ratio of 2 (Schulz and Skutella, 2002b). Schulz and Skutella (2002a) and Goemans et al. (2002) give comprehensive reviews of the development of online algorithms for the preemptive and nonpreemptive single-machine problems, respectively; Hall et al. (1997) do the same for the parallel machine case.
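The comparison behind a competitive ratio can be illustrated on a tiny instance of the single-machine total completion time problem with release dates. The following Python sketch is only an illustration under stated assumptions: the non-delay shortest-job rule `online_spt`, the brute-force `offline_optimum`, and the two-job instance are hypothetical choices made here and are not taken from the cited papers. The ratio it reports is for this one instance, whereas a competitive ratio is a worst-case bound over all instances.

```python
from itertools import permutations

# jobs[j] = (release date r_j, processing time p_j); hypothetical data.

def schedule_cost(order, jobs):
    """Total completion time when jobs run in the given order on one machine,
    each starting as early as its release date and the machine allow."""
    t = 0.0
    total = 0.0
    for j in order:
        r, p = jobs[j]
        t = max(t, r) + p
        total += t
    return total

def offline_optimum(jobs):
    """Brute-force offline optimum over all orders (tiny instances only)."""
    return min(schedule_cost(order, jobs)
               for order in permutations(range(len(jobs))))

def online_spt(jobs):
    """Non-delay online rule: whenever the machine is free, start the shortest
    job among those already released; wait only if nothing has been released."""
    pending = set(range(len(jobs)))
    t = 0.0
    total = 0.0
    while pending:
        available = [j for j in pending if jobs[j][0] <= t]
        if not available:
            t = min(jobs[j][0] for j in pending)  # idle until next release
            continue
        j = min(available, key=lambda k: jobs[k][1])
        t += jobs[j][1]
        total += t
        pending.remove(j)
    return total

jobs = [(0.0, 10.0), (1.0, 1.0)]  # a long job at time 0, a short one at time 1
online = online_spt(jobs)         # 21.0: the long job blocks the short one
offline = offline_optimum(jobs)   # 14.0: waiting one time unit pays off
print(online / offline)           # 1.5 on this instance
```

The online rule cannot know at time 0 that a short job is about to arrive, which is exactly the information gap that competitive analysis measures.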

2.3. Stochastic Online Scheduling
Stochastic online scheduling has only recently received attention from researchers. A model that combines the features of stochastic and online scheduling has been considered by Chou et al. (2005). They prove asymptotic optimality of the online WSEPT (weighted shortest expected processing time) rule for the single-machine problem $1|r_j|\sum w_j C_j$, assuming that the weights and the processing times can be bounded from above and below by constants. Megow et al. (2005) consider the models $1||E[\sum w_j C_j]$ and $1|r_j|E[\sum w_j C_j]$, where the objective is the expected value of the total weighted completion time of the jobs. They propose a simple online scheduling policy for the first model, and prove a performance guarantee that matches the currently best-known performance guarantee for stochastic parallel machine scheduling. They derive an analogous result for the second model.
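As a minimal sketch of the WSEPT rule mentioned above, the Python code below orders jobs by non-increasing ratio of weight to expected processing time and evaluates the expected total weighted completion time on a single machine. It assumes no release dates and no idle time, so that by linearity of expectation only the means matter; the function names and the three-job instance are hypothetical.

```python
# jobs[j] = (weight w_j, expected processing time E[P_j]); hypothetical data.

def wsept_order(jobs):
    """Job indices sorted by non-increasing ratio w_j / E[P_j]."""
    return sorted(range(len(jobs)),
                  key=lambda j: jobs[j][0] / jobs[j][1],
                  reverse=True)

def expected_weighted_completion(order, jobs):
    """E[sum w_j C_j] for a fixed order on one machine, with all jobs
    available at time 0. Only the means E[P_j] enter the expectation."""
    elapsed = 0.0
    total = 0.0
    for j in order:
        w, mean_p = jobs[j]
        elapsed += mean_p          # expected completion time of job j
        total += w * elapsed
    return total

jobs = [(1.0, 4.0), (3.0, 2.0), (2.0, 1.0)]
order = wsept_order(jobs)                          # [2, 1, 0]
print(expected_weighted_completion(order, jobs))   # 18.0
```

In the online setting with release dates, the same ratio can only be applied to the jobs that have already arrived, which is why the cited results concern modified or asymptotic versions of the rule.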

3. PROBLEM DEFINITION
A set $J = \{1, \ldots, n\}$ of jobs with nonnegative weights $w_j$, $j \in J$, is given. Note, however, that the number of jobs $n$ is not known in advance. The release date $r_j$ of job $j$ denotes the earliest point in time at which job $j$ can be started. There are $m$ identical parallel machines, and each job must be processed on one of the machines in a nonpreemptive manner. That is, no job is allowed to switch machines during its operation. Each machine can process only one job at a time. The objective is to find a schedule that minimizes the total weighted completion time $\sum_{j \in J} w_j C_j$, where $C_j$ denotes the completion time of job $j$. The random variable $P_j$ represents the processing time of job $j$. Also, let $E[P_j]$ be the expected processing time of job $j$, and $p_j$ a particular realization of $P_j$. The processing times $P_j$ are assumed to be independent. Hence, the main characteristic of a stochastic scheduling model is the fact that the processing times of jobs are subject to fluctuations, and they become known only upon completion of the jobs. The notion of a scheduling policy is therefore required in stochastic scheduling instead of simple schedules. Möhring et al. (1984) define such policies. They compare the expected outcome of a certain stochastic scheduling policy against the expected outcome of an optimal scheduling policy.

The definition of a stochastic online scheduling policy should extend the traditional definition of stochastic scheduling policies to the setting where jobs arrive online. We assume that the jobs are indexed in chronological order of their release dates, i.e., $r_j \le r_k$ for $j < k$, $j, k \in J$. When a job $j$ arrives at time $r_j$, the information about its weight and its expected processing time is revealed. There is no knowledge about jobs that might appear in the future, or about their number. The assignments made cannot be revised later. Once all jobs have been assigned to the machines, the schedule on each machine can be modified independently in a nonpreemptive manner. In this model of stochastic online scheduling, the goal is to find a policy that minimizes the expected value of the total weighted completion time of the jobs, $E[\sum_{j \in J} w_j C_j]$ (Megow et al., 2005).

A scheduling policy should specify an action at any decision time $t$. An action is defined as a set of jobs that are started at time $t$ (each scheduled to start its processing on one of the parallel machines at that time), together with a next decision time $t_1 > t$ at which the next action is taken. However, if a new job is released, or if the processing of a job is completed, at a time $t_2$ earlier than $t_1$, then the next decision time becomes $t_2$. The complete information contained in the partial schedule up to time $t$ can be utilized to form a stochastic online scheduling policy, since that information has already been revealed to the decision maker. The information about the unscheduled jobs that have arrived by time $t$ (jobs with $r_j \le t$) is also available. However, the information not yet revealed at time $t$ cannot be used in forming a policy. This includes the knowledge about the jobs that will be released in the future, and the actual processing times of jobs that are scheduled (or unscheduled) but not yet completed.

An optimal scheduling policy is defined as a policy that satisfies the above characteristics and minimizes the objective function, which is defined as the expected value of the total weighted completion time. An optimal policy need not be online; it may use future information about the jobs that will be released. However, it must be non-anticipatory in the sense that it cannot use the information on the actual processing times of jobs that are scheduled (or unscheduled) but not yet completed. Hence, even an optimal scheduling policy may fail to yield an optimal solution for all realizations of the processing times.

For any problem instance $I$, let $S_j^{\pi}(I)$ and $C_j^{\pi}(I)$ denote the random variables for the starting and completion times of job $j$ under policy $\pi$; $S_j^{\pi}$ and $C_j^{\pi}$ can be used for short. Then, the expected performance of a scheduling policy $\pi$ on instance $I$ becomes:

$$E[\pi(I)] = \sum_{j \in J} w_j \, E[C_j^{\pi}(I)]$$
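The expected performance of a policy, as defined above, can be estimated by simulation when it is not available in closed form. The following Python sketch is a rough illustration under several assumptions that are not part of the model in this section: all jobs are available at time 0, the policy is a fixed priority list (the next job on the list starts on whichever machine becomes free first), and each processing time distribution is supplied as a hypothetical sampler function.

```python
import heapq
import random

def simulate_list_policy(order, weights, samplers, m, rng):
    """One realization of sum w_j C_j under list scheduling on m machines.
    The order is fixed in advance, so the policy never uses realized
    processing times of jobs that are not yet completed."""
    free_at = [0.0] * m            # times at which each machine becomes free
    heapq.heapify(free_at)
    total = 0.0
    for j in order:
        start = heapq.heappop(free_at)          # earliest free machine
        completion = start + samplers[j](rng)   # realization p_j of P_j
        heapq.heappush(free_at, completion)
        total += weights[j] * completion
    return total

def estimate_expected_performance(order, weights, samplers, m,
                                  runs=10_000, seed=0):
    """Monte Carlo estimate of E[sum w_j C_j] for the list policy."""
    rng = random.Random(seed)
    return sum(simulate_list_policy(order, weights, samplers, m, rng)
               for _ in range(runs)) / runs

# Example: exponential processing times with (assumed) means 4, 2 and 1.
means = [4.0, 2.0, 1.0]
weights = [1.0, 3.0, 2.0]
samplers = [lambda rng, mu=mu: rng.expovariate(1.0 / mu) for mu in means]
print(estimate_expected_performance([2, 1, 0], weights, samplers, m=2))
```

With a single machine the estimate can be checked against the closed-form sum of weights times cumulative means; with several machines no such simple formula is available, which is where simulation becomes useful.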

This stochastic online scheduling model can be considered a two-phase model. The first phase consists of an online assignment of jobs to machines. The second phase consists of the actual scheduling of the jobs over time, with processing times realized according to their respective distributions.
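The two phases can be sketched in Python as follows. This is a simplified illustration, not the policy of any cited paper: the assignment rule (send each arriving job to the machine with the smallest expected load so far) is an assumption chosen for brevity, the sequencing rule is WSEPT on each machine, and release dates are ignored in the second phase so that the expected cost has a closed form.

```python
# jobs[j] = (weight w_j, expected processing time E[P_j]), listed in order
# of arrival; hypothetical data.

def assign_online(jobs, m):
    """Phase 1: jobs are seen one at a time and each is irrevocably assigned
    to the machine with the smallest expected load (illustrative rule)."""
    loads = [0.0] * m
    machines = [[] for _ in range(m)]
    for j, (_, mean_p) in enumerate(jobs):
        i = min(range(m), key=lambda k: loads[k])
        machines[i].append(j)
        loads[i] += mean_p
    return machines

def sequence_wsept(machines, jobs):
    """Phase 2: order the jobs on each machine by w_j / E[P_j], largest first."""
    return [sorted(assigned, key=lambda j: jobs[j][0] / jobs[j][1], reverse=True)
            for assigned in machines]

def expected_cost(sequences, jobs):
    """E[sum w_j C_j] when every machine runs its sequence without idle time."""
    total = 0.0
    for seq in sequences:
        elapsed = 0.0
        for j in seq:
            w, mean_p = jobs[j]
            elapsed += mean_p
            total += w * elapsed
    return total

jobs = [(1.0, 4.0), (3.0, 2.0), (2.0, 1.0), (1.0, 1.0)]
machines = assign_online(jobs, m=2)          # [[0], [1, 2, 3]]
sequences = sequence_wsept(machines, jobs)   # [[0], [2, 1, 3]]
print(expected_cost(sequences, jobs))        # 19.0
```

Separating the two functions mirrors the model: the assignment cannot be revised once made, while the sequence on each machine can still be rearranged afterwards.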

4. POSSIBLE SOLUTION APPROACHES
Almost all interesting scheduling problems are computationally intractable, so near-optimal or approximate solutions have to be sought. In a stochastic scheduling problem, one solution approach is to take the expectations of the processing times and to use the algorithms for deterministic problems. However, many examples show that this approach can perform poorly.

When the aim is to characterize a model in a way that clarifies its dynamic properties, dynamic programming provides suitable techniques. Dynamic programming is a general approach to problem solving, and the particular equations must be developed to fit each individual situation. Stochastic scheduling problems in which both the number of jobs and the number of identical parallel machines are fixed can be expressed as a Markov decision process (MDP) and can therefore be tackled by dynamic programming methods. In online scheduling, however, applying the Markov approach is not easy, as the number of jobs in the job set is not known in advance. The stochastic online scheduling problem can be formulated as a finite-horizon MDP with a finite state space provided that the total number of jobs to be scheduled is known. If we assume there is only one machine, the state of the system is characterized by the set of jobs still to be processed and the maximum completion time of the jobs completed so far. For each state there is a set of available actions. An action may be defined as the job to be scheduled next in that state.

However, the problem with dynamic programming or MDP formulations is that the state space grows exponentially as the number of jobs increases. Even in deterministic scheduling problems, the number of states may be too large, which makes the approach exponential in complexity. In stochastic scheduling problems, the uncertainty makes the state space even larger. In this case, a solution may be obtained by a linear programming approximation to dynamic programming. For example, consider a single-machine problem with an additive objective function. The problem can be restructured as a finite-horizon Markov decision problem, or as a stochastic shortest path problem, which can be approximately solved by the approximate linear programming approach.

Among the basic approximate methods, we usually distinguish between constructive methods and local search methods. Constructive algorithms generate solutions from scratch by adding components to an initially empty partial solution until a solution is complete. They are typically the fastest approximate methods, yet they often return solutions of inferior quality when compared to local search algorithms.
Local search algorithms start from some initial solution and iteratively try to replace the current solution by a better solution in an appropriately defined neighborhood of the current solution.

Simple heuristic procedures that are common in simpler scheduling settings can be adapted for the stochastic online scheduling problem. For example, Megow et al. (2005) consider the nonpreemptive parallel machine version of the problem with the objective of minimizing the expected total weighted completion time of the jobs. The authors analyze simple online scheduling policies for the model. In particular, they apply a modified version of the WSEPT rule, which is the best-known constructive scheduling policy in stochastic scheduling. They obtain performance guarantees that match those previously known for stochastic and online parallel machine scheduling.

Due to the complex nature of the problem, metaheuristic approaches could also be used to obtain fast approximate solutions for stochastic online scheduling. Metaheuristics are widely used to solve important practical combinatorial optimization problems. A metaheuristic is an iterative master process that guides and modifies the operations of subordinate heuristics to efficiently produce high-quality solutions (Voß et al., 1999). It may manipulate a complete (or incomplete) single solution or a collection of solutions at each iteration. The subordinate heuristics may be high (or low) level procedures, a simple local search, or just a construction method. Examples of metaheuristics include simulated annealing, tabu search, genetic algorithms, iterated local search, evolutionary algorithms, and ant colony optimization. In short, metaheuristics are high-level strategies for exploring search spaces by using different methods. It is of great importance that a dynamic balance is maintained between diversification and intensification. The term diversification generally refers to the exploration of the search space, whereas the term intensification refers to the exploitation of the accumulated search experience. These terms stem from the tabu search field (Glover and Laguna, 1997). This balance is important, on the one hand to quickly identify regions of the search space with high-quality solutions, and on the other hand not to waste too much time in regions that have either already been explored or do not provide high-quality solutions.

Metaheuristic approaches are widely used for scheduling problems. However, in stochastic online scheduling, identification of the search space is difficult due to the nature of the problem. Since the jobs are presented to the scheduler one by one, the set of incomplete solutions constitutes a dynamic search space at any stage, i.e., at any time $t$.
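Two of the approaches discussed in this section, exact dynamic programming and local search, can be contrasted on a small single-machine instance. The Python sketch below rests on simplifying assumptions that go beyond the text: all jobs are available at time 0, the job set is known, and the objective is the expected total weighted completion time, so that the state of the dynamic program can be reduced to the set of unprocessed jobs (the elapsed time is not needed because each step is charged to all jobs still waiting). The names `dp_optimum` and `adjacent_interchange` and the instance are hypothetical.

```python
from functools import lru_cache

# jobs[j] = (weight w_j, expected processing time E[P_j]); hypothetical data.

def expected_cost(order, jobs):
    """E[sum w_j C_j] for a fixed order on one machine, no release dates."""
    elapsed = 0.0
    total = 0.0
    for j in order:
        w, mean_p = jobs[j]
        elapsed += mean_p
        total += w * elapsed
    return total

def dp_optimum(jobs):
    """Exact DP over subsets; a state is the bitmask of unprocessed jobs.
    Returns the optimal expected cost and the number of states visited."""
    n = len(jobs)

    @lru_cache(maxsize=None)
    def value(remaining):
        if remaining == 0:
            return 0.0
        members = [j for j in range(n) if (remaining >> j) & 1]
        # Every job still in the system waits while the chosen job runs.
        waiting_weight = sum(jobs[j][0] for j in members)
        return min(jobs[j][1] * waiting_weight + value(remaining & ~(1 << j))
                   for j in members)

    best = value((1 << n) - 1)
    return best, value.cache_info().currsize

def adjacent_interchange(order, jobs):
    """Local search: swap neighbouring jobs as long as a swap lowers the cost."""
    order = list(order)
    best = expected_cost(order, jobs)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            order[i], order[i + 1] = order[i + 1], order[i]
            cost = expected_cost(order, jobs)
            if cost < best:
                best, improved = cost, True
            else:
                order[i], order[i + 1] = order[i + 1], order[i]  # undo swap
    return order, best

jobs = [(1.0, 4.0), (3.0, 2.0), (2.0, 1.0)]
print(dp_optimum(jobs))                       # (18.0, 8): 2**3 states
print(adjacent_interchange([0, 1, 2], jobs))  # ([2, 1, 0], 18.0)
```

The dynamic program visits one state per subset of jobs, so its table doubles with every additional job, while the local search only evaluates neighbouring sequences. On this simple objective the two agree; in the online setting neither applies directly, since the job set grows over time.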

5. APPLICATION AREAS
Many examples of stochastic online scheduling can be found in both manufacturing and service industries. A common issue in the manufacturing industry is the uncertainty related to the processing times of jobs. In practice, the exact completion times of one or more jobs may not be known with certainty. For instance, in a computer software support center where customers call and make requests for support or repair services, the decisions assigning technicians to customers should be made online. Service durations in such a system are not known with certainty, although expected service times may be estimated based on previous experience. The service times may follow Exponential or Erlang distributions. A car repair center can be given as another example from the service industry. Unless there is a reservation system, neither the times at which customers arrive nor the service durations can be known with certainty in this kind of environment. In cases that include reservation systems, the problem becomes interval scheduling, or fixed job scheduling (Eliiyi, 2003). In hospitals, stochastic online scheduling can save lives and improve patient care. The admission process of a hospital through the emergency service is a good example of this type of problem. The patients' arrival times are not known, and the treatment times are stochastic as well. Numerous real-life examples of this problem can be found because of the strong representational ability of the model.

6. CONCLUSION AND FUTURE RESEARCH DIRECTIONS
In this study, we consider a model that combines the characteristics of stochastic and online scheduling models. These models have mostly been considered separately in the literature. We assume online arrivals of the jobs. The expected processing time of a job is disclosed as it arrives. The actual processing time is not known until the job is completed. The generalized problem is important since it may be used to model real-life problems. Although the model provides a good representation of many practical systems, the development of exact solution procedures is expected to be cumbersome. A dynamic programming approach can be used in constructing an optimal online scheduling policy, but the approach is known to be exponential in complexity. Hence, the development of heuristic approaches to the problem appears to be a promising future research topic. The application of metaheuristics such as genetic algorithms or tabu search can also be considered as a future research opportunity.

REFERENCES
ANDERSON, E.J. and POTTS, C.N., (2004). On-line scheduling of a single machine to minimize total weighted completion time. Mathematics of Operations Research, 29:686–697.
BERTSEKAS, D. and CASTANON, D., (1999). Rollout algorithms for stochastic scheduling problems. Journal of Heuristics, 5:89–108.
BORODIN, A. and EL-YANIV, R., (1998). Online Computation and Competitive Analysis. Cambridge University Press.
CERNY, V., (1985). A thermodynamical approach to the travelling salesman problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications, 45:41–51.
CHOU, M.C., LIU, H., QUEYRANNE, M. and SIMCHI-LEVI, D., (2005). On the asymptotic optimality of a simple online algorithm for the stochastic single machine weighted completion time problem and its extensions. Operations Research, to appear.
CORREA, J. and WAGNER, M., (2005). LP-based online scheduling: From single to parallel machines. In M. Jünger and V. Kaibel, eds., Integer Programming and Combinatorial Optimization, Lecture Notes in Computer Science, 3509:196–209, Springer.
ELIIYI, D.T., (2003). Operational Fixed Job Scheduling Problem. Ph.D. thesis, Department of Industrial Engineering, Middle East Technical University, Turkey.
FIAT, A. and WOEGINGER, G.J., (1999). On-line scheduling on a single machine: Minimizing the total completion time. Acta Informatica, 36:287–293.
GLOVER, F. and LAGUNA, M., (1997). Tabu Search. Kluwer Academic Publishers.
GOEMANS, M.X., QUEYRANNE, M., SCHULZ, A.S., SKUTELLA, M. and WANG, Y., (2002). Single machine scheduling with release dates. SIAM Journal on Discrete Mathematics, 15:165–192.
HALL, L.A., SCHULZ, A.S., SHMOYS, D.B. and WEIN, J., (1997). Scheduling to minimize average completion time: off-line and on-line approximation algorithms. Mathematics of Operations Research, 22:513–544.
HOOGEVEEN, H. and VESTJENS, A.P.A., (1996). Optimal on-line algorithms for single-machine scheduling. In W.H. Cunningham, S.T. McCormick, and M. Queyranne, eds., Integer Programming and Combinatorial Optimization, Lecture Notes in Computer Science, 1084:404–414, Springer.
KARLIN, A., MANASSE, M., RUDOLPH, L. and SLEATOR, D., (1988). Competitive snoopy paging. Algorithmica, 3:70–119.
KIRKPATRICK, S., GELATT, C.D. and VECCHI, M.P., (1983). Optimization by simulated annealing. Science, 220:671–680.
KOOLE, G., (2000). Stochastic scheduling with event-based dynamic programming. ZOR - Mathematical Methods of Operations Research, 51:249–261.
VAN LAARHOVEN, P.J.M., AARTS, E.H.L. and LENSTRA, J.K., (1992). Job shop scheduling by simulated annealing. Operations Research, 40:113–125.
LU, X., SITTERS, R.A. and STOUGIE, L., (2003). A class of on-line scheduling algorithms to minimize total completion time. Operations Research Letters, 31:232–236.
MEGOW, N., UETZ, M. and VREDEVELD, T., (2005). Stochastic online scheduling on parallel machines. In G. Persiano and R. Solis-Oba, eds., Lecture Notes in Computer Science, 3351:167–180.
MÖHRING, R.H., RADERMACHER, F.J. and WEISS, G., (1984). Stochastic scheduling problems I: General strategies. ZOR - Zeitschrift für Operations Research, 28:193–260.
MÖHRING, R.H., SCHULZ, A.S. and UETZ, M., (1999). Approximation in stochastic scheduling: the power of LP-based priority policies. Journal of the ACM, 46:924–942.
PHILLIPS, C.A., STEIN, C. and WEIN, J., (1998). Minimizing average completion time in the presence of release dates. Mathematical Programming, 82:199–223.
PINEDO, M., (2002). Scheduling: Theory, Algorithms, and Systems. Prentice-Hall.
PRUHS, K., SGALL, J. and TORNG, E., (2004). Online scheduling. In J. Leung, ed., Handbook of Scheduling: Algorithms, Models, and Performance Analysis, CRC Press.
ROTHKOPF, M.H., (1966). Scheduling with random service times. Management Science, 12:703–713.
SCHRAGE, L., (1968). A proof of the optimality of the shortest remaining processing time discipline. Operations Research, 16:687–690.
SCHULZ, A.S. and SKUTELLA, M., (2002a). The power of α-points in preemptive single machine scheduling. Journal of Scheduling, 5:121–133.
SCHULZ, A.S. and SKUTELLA, M., (2002b). Scheduling unrelated machines by randomized rounding. SIAM Journal on Discrete Mathematics, 15:450–469.
SKUTELLA, M. and UETZ, M., (2005). Stochastic machine scheduling with precedence constraints. SIAM Journal on Computing, 34:788–802.
SLEATOR, D. and TARJAN, R., (1985). Amortized efficiency of list update and paging rules. Communications of the ACM, 28:202–208.
VESTJENS, A.P.A., (1997). On-line machine scheduling. Ph.D. thesis, Eindhoven University of Technology, Netherlands.
VOß, S., MARTELLO, S., OSMAN, I.H. and ROUCAIROL, C., eds., (1999). Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization. Kluwer Academic Publishers.
WEISS, G., (1990). Approximation results in parallel machines stochastic scheduling. Annals of Operations Research, 26:195–242.
WEISS, G., (1992). Turnpike optimality of Smith's rule in parallel machines stochastic scheduling. Mathematics of Operations Research, 17:255–270.