A Predictive Algorithm for Adaptive Resource Management of Periodic Tasks in Asynchronous Real-Time Distributed Systems∗

Binoy Ravindran and Tamir Hegazy
The Bradley Department of Electrical and Computer Engineering
Virginia Polytechnic Institute and State University
Blacksburg, VA 24061
Phone: 540-231-3777, Fax: 540-231-3362, E-mail: [email protected], [email protected]

Abstract

Real-time distributed applications, such as those that are emerging for managing the entire mission of a system, are characterized by significant execution-time uncertainties in the application environment and system resource state. Such systems therefore require adaptive resource management that dynamically monitors the system for adherence to the desired real-time requirements and performs run-time adaptation of the application to changing workloads when unacceptable timeliness behavior is observed. In this paper, we present a "predictive" resource management algorithm that forecasts the timeliness behavior of periodic tasks during the resource allocation process and selects allocations that yield the optimal forecasted timeliness. The algorithm uses techniques based on statistical regression theory for predicting task timeliness. The regression equations are based on external load parameters, such as the number of sensor reports, and internal resource load parameters, such as CPU utilization. The regression equations are determined from application profile data that is obtained by measuring the timeliness of the application for a set of external and internal load situations. The performance of the predictive algorithm is studied by comparing it with a non-predictive resource management algorithm that uses heuristic rules for allocating resources. The experimental results indicate that the predictive algorithm outperforms the non-predictive algorithm when the workload shows fluctuating behavior.

Keywords - real-time systems, asynchronous systems, adaptive resource management, regression theory, periodic tasks

1. Introduction

Real-time computer systems that are emerging for the purpose of strategic mission management, such as coordination of multiple entities that are manufacturing a vehicle, repairing a damaged reactor, or conducting combat, are subject to great uncertainties at the mission and system levels. The computations in the system are "asynchronous" in the sense that processing and communication latencies do not have known upper bounds and event arrivals have non-deterministic distributions. Such real-time mission management applications require decentralization because of the physical distribution of application resources and for achieving survivability in the sense of continued availability of application functionality that is situation-specific. Because of their physical dispersal, most real-time distributed computing systems are "loosely" coupled using communication paradigms that employ links, buses, rings, etc., resulting in additional uncertainties, e.g., variable communication latencies, regardless of the bandwidth.

∗ Supported by the U.S. Office of Naval Research under Grant Numbers N00014-99-1-0158 and N000014-00-10549.

Asynchronous, decentralized real-time applications are usually at a supervisory level, which means that their two primary functions are usually system-wide resource management and mission management. Resource management is the application-specific aspect of the distributed execution environment that augments the operating system to compose the constituent subsystems into a coherent unit that is cost-effective to program and deploy for the intended mission. Mission management is the core functionality of the application that utilizes the resource management abstractions to conduct a particular mission. The approaches of the mission, and even its objectives, are highly dependent on the current external environment of the application and the internal resource situation. Computational parameters of the system, such as task execution times, communication delays, periods, and event arrivals, are non-deterministic, as they depend upon the conditions of the external environment where the system is deployed and on the operational scenario of the system, which is situation-specific. The characteristics of the application, combined with the physical laws involved in distribution, thus contribute to the non-deterministic and asynchronous behavior of the system [Jen92].

Recent advances in real-time distributed systems research have produced resource management technologies that model operating requirements of asynchronous real-time distributed systems as desired quality-of-service (QoS) [Quo97, Quo98]. The QoS technologies allow applications to specify or negotiate service expectations along multiple dimensions such as real-time, survivability, and quality of results. The properties are mapped to a set of coordinated enforcement mechanisms across system software layers to meet end-to-end requirements. QoS is maintained through run-time monitoring, feedback, and dynamic adaptation. The resource management algorithms used in these works, for the most part, use heuristic strategies for computing resource allocations that will achieve the real-time requirements of the application.

In this paper, we advance adaptive resource management technology by presenting a "predictive" resource management algorithm that achieves the timeliness requirements of periodic tasks. The algorithm monitors the timeliness behavior of periodic tasks to detect situations when the application workload changes in a way that may affect task timeliness. When workload changes are detected, the algorithm performs resource allocation through adaptation mechanisms such as replication of task processes (or subtasks). Process replication is employed as a means to exploit concurrency and share workloads among replicas in order to satisfy higher application workloads. The algorithm determines the candidate subtasks of the periodic task that need to be replicated, the number of replicas, and the processor resources that are required for executing the replicas so that the task timeliness requirements are achieved. The number of replicas is determined using a predictive technique that forecasts task timeliness when subtasks of the task are replicated and the replicas are assumed to execute on some computing nodes in the system. The algorithm incrementally adds replicas to the task subtasks until the forecasted timeliness of the task is found to be acceptable. The performance of the algorithm is studied through a combination of benchmarking and simulation.
We use a real-time benchmark application that has resulted from our past work [SWR99] for obtaining application profile data.1 The performance of the algorithm is evaluated by comparing it with a non-predictive, adaptive resource management algorithm that uses heuristic rules for allocating resources. The non-predictive algorithm uses simple rules for determining

1 The benchmark application was developed after a detailed study of a military, air engagement surface combatant system, the Anti-Air Warfare (AAW) system of the U.S. Navy [WRSB98].

resource allocations that will achieve the real-time requirements, given a certain resource availability and application workload. The experimental results indicate that the predictive algorithm outperforms the non-predictive algorithm when the workload shows fluctuating behavior.

The rest of the paper is organized as follows: Section 2 reviews the related work on resource management of periodic tasks. Section 3 discusses the application and system model that is used in this paper. We present the predictive and non-predictive adaptive resource management algorithms in Section 4. Experimental evaluation of the algorithms is discussed in Section 5. Finally, we conclude the paper in Section 6.

2. Related Efforts

The work presented in this paper primarily focuses on real-time applications that have asynchronous behavior due to external loads. Therefore, to compare and contrast our work with related efforts, we focus on how the workload of the system is modeled in those efforts. In existing real-time computing models, execution times of application tasks are often used to characterize system workloads. Hence, we focus on how task execution times are characterized in the related efforts for comparison.

In most of the existing real-time scheduling and resource management models, task execution times are assumed to be a-priori known. Typically, execution time is assumed to be an integer "worst-case" execution time (WCET), as in [Bak91, LL73, RSZ89, SKG91, VWHL96, WSM95, XP90]. While [SKG91] establishes the utility of WCET-based approaches by listing some of its domains of successful application, others [AB98, AB98a, BN+98, JRR94, HS90, KM97, Leh96, LL+91, RSZ89, SK97, SG97, SL96, TD+95] cite the drawbacks, and in some cases the inapplicability, of the approaches in certain domains. In [AB98, HS90, Leh96, RSZ89, TD+95], it is mentioned that characterizing workloads of real-time systems using a-priori worst-case execution times can lead to poor resource utilization, particularly when the difference between WCET and normal execution time is large. It is stated in [AB98, SK97] that accurately measuring WCET is often difficult and sometimes impossible. In response to such difficulties, techniques for detection and handling of deadline violations have been developed [JRR94, SK97, SG97].

Paradigms that generalize the execution time model have also been developed. Execution time is modeled as a set of discrete values in [KM97], as an interval in [SL96], and as a probability distribution in [AB98a, KG97, Leh96, SG97, TD+95]. Most models consider execution time to apply to the job atomically; however, some paradigms [LL+91, SG97] view jobs as consisting of mandatory and optional portions; the mandatory portion has an a-priori known execution time in [LL+91], and the optional portion has an a-priori known execution time in [SG97]. Most of these approaches assume that the execution characteristics (set, interval, or distribution) are known a-priori. Others have taken a hybrid approach; for example, in [HS90] a-priori worst-case execution times are used to perform scheduling, and a hardware monitor is used to measure a-posteriori task execution times for achieving adaptive behavior. The approach most similar to the one presented in this paper is described in [BN+98, RSYJ97], where resource requirements are observed a-posteriori, allowing applications that have not been characterized a-priori to be accommodated. Also, for those applications with a-priori characterizations, the observations are used to refine the a-priori estimates. These characterizations are then used to drive resource-availability-based algorithmic and period variation within the applications.

Thus, to the best of our knowledge, no prior work performs adaptive real-time resource management based on predictions of task timeliness during the resource allocation process.

3. The Application and System Model

In this section, we present the application and the system model that is used in this paper.2 We introduce some simple notations to describe the application characteristics. The notations are later used in the presentation of the algorithms. The application properties and the notations are described as follows:

1. Let the set of periodic tasks in the application be denoted by $T = \{T_1, T_2, T_3, \ldots\}$.

2. Each periodic task $T_i$ is assumed to consist of a set of subtasks (executable programs), which execute in a serial fashion. We use the notation $T_i = [st^i_1, m^i_1, st^i_2, m^i_2, \ldots, st^i_n, m^i_n]$ to represent the periodic task $T_i$ that consists of $n$ subtasks and $n$ messages to be executed in series. That is, $st^i_k$ ($k > 1$) cannot execute before message $m^i_{k-1}$ arrives.

3. Let $cy(T_i)$ denote the current period of the periodic task $T_i$.

4. Let $ds(T_i, c)$ denote the number of data items processed by the task $T_i$ during period $c$.

5. Let the set of subtasks of a task $T_i$ be denoted as $ST(T_i) = \{st^i_1, st^i_2, st^i_3, \ldots, st^i_n\}$ and the set of inter-subtask messages of the task $T_i$ be denoted as $MS(T_i) = \{m^i_1, m^i_2, m^i_3, \ldots, m^i_n\}$.

6. An important assumption that is made regarding subtasks is that they can be replicated at run-time. The idea behind replication of subtasks is that once a subtask is replicated, the replicas of the subtask can share the data stream that was processed by the original subtask. Further, concurrency can be exploited by executing the replicas on different processors, thereby reducing the end-to-end latency of the task. Thus, replication is allowed as a means to reduce task latencies and achieve their timeliness requirements when the data stream size increases and causes task latencies to increase at run-time.

7. The states of the replicas of the subtasks and their consistency are not addressed in this work, as we assume that the tasks process data objects that are "continuous" in the sense that their values are obtained directly from a sensor in the application environment, or computed from values of other such objects. The replicas are thus assumed to be temporally consistent (e.g., sufficiently up-to-date) without applying every change in value, due to the continuity of physical phenomena.

8. The set of replicas of a subtask $st^i_j$ at a time $t$ is denoted by $rl(st^i_j, t) = \{st^i_{j,1}, st^i_{j,2}, st^i_{j,3}, \ldots\}$.

9. Let $eex(st^i_j, d, u)$ denote the estimated execution time of the subtask $st^i_j$ for processing a data size $d$ on a processor with utilization $u$.

10. Let $ecd(m^i_j, d, c)$ denote the estimated communication delay of the inter-subtask message $m^i_j$ that carries data of size $d$ during period $c$.


2 The work presented in this paper is part of a prototyping effort toward engineering the future surface combatants of the U.S. Navy (the emerging generation of AEGIS ships and the SC-21 Naval Surface Combatant systems [HiPerD]). Thus, we are strongly motivated by the characteristics of Navy combatant systems in our effort. The application model that is presented here is based on the AAW system [WRSB98].

11. Let $dl(T_i)$ denote the deadline of task $T_i$.

12. The application hardware is assumed to consist of a set of distributed processors that share a common communication medium such as an Ethernet segment (IEEE 802.3). The hardware is assumed to consist of a set of $m$ processors denoted by $PR = \{p_1, p_2, p_3, \ldots, p_m\}$. The processors are assumed to be homogeneous, and each processor is assumed to have a piece of private memory that can only be accessed by the processor itself. Further, the clocks of the processors are synchronized using an algorithm such as [Mills95].

13. The CPU utilization of a processor $p_i$ at a time $t$ is denoted by $ut(p_i, t)$.

14. The processor that is executing a subtask $st^i_j$ at time $t$ is denoted by $pr(st^i_j, t)$.

4. The Predictive and Non-Predictive Adaptive Resource Management Algorithms

The steps involved in the adaptive resource management process of the periodic tasks include (1) run-time monitoring to determine when subtasks of the periodic tasks should be replicated and (2) determining the number of subtask replicas that are required to satisfy the real-time requirements of the tasks and the processors to execute the replicas.
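Before describing the two steps, the following minimal sketch makes the task model notation of Section 3 concrete as Python data structures. The class and field names are illustrative assumptions only, not part of the model itself:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Subtask:
    """A subtask st_j^i of a periodic task; its replicas share the subtask's data stream."""
    name: str
    deadline_ms: float = 0.0                                      # dl(st_j^i), derived from the end-to-end deadline
    replica_processors: List[int] = field(default_factory=list)   # processors hosting replicas of this subtask

@dataclass
class Message:
    """An inter-subtask message m_j^i; subtask st_{j+1}^i cannot start before it arrives."""
    name: str
    deadline_ms: float = 0.0                                      # dl(m_j^i)

@dataclass
class PeriodicTask:
    """A periodic task T_i = [st_1, m_1, ..., st_n, m_n], executed in series."""
    name: str
    deadline_ms: float                                            # dl(T_i), the end-to-end deadline
    period_ms: float                                              # cy(T_i), the current period
    subtasks: List[Subtask] = field(default_factory=list)
    messages: List[Message] = field(default_factory=list)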

[Figure 1 depicts the adaptive resource management process: application performance data and resource utilization metrics on a global time scale feed (1) run-time monitoring and determination of candidate subtasks for replication; the candidate subtasks feed (2) determination of the number of replicas and their processors; the resulting allocation (candidate subtasks for replication, number of replicas, and their processors) is applied to the real-time distributed application.]

Figure 1. The Adaptive Resource Management Process

The overall resource management process is summarized in Figure 1. The proposed predictive and non-predictive resource management algorithms use the same technique for run-time monitoring of application workloads and for identifying candidate subtasks of the periodic tasks for replication (i.e., step 1). The algorithms differ in the way they determine the number of replicas that are needed and the processors for executing them (i.e., step 2). We describe the two steps involved in the adaptive resource management process in Sections 4.1 and 4.2, respectively. Since the algorithms differ only in step 2, we discuss the predictive and non-predictive algorithms in Section 4.2.


4.1 Run-Time Monitoring and Determining Candidate Subtasks for Replication

The objective of run-time monitoring is to determine when resources should be allocated so that the application can be adapted to workload changes. In this paper, we focus on replication of application subtasks, for exploiting concurrency and load sharing, as the adaptive resource allocation action. The adaptive resource management algorithms monitor the execution time of subtasks to determine the time at which resource allocation needs to be performed.

To identify the candidate subtasks for replication, the algorithms assign individual deadlines to the subtasks and messages of the tasks from the end-to-end task deadlines. To assign these deadlines, we use a variant of the equal flexibility (EQF) strategy proposed in [KG97]. Observe that EQF requires knowledge of the execution times and communication delays to compute subtask deadlines. Therefore, the adaptive resource management algorithms use estimates of the initial operating conditions of the system to derive the initial values of execution times and communication delays. Let $d_{init}$ denote the initial data size processed by a subtask $st^k_i$ and transmitted by a message $m^k_i$. Let $u_{init}$ denote the initial CPU utilization of the processor on which the subtask is assumed to execute, and let $c_{init}$ denote a previously profiled period of $T_k$ such that the task has processed $d_{init}$ data items during this period. The deadline of the subtask $st^k_i$ is therefore given by:

$$dl(st^k_i) = eex(st^k_i, d_{init}, u_{init}) + \left[ dl(T_k) - \sum_{j=i}^{m} eex(st^k_j, d_{init}, u_{init}) - \sum_{j=i+1}^{m} ecd(m^k_j, d_{init}, c_{init}) \right] \times \frac{eex(st^k_i, d_{init}, u_{init})}{\sum_{j=i}^{m} eex(st^k_j, d_{init}, u_{init}) + \sum_{j=i+1}^{m} ecd(m^k_j, d_{init}, c_{init})} \quad (1)$$

The deadline of the message $m^k_i$ is given by:

$$dl(m^k_i) = ecd(m^k_i, d_{init}, c_{init}) + \left[ dl(T_k) - \sum_{j=i}^{m} ecd(m^k_j, d_{init}, c_{init}) - \sum_{j=i+1}^{m} eex(st^k_j, d_{init}, u_{init}) \right] \times \frac{ecd(m^k_i, d_{init}, c_{init})}{\sum_{j=i}^{m} ecd(m^k_j, d_{init}, c_{init}) + \sum_{j=i+1}^{m} eex(st^k_j, d_{init}, u_{init})} \quad (2)$$

where $m$ denotes the number of subtasks (and messages) of $T_k$.

We describe the computation of the estimated execution time $eex(st^i_j, d, u)$ and the estimated communication delay $ecd(m^i_j, d, c)$ in Sections 4.2.1.1 and 4.2.1.2, respectively.
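To illustrate, the following is a minimal Python sketch of this EQF-style deadline decomposition, mirroring equations (1) and (2). It assumes the estimators eex(j, d, u) and ecd(j, d, c) for the j-th subtask and j-th message are supplied as callables (their construction is the subject of Sections 4.2.1.1 and 4.2.1.2); names and calling conventions are illustrative:

def assign_deadlines(task_deadline, n, eex, ecd, d_init, u_init, c_init):
    """Split an end-to-end deadline over n subtasks and n messages using the
    EQF-style rule of equations (1) and (2).  Returns two dicts keyed by the
    1-based subtask/message index."""
    subtask_dl, message_dl = {}, {}
    for i in range(1, n + 1):
        # Remaining work from subtask i onward, as summed in equation (1).
        rem = (sum(eex(j, d_init, u_init) for j in range(i, n + 1)) +
               sum(ecd(j, d_init, c_init) for j in range(i + 1, n + 1)))
        e_i = eex(i, d_init, u_init)
        subtask_dl[i] = e_i + (task_deadline - rem) * e_i / rem
        # Remaining work from message i onward, as summed in equation (2).
        rem_m = (sum(ecd(j, d_init, c_init) for j in range(i, n + 1)) +
                 sum(eex(j, d_init, u_init) for j in range(i + 1, n + 1)))
        c_i = ecd(i, d_init, c_init)
        message_dl[i] = c_i + (task_deadline - rem_m) * c_i / rem_m
    return subtask_dl, message_dl

The same routine is re-run to re-assign the subtask and message deadlines after every replication or de-allocation action.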


Once the individual subtask deadlines are determined, the algorithm monitors the execution time of each subtask at run-time. Each subtask is required to maintain a minimum slack value on its individual deadline. The algorithm considers subtasks whose slack values are lower than the desired value as candidates for replication. Subtasks that miss their individual deadlines are also identified as candidates for replication. In both cases, the algorithm increments the number of replicas of the candidate subtasks iteratively until the predicted overall latency reaches the desired value. The algorithm also "shuts down" undesired replicas of subtasks if the subtasks exhibit very high slack values. Such subtasks are considered as candidates for de-allocating the replicas that have been allocated to them. Each time a resource management action such as replication or de-allocation is taken, the subtask deadlines are re-assigned using the variant of EQF described here.

4.2 Determining Number of Replicas and Processors

In this section, we discuss the predictive and non-predictive algorithms that determine the number of replicas of candidate subtasks and their processors. The algorithms are discussed in the subsections that follow.

4.2.1 Predictive Algorithm

The predictive algorithm determines the number of replicas that are needed by forecasting subtask timeliness when subtasks of the task are replicated and the replicas are assumed to execute on some computing nodes in the system. The algorithm incrementally adds replicas to the subtasks until the forecasted timeliness is found to be acceptable. The algorithm uses statistical regression theory for predicting subtask timeliness. The resource usage characteristics, such as subtask execution times and message communication delays, are profiled at a set of internal resource utilizations (e.g., CPU utilization, network utilization) and external workloads (e.g., sensor reports). The application profile data is then used to determine regression equations that compute resource usage characteristics at arbitrary internal resource utilization levels and external workloads. The regression equations are used to forecast resource needs and determine resource allocations that satisfy the timeliness requirements. We discuss how the execution times of subtasks (i.e., $eex(st^i_j, d, u)$) and the communication delays of subtask messages (i.e., $ecd(m^i_j, d, c)$) are determined in Sections 4.2.1.1 and 4.2.1.2, respectively. Section 4.2.1.3 describes the process of determining the subtask replicas and the processors for executing them.

4.2.1.1 Estimating Subtask Execution Latency as a Function of Workload and Processor Utilization

The algorithm estimates the execution latency of a subtask as a function of the subtask workload and the utilization of the processor on which it is assumed to execute. We characterize the workload of a subtask as the number of data items that it needs to process, since the number of data items processed by the subtasks constitutes the most significant part of the application workload. The processor utilization is characterized by the utilization of the CPU, since we have

observed that application programs in systems such as AAW are mostly CPU intensive. Thus, utilization of other resources such as memory is assumed to be relatively insignificant. To determine the function that computes subtask execution latency as a function of the subtask workload and resource usage, we use application profile data and regression theory. The execution latencies of the application subtasks are profiled for a number of resource utilization conditions and workloads. We use the real-time benchmark application that has resulted from our past effort [SWR99] as the example application for profiling and measurement of execution latencies. The application profile data is then used to compute the regression equations. 
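As an illustration of this profiling-to-regression step, the following sketch fits a regression surface of the kind given by equation (3) below (latency quadratic in data size, with coefficients that are themselves quadratic in CPU utilization). It assumes the profile data is available as (data size, latency) samples per CPU-utilization level and uses numpy's least-squares routines in place of whatever statistical package is actually employed:

import numpy as np

def fit_latency_surface(profiles):
    """profiles: {cpu_utilization_percent: [(data_size, latency_ms), ...]}.
    Step 1: at each utilization level, fit latency ~ A*d^2 + B*d (no constant
    term, matching equation (3)).  Step 2: fit A(u) and B(u) as second-order
    polynomials in u, yielding (a1, a2, a3, b1, b2, b3).  At least three
    utilization levels are assumed."""
    utils, A_vals, B_vals = [], [], []
    for u, samples in sorted(profiles.items()):
        d = np.array([s[0] for s in samples], dtype=float)
        y = np.array([s[1] for s in samples], dtype=float)
        X = np.column_stack([d ** 2, d])
        (A, B), *_ = np.linalg.lstsq(X, y, rcond=None)
        utils.append(u); A_vals.append(A); B_vals.append(B)
    a1, a2, a3 = np.polyfit(np.array(utils, dtype=float), np.array(A_vals), 2)
    b1, b2, b3 = np.polyfit(np.array(utils, dtype=float), np.array(B_vals), 2)
    return a1, a2, a3, b1, b2, b3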

Figure 2. Execution Latencies of Filter Program at 80% CPU Utilization and Different Data Sizes

Figure 3. Execution Latencies of EvalDecide Program at 60% CPU Utilization and Different Data Sizes
Figure 4. Execution Latencies of Filter at Different CPU Utilizations and Data Sizes

Figures 2 and 3 show sample plots of execution latency measurements of two application subtasks of the benchmark, called Filter and EvalDecide. The latency measurements are made for an increasing number of data items processed by the subtasks at a certain level of CPU utilization (the blue lines labeled "y" in the figures). For the latency measurements made at a given CPU utilization, we first determine a second-order non-linear regression equation that computes execution latency as a function of data size (the red lines labeled "Y" in the figures). Finally, we combine the equations for latency measurements made at a set of different CPU utilizations into a single regression equation that computes execution latency as a function of data size and CPU utilization (the green lines labeled "Y-" in the figures). Figure 4 combines the execution latency measurements of Filter at different CPU utilizations and data sizes into a single 3-dimensional plot. The regression equation thus obtained for a subtask $st^i_j$ is given by:

$$eex(st^i_j, d, u) = f(d, u) = (a_1 u^2 + a_2 u + a_3) d^2 + (b_1 u^2 + b_2 u + b_3) d \quad (3)$$

where the execution latency is obtained in milliseconds, $d$ is the data size in hundreds of data items, $u$ is the CPU utilization in percentage, and the $a_i$'s and $b_i$'s are constants that are dependent upon the application subtask. Thus, the regression equation (3) determines the execution latency of an application subtask for a given number of data items that it needs to process and the CPU utilization of the processor on which it is assumed to execute.

4.2.1.2 Estimating Subtask Communication Delay as a Function of Periodic Workload

To predict the communication delay between a pair of communicating subtasks, we divide the problem into two parts. The first part determines how long data stays in host and network buffers before getting transmitted. The second part determines how long it takes a message to be transmitted from the processor node executing the subtask that generates the message to the processor node executing the subtask that receives it. Thus, the overall communication delay incurred for transmitting a message $m^i_j$ of size $d$ during cycle $c$ of $T_i$ is given by:

$$ecd(m^i_j, d, c) = D_{buf}(d, c) + D_{trans}(d) \quad (4)$$

where $D_{buf}(d, c)$ is the buffer delay and $D_{trans}(d)$ is the transmission delay incurred. By simulating the execution of the benchmark application on a distributed system under a number of different periodic workload situations, we noticed that $D_{buf}$ increases with the increase in the workload. Moreover, we found that a simple linear approximation of this delay is reasonable for the purpose of estimating communication delays:

$$D_{buf}(d, c) = k \times \sum_{\forall i} ds(T_i, c) \quad (5)$$

where $k$ is the slope of the regression line. The second part of the problem, determining the transmission delay, is straightforward. The transmission delay of a subtask message is independent of the other subtask messages and depends only on the size of the message that needs to be transmitted. If $ls$ denotes the link transmission speed, then the transmission delay for a message of size $d$ is given by:

$$D_{trans}(d) = \frac{d}{ls} \quad (6)$$
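Putting equations (3) through (6) together, the two estimators can be sketched as small functions. The coefficient tuple, the slope k, and the link speed are per-subtask and per-network quantities obtained from profiling; the argument names and units are illustrative and follow whatever scale the profiling used:

def eex(coeff, d, u):
    """Equation (3): estimated execution latency (ms) of a subtask for data size d
    on a processor at u% CPU utilization.  coeff = (a1, a2, a3, b1, b2, b3),
    obtained by regression on profile data."""
    a1, a2, a3, b1, b2, b3 = coeff
    return (a1 * u**2 + a2 * u + a3) * d**2 + (b1 * u**2 + b2 * u + b3) * d

def ecd(d, total_periodic_workload, k, link_speed):
    """Equations (4)-(6): estimated communication delay of a message of size d.
    The buffer delay (5) grows linearly, with slope k, in the total periodic
    workload sum_i ds(T_i, c); the transmission delay (6) is d / link_speed."""
    d_buf = k * total_periodic_workload
    d_trans = d / link_speed
    return d_buf + d_trans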


4.2.1.3 Determining Number of Subtask Replicas and Their Processors

Once the candidate subtasks for replication are identified, the algorithm determines the number of replicas that are needed for each subtask, and the processors for executing them, that will achieve the end-to-end timing requirements of the task. The algorithm determines the number of replicas in an incremental manner. For each of the candidate subtasks, the algorithm first considers adding one more replica and determines whether adding the replica can satisfy the subtask deadline. This is done by estimating the execution latency of the subtask replica and the communication delay incurred in transporting the message from the predecessor subtask (of the original subtask) to the replica.

For example, let $T_i = [st^i_1, m^i_1, \ldots, st^i_p, m^i_p, st^i_q, m^i_q, st^i_r, m^i_r, \ldots, st^i_n, m^i_n]$ be a periodic task that is considered for resource allocation. Let the subtask $st^i_q$ of $T_i$ be the candidate subtask that is considered by the algorithm for replication, and assume that the subtask already has $k$ replicas when the algorithm considers it for replication. The algorithm first determines whether adding one more replica $st^i_{q,k+1}$ will satisfy the subtask deadline $dl(st^i_q)$. To determine this, the algorithm estimates the execution latency of all the replicas $st^i_{q,1}, st^i_{q,2}, \ldots, st^i_{q,k}, st^i_{q,k+1}$. Each replica will process $\frac{1}{k+1}$ of the total data size that the subtask should process. The execution latency of each subtask replica, $eex(st^i_{q,r}, d, u)$, $\forall r: 1 \le r \le k+1$, is estimated on the least utilized processor on which the replica can potentially be executed. The algorithm uses the regression equation (3) to determine the subtask execution latency. Note that the observed CPU utilization of the processor and the data size that the replica will need to process are used in the regression equation.

The algorithm also estimates the communication delay incurred in transporting the messages $m^i_{q,1}, m^i_{q,2}, \ldots, m^i_{q,k}, m^i_{q,k+1}$ between the subtask replicas $st^i_{q,1}, st^i_{q,2}, \ldots, st^i_{q,k}, st^i_{q,k+1}$ and their predecessor subtask $st^i_p$.3 Observe that the messages between the subtask replicas and the predecessor subtask will now transport $\frac{1}{k+1}$ of the total data size instead of $\frac{1}{k}$. The algorithm uses the regression equations (4), (5), and (6) to determine the subtask message communication delay $ecd(m^i_{q,r}, d, c)$, $\forall r: 1 \le r \le k+1$. Note that the current periodic workload is used in the regression equation.

Once the subtask execution latency and the message communication latency are determined, the end-to-end subtask latency is determined as the sum of the two latencies and is compared with the subtask deadline. The process is repeated in an iterative fashion, increasing the number of replicas with each iterative step until the subtask deadline is satisfied. Observe that with each additional replica, the data size load of each of the existing replicas is reduced, and thus their forecasted execution times and communication delays are also reduced. Figure 5 shows the pseudo-code of the algorithm that determines the (additional) number of replicas needed for a candidate subtask. The procedure returns the value SUCCESS if it determines that the subtask deadline can be satisfied. Otherwise, it returns the value FAILURE.

3 The communication delay incurred for the message $m^i_{q,1}$ between $st^i_{q,1}$ and its successor subtask $st^i_r$ is incorporated in the deadline of $st^i_r$.
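For concreteness, the following is a minimal executable sketch of this incremental procedure. The processor bookkeeping is simplified, and eex(d, u) and ecd(d, w) are assumed to be the estimators of Sections 4.2.1.1 and 4.2.1.2 specialized to the candidate subtask and its predecessor message; the exact procedure is given as pseudo-code in Figure 5:

def replicate_subtask(candidate_procs, assigned_procs, total_data, period_workload,
                      subtask_deadline, eex, ecd, slack_fraction=0.2):
    """Add replicas of one candidate subtask, one processor at a time (least
    utilized first), until the forecasted replica latency satisfies the subtask
    deadline minus the desired slack.  Returns the updated replica-processor map
    on success, or None (FAILURE) if the processors are exhausted.
    candidate_procs: {processor_id: cpu_utilization} not yet hosting a replica.
    assigned_procs:  {processor_id: cpu_utilization} already hosting replicas."""
    available = dict(candidate_procs)
    assigned = dict(assigned_procs)
    slack = slack_fraction * subtask_deadline          # 20% slack by default, as in Figure 5
    while available:
        # Place one more replica on the least utilized remaining processor.
        p = min(available, key=available.get)
        assigned[p] = available.pop(p)
        # Each replica now processes an equal share of the subtask's data stream.
        d = total_data / len(assigned)
        # Forecast the worst per-replica latency (execution plus message delay).
        worst = max(eex(d, u) + ecd(d, period_workload) for u in assigned.values())
        if worst <= subtask_deadline - slack:
            return assigned                            # forecasted timeliness acceptable: SUCCESS
    return None                                        # no processors left: FAILURE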

ReplicateSubtask(st_j^i, t)  /* st_j^i denotes the jth subtask of periodic task T_i and t denotes the current time */
1. Set PT := PR − PS(st_j^i);  /* PR is the set of all processors */
2. If PT = ∅
   2.1. Return FAILURE;
   End if
3. Determine p_min ∈ PT : (¬∃ p_k ∈ PT : ut(p_k, t) < ut(p_min, t));  /* p_min is the least utilized processor in PT */
4. PT := PT − {p_min};
5. PS(st_j^i) := PS(st_j^i) ∪ {p_min};  /* PS(st_j^i) denotes the set of processors for executing the replicas of subtask st_j^i */
6. For every q ∈ PS(st_j^i)
   6.1. Set u := ut(q, t) and c := cy(T_i);  /* u: utilization of processor q; c: current period of T_i */
   6.2. Set d := ds(T_i, c) / |PS(st_j^i)|;  /* d is the data size that each replica will process */
   6.3. Determine eex(st_{j,k}^i, d, u) := (a1·u^2 + a2·u + a3)·d^2 + (b1·u^2 + b2·u + b3)·d;
   6.4. Determine ecd(m_{j,k}^i, d, c) := Dbuf(d, c) + Dtrans(d), where Dbuf(d, c) = k × Σ_i ds(T_i, c) and Dtrans(d) = d / ls;
   6.5. Set TotalDelay := eex(st_{j,k}^i, d, u) + ecd(m_{j,k}^i, d, c);
   6.6. If TotalDelay > dl(st_j^i) − sl  /* sl denotes the latency slack value that is desired to be maintained on the subtask deadline; in this paper, sl = 0.2 × dl(st_j^i), i.e., a 20% slack on the deadline */
        6.6.1. Goto step 2;  /* need another replica */
        End if
   End for
7. Return SUCCESS;

Figure 5. Predictive Algorithm for Determining Additional Number of Subtask Replicas

As described in Section 4.1, it may be desirable to shut down replicas of a subtask if it shows a very high slack value. In such cases, the algorithm shuts down the last added replica of the subtask. Figure 6 shows the pseudo-code of the algorithm that shuts down the undesired replicas of a subtask.


ShutDownAReplica(st_j^i)  /* st_j^i denotes the jth subtask of periodic task T_i */
1. If |PS(st_j^i)| = 1
   1.1. Return;
2. else
   2.1. Let p be the last added element of PS(st_j^i);
   2.2. PS(st_j^i) := PS(st_j^i) − {p};
   2.3. Return;
   End if

Figure 6. Algorithm that Shuts Down Subtask Replicas

4.2.2 Non-predictive Algorithm

The non-predictive algorithm determines the number of replicas based upon the available resources in the system. The algorithm identifies processors that are exhibiting utilization levels below a threshold value and replicates the candidate subtasks onto such processors. Figure 7 shows the pseudo-code of the algorithm that determines the number of replicas, and the processors, that are needed for a subtask.

ReplicateSubtask(st_j^i, t)  /* st_j^i denotes the jth subtask of periodic task T_i and t denotes the current time */
1. For every p ∈ PR − PS(st_j^i)
   1.1. Set u := ut(p, t);
   1.2. If u < UT  /* UT is the processor utilization threshold; above this threshold, the processor is considered highly utilized */
        1.2.1. PS(st_j^i) := PS(st_j^i) ∪ {p};
        End if
   End for

Figure 7. Non-predictive Algorithm for Determining Additional Number of Subtask Replicas

The non-predictive algorithm also uses the algorithm shown in Figure 6 to shut down the unnecessary replicas of the candidate subtasks.

5. Experimental Evaluation

We evaluate the performance of the algorithms using a combination of benchmarking and simulation. We summarize the baseline parameters of the experimental study in Section 5.1. Section 5.2 presents the performance results of the algorithms.


5.1 Baseline Parameters

Table 1 summarizes the baseline parameters of the experimental study. The baseline parameters are derived from the real-time benchmark that has resulted from our past work [SWR99]. The structure of the periodic tasks used here corresponds directly to that of the benchmark.

Table 1. Baseline Parameters

  Number of nodes                                            6
  CPU scheduler at each node                                 Round-Robin (time slice = 1 ms)
  Network                                                    Ethernet (transmission speed = 100 Mbps)
  Data item (track) size                                     80 bytes
  Data arrival period                                        1 sec
  Relative end-to-end deadline                               990 ms
  Number of periodic tasks                                   1
  Number of subtasks per task                                5
  Number of replicable subtasks per task                     2
  CPU utilization threshold (for non-predictive algorithm)   20%

Recall that the regression equations (3) and (5) use a set of coefficients to determine the subtask execution latency and buffer delay, respectively. The values of the coefficients for the two replicable subtasks of the periodic task are shown in Table 2 and Table 3. The values are obtained by measurement.

Table 2. Coefficients of the Execution Latency Regression Equation

  Subtask number   a1         a2          a3           b1          b2          b3
  3                -0.00155   1.535E-05   0.11816174   0.0298276   -0.000285   0.983699
  5                0.002123   -1.596E-05  0.022324     -0.023927   0.000108    1.443762

Table 3. Coefficients of the Buffer Delay Regression Equation

  Subtask number   k
  3                0.7
  5                0.7
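As a usage illustration, the Table 2 and Table 3 values for subtask 3 can be plugged directly into equations (3) and (5); the data size, utilization, and workload inputs below are arbitrary example values, not values taken from the experiments:

# Coefficients of equation (3) for subtask 3, from Table 2: (a1, a2, a3, b1, b2, b3).
coeff_subtask3 = (-0.00155, 1.535e-05, 0.11816174, 0.0298276, -0.000285, 0.983699)
k_subtask3 = 0.7   # buffer-delay slope for subtask 3, from Table 3

def eex_subtask3(d, u):
    """Equation (3) specialized to subtask 3: latency (ms) for data size d at u% CPU utilization."""
    a1, a2, a3, b1, b2, b3 = coeff_subtask3
    return (a1 * u**2 + a2 * u + a3) * d**2 + (b1 * u**2 + b2 * u + b3) * d

# Example: forecast the execution latency for d = 5 data-size units on a processor at
# 40% CPU utilization, and the buffer delay for a total periodic workload of 10 units.
latency_ms = eex_subtask3(5, 40)
buffer_delay_ms = k_subtask3 * 10
print(latency_ms, buffer_delay_ms)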

5.2 Simulation Results

We evaluated the performance of the proposed algorithms for a task set that consisted of a single periodic task. For this task set, a set of experiments was conducted with three different workload patterns (see Figure 8). The patterns include an increasing ramp pattern, a decreasing ramp pattern, and a triangular pattern. The patterns use a workload interval that is defined by a minimum and maximum workload. The increasing ramp pattern starts with the minimum workload and gradually increases the workload until it reaches the maximum workload. The decreasing ramp pattern starts with the maximum workload and then gradually decreases the workload until it reaches the minimum workload. The triangular pattern is a combination of the increasing and the decreasing ramp patterns; it alternates between workload increases and decreases.

Figure 8. Workload Patterns for Evaluating Performance of Predictive and Non-predictive Algorithms

Figure 9. Performance of the Algorithms for the Triangular Workload Pattern: (a) Missed Deadline Ratio, (b) Average CPU Utilization, (c) Average Network Utilization, (d) Average Number of Subtask Replicas

Figures 9(a), 9(b), 9(c), and 9(d) show the missed deadline ratio, average CPU utilization, average network utilization, and the average number of subtask replicas used by the two algorithms for the triangular pattern. Each data point in the figures is obtained from a single experiment and corresponds to the maximum workload used in the experiment for the workload pattern. The

figure indicates that the non-predictive algorithm has a smaller missed deadline percentage and lower CPU utilization than the predictive algorithm. However, the non-predictive algorithm uses a much larger number of subtask replicas, and this causes larger utilization of the network.

To evaluate the combined performance of the algorithms over all the metrics, we define a "combined performance metric" that simply aggregates the individual metrics as:

$$C = MD + U_{CPU} + U_{Net} + \frac{\bar{R}}{Max(R)}$$

where $C$ denotes the combined performance metric, $MD$ is the missed deadline percentage, $U_{CPU}$ denotes the average utilization of the processors, and $U_{Net}$ denotes the average utilization of the network. The term $\bar{R}/Max(R)$ denotes the ratio of the average number of replicas used ($\bar{R}$) to the maximum possible number of replicas that will exploit the maximum concurrency ($Max(R)$). The total number of processors in the system limits the maximum possible number of replicas that can exploit the maximum concurrency. Thus, the ratio defines the percentage replica use of an algorithm.
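In code form, the combined metric is a direct transcription of this definition (smaller values indicate better combined performance):

def combined_metric(missed_deadline_pct, avg_cpu_util, avg_net_util,
                    avg_replicas, max_replicas):
    """C = MD + U_CPU + U_Net + R/Max(R); max_replicas is bounded by the number of processors."""
    return (missed_deadline_pct + avg_cpu_util + avg_net_util
            + avg_replicas / max_replicas)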


Figure 10. Combined Performance of the Algorithms for the Triangular Pattern

Figure 10 shows the combined performance of the algorithms for the triangular pattern. The figure shows that for smaller workloads, where no replication is needed, the performance of both algorithms is the same. However, for larger workloads, the predictive algorithm shows a better combined performance than the non-predictive algorithm (for the combined performance metric, the smaller the value, the better the performance of the algorithm). The performance of the algorithms for the same set of metrics for the increasing ramp and decreasing ramp workload patterns is shown in Figures 11(a) through 11(d) and Figures 12(a) through 12(d), respectively. Figures 13(a) and 13(b) show the combined performance of the algorithms for the increasing ramp and decreasing ramp workload patterns, respectively. These figures are given in the Appendix. From the figures, we observe that the predictive algorithm performs better than the non-predictive algorithm for the workload range 0-28. However, when the workload increases

further, the non-predictive algorithm outperforms the predictive algorithm by a slight margin. To study this phenomenon, we continued the experiment for larger workload ranges for both the increasing and decreasing ramp patterns. The results of this study are not shown here. We observed that as the workload increases further, the performance of the two algorithms fluctuates, i.e., after a threshold value of 28, the non-predictive algorithm performs better than the predictive algorithm for some intervals and the predictive algorithm performs better than the non-predictive algorithm for other intervals. Thus, we conclude that the predictive algorithm performs better than the non-predictive algorithm when the workload pattern shows fluctuating behavior, as the triangular pattern studied here does. Further, for monotonically increasing or decreasing workload patterns, the predictive algorithm performs better than the non-predictive algorithm until the workload reaches a threshold value. Beyond the threshold, the performance of the two algorithms fluctuates.

6. Conclusions

In this paper, we present a predictive adaptive resource management algorithm that forecasts the timeliness behavior of periodic tasks during the resource allocation process and selects allocations that yield the optimal forecasted timeliness. The algorithm uses statistical regression theory-based techniques for predicting task timeliness. The regression equations are based on external load parameters, such as the number of sensor reports, and internal resource load parameters, such as CPU utilization, and are determined from application profile data that is obtained by measuring the timeliness of the application for a set of external and internal load situations. We study the performance of the algorithm through a combination of benchmarking and simulation. The performance of the predictive algorithm is evaluated by comparing it with a non-predictive resource management algorithm that uses heuristic rules for allocating resources. The experimental results indicate that the predictive algorithm outperforms the non-predictive algorithm when the workload shows fluctuating behavior. Furthermore, when the workload monotonically increases or decreases, the predictive algorithm performs better than the non-predictive algorithm until the workload reaches a threshold value. Beyond the threshold, the performance of the two algorithms fluctuates.

References

[AB98] L. Abeni and G. Buttazzo, "Integrating Multimedia Applications in Hard Real Time Systems," Proceedings of The IEEE Real-Time Systems Symposium, pages 3-13, December 1998.

[AB98a] A. Atlas and A. Bestavros, "Statistical Rate Monotonic Scheduling," Proceedings of The IEEE Real-Time Systems Symposium, pages 123-132, December 1998.

[Bak91] T. P. Baker, "Stack-based scheduling of real-time processes," Journal of Real-Time Systems, Volume 3, Number 1, pages 67-99, March 1991.

[BN+98] S. Brandt, G. Nutt, et al., "A Dynamic Quality of Service Middleware Agent for Mediating Application Resource Usage," Proceedings of The IEEE Real-Time Systems Symposium, pages 307-317, December 1998.

[HS90] D. Haban and K. G. Shin, "Applications of Real-Time Monitoring for Scheduling Tasks with Random Execution Times," IEEE Transactions on Software Engineering, Volume 16, Number 12, pages 1374-1389, December 1990.

[Jen92] E. D. Jensen, "Asynchronous Decentralized Real-Time Computer Systems," Real-Time Computing, W. A. Halang and A. D. Stoyenko (Editors), Proceedings of the NATO Advanced Study Institute, St. Martin, Springer-Verlag, October 1992.

[JRR94] F. Jahanian, R. Rajkumar, and S. Raju, "Run-time Monitoring of Timing Constraints in Distributed Real-time Systems," Journal of Real-Time Systems, 1994.

[KG97] B. Kao and H. Garcia-Molina, "Deadline Assignment in a Distributed Soft Real-time System," IEEE Transactions on Parallel and Distributed Systems, Volume 8, Issue 12, pages 1268-1274, December 1997.

[KM97] T.-W. Kuo and A. K. Mok, "Incremental Reconfiguration and Load Adjustment in Adaptive Real-Time Systems," IEEE Transactions on Computers, Volume 46, Number 12, pages 1313-1324, December 1997.

[Leh96] J. P. Lehoczky, "Real-time Queueing Theory," Proceedings of The IEEE Real-Time Systems Symposium, pages 186-195, December 1996.

[LL73] C. L. Liu and J. W. Layland, "Scheduling algorithms for multiprogramming in a hard real-time environment," Journal of the ACM, Volume 20, Number 1, pages 46-61, 1973.

[LL+91] J. W. S. Liu, K. J. Lin, et al., "Algorithms for Scheduling Imprecise Computations," IEEE Computer, Volume 24, Number 5, pages 129-139, May 1991.

[Mills95] D. L. Mills, "Improved Algorithms for Synchronizing Computer Network Clocks," IEEE/ACM Transactions on Networks, pages 245-254, June 1995.

[Quo97] DARPA ITO, "Quorum," Available at http://www.ito.darpa.mil/research/quorum/projlist.html, August 1997.

[Quo98] DARPA ITO, "Quorum/High Confidence Computing PI Meeting," Available at http://www.dyncorp-is.com/darpa/meetings/quo98jul/index.html.

[RSYJ97] D. Rosu, K. Schwan, S. Yalamanchili, and R. Jha, "On Adaptive Resource Allocation for Complex Real-time Applications," Proceedings of The 18th IEEE Real-Time Systems Symposium, pages 320-329, December 1997.

[RSZ89] K. Ramamritham, J. A. Stankovic, and W. Zhao, "Distributed scheduling of tasks with deadlines and resource requirements," IEEE Transactions on Computers, Volume 38, Number 8, pages 1110-1123, August 1989.

[SG97] H. Streich and M. Gergeleit, "On the Design of a Dynamic Distributed Real-Time Environment," Proceedings of The Workshop on Parallel and Distributed Real-Time Systems, pages 251-256, 1997.

[SK97] D. B. Stewart and P. K. Khosla, "Mechanisms for detecting and handling timing errors," Communications of the ACM, Volume 40, Number 1, pages 87-93, January 1997.

[SKG91] L. Sha, M. H. Klein, and J. B. Goodenough, "Rate Monotonic Analysis for Real-Time Systems," In A. M. van Tilborg and G. M. Koob, Editors, Scheduling and Resource Management, pages 129-156, 1991.

[SL96] J. Sun and J. W. S. Liu, "Bounding Completion Times of Jobs with Arbitrary Release Times and Variable Execution Times," Proceedings of The IEEE Real-Time Systems Symposium, 1996.

[SWR99] B. Shirazi, L. R. Welch, and B. Ravindran, "DynBench: A Benchmark Suite for Dynamic Real-Time Systems," Journal of Parallel and Distributed Computing Practices, 1999, accepted for publication, to appear.

[TD+95] T. S. Tia, Z. Deng, et al., "Probabilistic Performance Guarantee for Real-Time Tasks with Varying Computation Times," Proceedings of The IEEE Real-Time Technology and Applications Symposium, pages 164-173, 1995.

[VWHL96] J. Verhoosel, L. R. Welch, D. Hammer, and E. J. Luit, "Incorporating temporal considerations during assignment and pre-run-time scheduling of objects and processes," Journal of Parallel and Distributed Computing, Volume 36, Number 1, pages 13-31, July 1996.

[WSM95] L. R. Welch, A. D. Stoyenko, and T. J. Marlowe, "Modeling resource contention among distributed periodic processes specified in cart-spec," Control Engineering Practice, Volume 3, Number 5, pages 651-664, May 1995.

[XP90] J. Xu and D. L. Parnas, "Scheduling processes with release times, deadlines, precedence, and exclusion relations," IEEE Transactions on Software Engineering, Volume 16, Number 3, pages 360-369, March 1990.

        

Appendix

Figure 11. Performance of the Algorithms for the Increasing Ramp Workload Pattern: (a) Missed Deadline Ratio, (b) Average CPU Utilization, (c) Average Network Utilization, (d) Average Number of Subtask Replicas

         

Figure 12. Performance of the Algorithms for the Decreasing Ramp Workload Pattern: (a) Missed Deadline Ratio, (b) Average CPU Utilization, (c) Average Network Utilization, (d) Average Number of Subtask Replicas


Figure 13. Combined Performance of the Algorithms for (a) Increasing Ramp Pattern and (b) Decreasing Ramp Pattern
