Optimization of Supply Voltage Assignment for Power Reduction on Processor-Based Systems

Tohru ISHIHARA and Hiroto YASUURA
Department of Computer Science and Communication Engineering
Graduate School of Information Science and Electrical Engineering
Kyushu University
6-1 Kasuga-koen, Kasuga-shi, Fukuoka 816 Japan
E-mail: fishihara,[email protected]

Abstract

In this paper we propose a system-level power optimization problem: assigning an optimal supply voltage (VDD) to each job under a time constraint. The objective function is the total energy consumed in processing all jobs. The proposed voltage assignment method is based on the following two concepts.

• If jobs are processed on a single-processor system under a time constraint, assign a lower supply voltage to the jobs whose time constraint is loose.

• If jobs are processed under a time constraint and the switched capacitances ((load capacitance) × (switching activity)) of the jobs differ from one another, assign a lower supply voltage to the heavier jobs and a higher voltage to the lighter jobs.

We define the VDD assignment problem and formulate it as an integer linear programming (ILP) problem. An optimal assignment can be obtained by solving the ILP problem. Experimental results show that the proposed method reduces total energy consumption dramatically without a noticeable reduction in performance.

I. Introduction

With the recent spread of portable, battery-powered devices such as digital cellular telephones and personal digital assistants, minimizing the power consumption of CMOS VLSI circuits is becoming an increasingly important issue. These wireless communication and imaging systems demand high-speed computation and complex functionality at low power consumption. Many power optimization techniques have recently been proposed at various levels of abstraction, including the circuit, logic, architectural and system levels. Among system-level techniques, the choice of an optimal supply voltage (VDD) has a strong effect on power reduction.

Recent research has proposed the multiple supply voltage scheduling problem [4, 8]. This scheduling problem refers to the assignment of a supply voltage level to each operation in a data flow graph so as to minimize the average energy consumption for given computation time or throughput constraints or both. Because these techniques assign a supply voltage statically to each functional module, they become ineffective when the time constraints differ from application to application.

In the past few years, power optimization by dynamic voltage scaling has also been studied [1, 2, 5, 9]. Intel and Microsoft have proposed a specification called Advanced Power Management (APM) [5]. This specification defines an interface between power management software that resides in the BIOS and a hardware-independent power management driver in the operating system. The driver can manage APM-aware applications by notifying them of processor state changes, and it provides an API that allows applications to employ power management directly. Furthermore, Microsoft proposed a power management concept called "OnNow" [1]. These approaches show great promise for further reduction in energy consumption, but they do not establish a strategy for optimizing the VDD assigned to each application, so they cannot be said to find the optimal VDD assignment for each application.

In addition to ON/OFF control of VDD, fine-grained selection of VDD from several voltage levels is important for energy reduction, because both the performance requirement and the required energy differ considerably from application to application. Most architects, however, fix the specification of a microprocessor so that it satisfies the maximum performance required by any application program. This design strategy needlessly increases the energy consumed by application programs that do not need such high performance. If the VDD of the microprocessor can be scaled dynamically, the total energy consumed in processing a variety of application programs can be reduced dramatically.

Several power reduction techniques that scale the VDD of a processor just far enough to meet the performance required by the application program have been proposed. L. Nielsen et al. present a self-timed system in [9]. The self-timed system achieves maximum savings by lowering the supply voltage until the chip just meets the specified performance requirement. Their approach scales VDD according to the consistency of the input data to the processing circuit, but the consistency of the input data is not an optimal measure of the required performance. Thomas D. Burd et al. proposed metrics for energy efficiency and power optimization techniques [2]. Like our approach, theirs finds the VDD that fits the desired performance, but their problem definition is quite different from ours.
In this paper, we address system-level power optimization as follows: given jobs with known switched capacitance and number of execution cycles, find the VDD assignment that minimizes the total energy consumption of all jobs under a time constraint.

This paper is organized as follows. Section II states the assumptions behind the problem definition. Section III defines and formulates the VDD assignment optimization problem. Experimental results are shown in Section IV. Section V discusses extensions of the proposed problem, and Section VI concludes the paper.

II. Assumptions

A. Preliminary

The dominant source of power dissipation in a digital CMOS circuit is the dynamic power dissipation,

$$P_{dynamic} = \sum_{k=1}^{n} C_L(k) \cdot Swit(k) \cdot V_{DD}^2 \qquad (1)$$

where CL(k) is the load capacitance of gate gk, Swit(k) is the switching count of gk per second, VDD is the supply voltage and n is the number of gates in the circuit. Because power depends quadratically on VDD, reducing VDD is the most effective means of power reduction. However, reducing the supply voltage increases circuit delay. For a given process technology, the circuit delay can be estimated using Formula (2) [3],



$$\tau \propto \frac{V_{DD}}{(V_G - V_{TH})^2} \qquad (2)$$

where τ is the propagation delay of a CMOS transistor, VTH is the threshold voltage, and VG is the input gate voltage. Formulae (1) and (2) indicate that there is a power-delay trade-off when VDD is variable. Because of this trade-off in CMOS devices, obtaining high performance with low power consumption is very difficult. For portable electronics that have to process various application programs at low power, finding the optimal VDD assignment for each job has a strong impact on energy reduction and is an interesting problem.

B. Assumptions for the VDD Assignment

The voltage assignment method proposed in this paper targets systems which satisfy the following assumptions:



• Jobs are processed on a single-processor system under a time constraint. A job is a fragment of an application program. A single time constraint is specified for the whole set of given jobs, and all of the given jobs must finish processing by that time.



Next, we state the assumptions on the microprocessor embedded in the target system. A microprocessor with the following properties is needed in order to assign an optimal VDD to each job.





• The VDD and the clock frequency of the microprocessor [6] can be varied by an instruction. A mechanism for changing VDD through a processor instruction is necessary in order to vary VDD dynamically. VDD scaling can be realized simply with one instruction that changes VDD and one register that holds the current VDD state. In this paper we call this register and this instruction the Power control register and the Power control instruction, respectively. Compiler support that inserts the Power control instruction into the compiled assembly code is, of course, also necessary.

• The energy consumption of the power converter itself, such as a DC-DC converter, is negligible. The proposed technique relies on a controller such as a DC-DC converter to vary VDD, and the power consumed by this controller must be negligible compared with that of the microprocessor. Current research in low-power portable electronics includes the design of low-voltage on-chip DC-DC converters, and efficiencies above 90% have been reported [12]. Still higher efficiencies are needed, however, to lower the power consumption of the microprocessor further.

• The number of execution cycles of each job can be estimated statically. Since the proposed method targets applications whose execution cycle count per unit time can be predicted statically, real-time processing is one of its applications. In a real-time system, the number of execution cycles that must be completed before the specified time constraint is determined in advance.

• The clock frequency tracks the variation of VDD synchronously. We need a variable clock scheme whose clock frequency follows VDD. A number of researchers have proposed using a delay ring oscillator [10] matched to the critical path of a circuit, with the frequency of the ring serving as an indicator of the chip's performance. The chip and the oscillator are built on the same die so that they track each other closely over process, temperature and voltage.

• The required performance (execution cycle count per second) and the switched capacitance ((load capacitance) × (switching activity)) generally differ from job to job. The variance of the execution cycles and of the switched capacitance is strongly related to the achievable power reduction: the larger these variances, the larger the energy reduction obtained by the proposed technique appears to be.

• The overhead time needed to change VDD and the clock frequency is negligible compared with the execution time of the jobs. A worst-case latency of 10-100 µsec to restart a PLL (phase-locked loop) has been reported [11], but this figure is for an analog PLL. A digital PLL has been implemented with a reported lock time of under 2 µsec, drastically reducing the start-up time from the fully powered-down mode [7]. For the proposed technique, the faster VDD can be changed, the better.

C. Our Approach

In this paper we address the following two basic concepts for reducing the power consumption of given jobs under a time constraint.

• If the given jobs have different performance requirements, assign a lower supply voltage to the jobs that do not require high performance.

• If the given jobs have different switched capacitances ((load capacitance) × (switching activity)), assign a lower supply voltage to the heavier jobs and a higher supply voltage to the lighter jobs.

We illustrate the optimal VDD assignment with an example in this subsection. First, we show how power is reduced by assigning a lower voltage to the jobs that do not need high performance. When Job-1, Job-2 and Job-3 are given as shown in Table I, the total energy consumed in executing all jobs is reduced by about 40% by assigning a suitable VDD to each job, compared with execution at a fixed VDD of 5.0V. For simplicity, we make the following assumptions about the target system.

 

• The energy consumed by the DC-DC converter and the overhead time needed to change VDD are negligible.

• One of the following two execution modes can be chosen. The modes are chosen in accordance with Formula (2), with VTH assumed to be 0.7[V] in this example.
  - The supply voltage and the clock frequency are 1.8V and 25MHz, respectively.
  - The supply voltage and the clock frequency are 5.0V and 100MHz, respectively.

• The time constraint is 50 [sec].

• The three jobs shown in Table I are processed in the target system (Cyclej and Cj represent the total number of execution cycles and the switched capacitance of the j-th job, respectively).

TABLE I Assumption for an example

Jobs     Cyclej [cycle]    Cj [nF]
Job-1    1 × 10^9          2.0
Job-2    10 × 10^9         0.8
Job-3    15 × 10^9         0.4

The total energy consumed in executing all the given jobs is as follows.

• When VDD is fixed at 5.0V, the total energy consumption is 400[J] (case-1 in Figure 1).

• When the processor can vary VDD dynamically and assigns VDD so as to satisfy the time constraint, the total energy consumption is 330.368[J] (case-2 in Figure 1).

• When the processor can vary VDD dynamically and assigns VDD taking the variance of switched capacitance into account, the total energy consumption is 234.624[J] (case-3 in Figure 1).
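The three energy figures can be checked by summing Cj × (cycles) × VDD^2 over the schedule segments. The sketch below is only an arithmetic check, not part of the proposed method. It assumes that cycle counts are expressed in units of 10^9 cycles and Cj in nF, so that the product comes out in joules. The per-case cycle splits are read from the segment labels of Figure 1, and the names `C_NF`, `total_energy` and `case_1` to `case_3` are illustrative.

```python
# Energy check for the Table I example: E = sum of C_j * cycles * VDD^2.
# Assumption: cycles are given in units of 10^9 and C_j in nF, so that
# (1e-9 F) * (1e9 cycles) * V^2 yields joules directly.

C_NF = {"Job-1": 2.0, "Job-2": 0.8, "Job-3": 0.4}  # switched capacitance [nF]


def total_energy(schedule):
    """schedule: list of (job, giga_cycles, vdd) segments; returns energy in J."""
    return sum(C_NF[job] * gcycles * vdd ** 2 for job, gcycles, vdd in schedule)


# Segment splits as read from Figure 1 (an interpretation of the figure labels).
case_1 = [("Job-1", 1, 5.0), ("Job-2", 10, 5.0), ("Job-3", 15, 5.0)]
case_2 = [("Job-1", 1, 5.0), ("Job-2", 10, 5.0),
          ("Job-3", 7, 5.0), ("Job-3", 8, 1.8)]
case_3 = [("Job-1", 1, 1.8), ("Job-2", 7, 1.8),
          ("Job-2", 3, 5.0), ("Job-3", 15, 5.0)]

energies = {
    "case-1": total_energy(case_1),
    "case-2": total_energy(case_2),
    "case-3": total_energy(case_3),
}
for name, joules in energies.items():
    print(f"{name}: {joules:.3f} J")
```

Under these assumptions the sums agree with the stated 400, 330.368 and 234.624 J. Case-2 and case-3 run the same number of cycles (8 × 10^9) at 1.8V, so the additional saving in case-3 comes only from moving the low-voltage cycles to the jobs with larger Cj.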

[Figure 1: execution schedules for Case-1, Case-2 and Case-3. Vertical axis: supply voltage (5.0V, 1.8V); horizontal axis: Time [sec], with the time constraint marked at 50 sec. Segments are labelled with their cycle counts and clock frequency (100MHz at 5.0V, 25MHz at 1.8V).]

Fig. 1. VDD optimization under time constraints

The energy reduction from case-1 to case-2 illustrates that the total processing energy can be reduced substantially by assigning a lower voltage to the job with the larger allowable latency. If the performance of the processor is higher than the performance required by the application programs, energy consumption can be reduced by lowering the voltage. The energy reduction from case-2 to case-3 shows a typical case in which processing energy is reduced, without reducing performance, by assigning a lower voltage to the heavier jobs and a higher voltage to the lighter jobs. If the switched capacitance varies among jobs, total energy consumption is reduced by assigning the optimal VDD to each job without any performance reduction. When VDD cannot be lowered very far because of limitations of the process technology, taking the variety of switched capacitance into account appears to have an even bigger effect on energy reduction.

In the next section, we present the voltage assignment problem and formulate it as an integer linear programming (ILP) problem. The proposed problem considers both the variety of required performance (the number of execution cycles per second) and the variety of switched capacitance of the given jobs.

III. VDD Assignment Problem

In this section, we define the VDD assignment optimization problem under a time constraint. The problem definition follows the assumptions stated in the previous section. For simplicity, we assume a single time constraint, and all given jobs must finish processing by this time. Our goal is to minimize the total energy consumption of the given jobs under the time constraint. First, we formulate the optimal VDD assignment problem. The formulation uses the following notation.

• N: The number of jobs.

• L: The number of variable voltage levels of the target processor.

• Cyclej: The number of execution cycles of the j-th job.

• Cj: The average switched capacitance of the j-th job ((switching activity) × (load capacitance)). [F]

• T: The time constraint by which all given jobs must be finished. [sec]

• vi: The i-th voltage level (1 ≤ i ≤ L).

• fi: The clock frequency when the supply voltage is vi. [Hz]

$$f_i \propto \frac{(v_i - V_{TH})^2}{v_i}$$

with a constant factor of proportionality.

• xij: The number of execution cycles of the j-th job executed while VDD is vi.
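As a worked instance of the frequency relation, the sketch below evaluates fi for the voltage levels 5.0, 4.2 and 3.5 V with VTH = 0.7 V. The proportionality constant is fixed by assuming a clock of 100 MHz at 5.0 V. That reference point is an assumption of this sketch, as are the names `clock_frequency`, `V_TH` and `F_REF`.

```python
V_TH = 0.7      # threshold voltage [V], the value used in the examples
F_REF = 100e6   # ASSUMPTION: reference clock of 100 MHz at 5.0 V


def clock_frequency(v, v_ref=5.0, f_ref=F_REF):
    """Clock frequency scaled as (v - V_TH)^2 / v, normalised to f_ref at v_ref."""
    def shape(u):
        return (u - V_TH) ** 2 / u
    return f_ref * shape(v) / shape(v_ref)


# Frequencies in MHz for three candidate voltage levels.
freqs_mhz = {v: clock_frequency(v) / 1e6 for v in (5.0, 4.2, 3.5)}
for v, f in freqs_mhz.items():
    print(f"{v:.1f} V -> {f:.1f} MHz")
```

With this normalisation the clock drops to roughly 79 MHz at 4.2 V and roughly 61 MHz at 3.5 V, while the energy per cycle scales with the square of the voltage.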

The energy consumption while all jobs are executed, E , is expressed by Formula(3) under the constraints of Formula(4).

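$$E = \sum_{j=1}^{N} \sum_{i=1}^{L} C_j \cdot x_{ij} \cdot v_i^2 \qquad (3)$$

$$\sum_{i=1}^{L} x_{ij} = Cycle_j, \qquad \sum_{j=1}^{N} \sum_{i=1}^{L} \frac{x_{ij}}{f_i} \le T, \qquad 0 \le x_{ij} \le Cycle_j \qquad (4)$$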
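For a very small instance, Formulae (3) and (4) can be evaluated by exhaustive enumeration instead of an ILP solver. The following sketch does this for the Table I example with L = 2, with cycle counts in units of 10^9 and frequencies in units of 10^9 cycles per second. Taken literally, 26 × 10^9 cycles cannot complete within 50 sec at 100 MHz, so the sketch assumes a time budget of 500 sec, which matches the cycle split drawn for case-3 in Figure 1. This rescaling is an assumption of the sketch, and `solve_by_enumeration` is a hypothetical helper name.

```python
from itertools import product


def solve_by_enumeration(cycles, caps, volts, freqs, t_max):
    """Exhaustive search for L = 2 voltage levels.

    Minimises E = sum_j sum_i C_j * x_ij * v_i^2          (Formula (3))
    subject to sum_i x_ij = Cycle_j and
               sum_j sum_i x_ij / f_i <= T                (Formula (4)).
    x_j = (cycles at volts[0], cycles at volts[1]) for job j.
    """
    best_energy, best_x = None, None
    for low in product(*(range(c + 1) for c in cycles)):
        x = [(lo, c - lo) for lo, c in zip(low, cycles)]
        time = sum(xj[i] / freqs[i] for xj in x for i in range(2))
        if time > t_max + 1e-9:      # small tolerance for float rounding
            continue
        energy = sum(cj * xj[i] * volts[i] ** 2
                     for cj, xj in zip(caps, x) for i in range(2))
        if best_energy is None or energy < best_energy:
            best_energy, best_x = energy, x
    return best_energy, best_x


best_energy, best_x = solve_by_enumeration(
    cycles=[1, 10, 15],        # units of 10^9 cycles (Table I)
    caps=[2.0, 0.8, 0.4],      # nF
    volts=(1.8, 5.0),          # V
    freqs=(0.025, 0.1),        # 25 MHz and 100 MHz in 10^9 cycles per second
    t_max=500.0,               # ASSUMPTION: see the note on the time budget
)
print(round(best_energy, 3), best_x)
```

The search visits only 2 × 11 × 16 candidate assignments here. An ILP solver is the appropriate tool once N, L or the cycle granularity grows.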

The optimal VDD assignment problem is formally defined as follows.

"For given N, L, Cj, Cyclej, T, vi and fi, find the xij that minimize E while satisfying the time constraint."

This problem can clearly be formulated as an ILP problem. The objective function of the ILP problem is Formula (3) and the constraints are Formula (4). Both the objective function and the constraints are linear functions of xij. An optimal VDD assignment can be obtained by solving the ILP problem. The computation time needed to solve the proposed problem is that of an ILP problem with N × L variables.

IV. Experimental Results

To evaluate the VDD assignment technique, we use the three-job sets shown in Tables II and III. With three jobs given and processed within the time constraint, we plot the total energy consumption against the time constraint in Figures 2 and 3. In the experiments in this section, we assume for convenience that a different voltage can be assigned every 1.0 × 10^8 cycles. This assumption slightly reduces the accuracy of the optimal solution, but the loss is negligible compared with the energy reduction achieved by the proposed technique.

First, we show the experimental result for the jobs in Table II. In the table, the numbers in the left-hand parentheses are the numbers of execution cycles of job1, job2 and job3, respectively, and those in the right-hand parentheses are the switched capacitances of job1, job2 and job3, respectively. The number of variable VDD levels, L, is two (5.0[V] and 3.5[V]), and all jobs have the same number of execution cycles in this example. The clock frequency follows Formula (2), with VTH assumed to be 0.7[V]. If the processing load (switched capacitance) can be biased intentionally, task scheduling that reduces energy consumption becomes possible. Although the switched capacitances differ from case to case, the sum of the switched capacitances of the jobs is constant in this example. Experimental results are shown in Figure 2.

TABLE II Specification of given jobs for case 1-4

cases     Cycles (j1, j2, j3)     Cj (j1, j2, j3) [nF]
case-1    (50, 50, 50) × 10^8     (10, 10, 10)
case-2    (50, 50, 50) × 10^8     (8, 10, 12)
case-3    (50, 50, 50) × 10^8     (4, 10, 16)
case-4    (50, 50, 50) × 10^8     (2, 4, 24)

[Figure 2: total energy consumption [J] (vertical axis) versus time constraint [sec] (horizontal axis), with one curve each for case1, case2, case3 and case4.]

Fig. 2. Relation between variety of switched capacitance and energy consumption

The result in Figure 2 shows that when the switched capacitance ((load capacitance) × (switching activity)) differs among jobs, assigning a lower supply voltage to the heavier jobs and a higher supply voltage to the lighter jobs reduces power while satisfying the time constraint. Comparing case-1 with case-4, an energy reduction of about 30% is obtained even at the same time constraint of 190[sec]. Energy consumption naturally decreases as the time constraint is relaxed, because VDD can then be lowered further within the time constraint.

Next, we show the experimental result for the jobs in Table III. This experiment clarifies the impact of the number of selectable VDD levels on energy reduction. We prepare two sets of variable VDD: one set has two levels, 5.0[V] and 3.5[V], and the other has three levels, 5.0[V], 4.2[V] and 3.5[V]. The execution cycle counts of the jobs are the same in every case. Although the combination of switched capacitances differs from case to case, the total switched capacitance is constant. Experimental results are shown in Figure 3, where L denotes the number of variable voltage levels. For example, if L = 3, the processor can choose the supply voltage from 5.0[V], 4.2[V] and 3.5[V].

TABLE III Specification of given jobs for case a-c

cases     Cycles (j1, j2, j3)     Cj (j1, j2, j3) [nF]
case-a    (10, 10, 10) × 10^8     (10, 10, 10)
case-b    (10, 10, 10) × 10^8     (5, 10, 15)
case-c    (10, 10, 10) × 10^8     (2, 4, 24)

The result in Figure 3 shows that increasing the number of variable voltage levels decreases energy consumption in every case. Comparing (L=2: case-a) with (L=3: case-a) in Figure 3, energy consumption is reduced by 10% when the time constraint is 38[sec].

[Figure 3: total energy consumption [J] (vertical axis) versus time constraint [sec] (horizontal axis), with curves for case-a, case-b and case-c at L=2 and at L=3.]

Fig. 3. Effect of power reduction when variable VDD level is 3

The experimental results above are summarized as follows.

1. If the VDD of the processor is selectable from 5.0[V] and 3.5[V], energy consumption is at most halved when the time constraint is long enough. Relaxing the time constraint has a strong impact on energy reduction if the microprocessor is equipped with a variable VDD and clock scheme. The processing energy of application programs that need only low performance can be halved.

2. Even if the time constraint is fixed, assigning a lower VDD to the jobs with larger switched capacitance reduces total energy consumption by up to 30%. For application programs whose load varies widely from operation to operation, optimal VDD assignment reduces energy consumption.

3. Increasing the number of variable VDD levels has only a weak effect on energy reduction. If the number of levels is increased from 2 to 3, total energy consumption is reduced by 10% at most. L = 2 can therefore be regarded as an appropriate assumption in view of the computation time needed to solve the ILP problem, because the number of ILP variables is N × L, where N and L are the number of jobs and the number of variable VDD levels, respectively.
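The percentages in items 2 and 3 can be roughly cross-checked with a small greedy heuristic on chunks of 1.0 × 10^8 cycles. The sketch below is not the ILP solution used for the figures. It starts every chunk at the highest voltage and repeatedly lowers one chunk by one level, choosing the move with the largest energy saving per second of added run time that still fits the time constraint. The clock of 100 MHz at 5.0 V is an assumption that the text does not state, and the names `freq` and `greedy_energy` are illustrative. Only energy ratios are reported.

```python
V_TH = 0.7        # threshold voltage [V], as in the experiments
F_AT_5V = 100e6   # ASSUMPTION: 100 MHz at 5.0 V (not stated in the text)
CHUNK = 1e8       # the voltage may change every 1.0e8 cycles


def freq(v):
    """Clock frequency scaled as (v - V_TH)^2 / v, normalised at 5.0 V."""
    def shape(u):
        return (u - V_TH) ** 2 / u
    return F_AT_5V * shape(v) / shape(5.0)


def greedy_energy(chunks, caps_nf, volts, t_max):
    """Greedy heuristic; returns total energy in J for the chosen assignment."""
    volts = sorted(volts, reverse=True)
    t_chunk = [CHUNK / freq(v) for v in volts]               # seconds per chunk
    count = [[n] + [0] * (len(volts) - 1) for n in chunks]   # count[j][i]
    slack = t_max - sum(chunks) * t_chunk[0]
    if slack < 0:
        raise ValueError("time constraint cannot be met at the highest voltage")
    while True:
        best = None
        for j, c in enumerate(caps_nf):
            for i in range(len(volts) - 1):
                dt = t_chunk[i + 1] - t_chunk[i]
                if count[j][i] == 0 or dt > slack:
                    continue
                gain = c * (volts[i] ** 2 - volts[i + 1] ** 2) / dt
                if best is None or gain > best[0]:
                    best = (gain, j, i, dt)
        if best is None:
            break
        _, j, i, dt = best
        count[j][i] -= 1
        count[j][i + 1] += 1
        slack -= dt
    # C [nF] * cycles * V^2 gives joules after the 1e-9 factor.
    return sum(c * 1e-9 * CHUNK * n * v ** 2
               for c, row in zip(caps_nf, count) for n, v in zip(row, volts))


# Table II at T = 190 sec, L = 2: uniform load (case-1) vs. skewed load (case-4).
e1 = greedy_energy([50, 50, 50], [10, 10, 10], [5.0, 3.5], 190)
e4 = greedy_energy([50, 50, 50], [2, 4, 24], [5.0, 3.5], 190)
saving_skew = 1 - e4 / e1

# Table III case-a at T = 38 sec: two voltage levels vs. three.
e_l2 = greedy_energy([10, 10, 10], [10, 10, 10], [5.0, 3.5], 38)
e_l3 = greedy_energy([10, 10, 10], [10, 10, 10], [5.0, 4.2, 3.5], 38)
saving_levels = 1 - e_l3 / e_l2

print(f"case-1 -> case-4: {saving_skew:.1%}, L=2 -> L=3: {saving_levels:.1%}")
```

Under these assumptions the heuristic gives a saving of about 27% from case-1 to case-4 and about 10% from L = 2 to L = 3, in line with the reported figures. For L of 3 or more the heuristic is not guaranteed to be optimal.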

V. Discussion

If the number of variable VDD levels L is 2, the proposed problem is solved easily by a greedy algorithm: a naive algorithm that assigns the higher voltage first to the job with the lowest switched capacitance finds the optimal solution. For large L and N, however, algorithms that reduce computation time are required.

For large N, the number of given jobs, the computation time needed to find the best solution can be very large. If N is too large for the ILP problem to be solved within a practical time, we can instead solve an LP problem and find a quasi-optimal solution. If we formulate the proposed problem as an LP problem, the LP solution must be rounded to integers, but as the cycle counts of the jobs grow, the rounding error becomes negligible compared with the energy reduction. For large L, searching the whole search space can be inefficient in terms of computation time. Considering practical power converters such as DC-DC converters, however, L will not exceed 5 or thereabouts. Since L and N are small enough in practical uses of the proposed method, most VDD assignment optimization problems should be solvable in practical time.

Incorporating a more precise model is therefore more important than developing speed-up techniques. We must pay attention to the following extensions of the proposed model.

• Incorporating the overhead time and the energy of the DC-DC converter into the model. This extension is easily achieved. Simply reflecting the effect of the overhead time and the energy consumption of the DC-DC converter in Formulae (3) and (4) improves the accuracy of the optimization. Actual values will be measured from an actual DC-DC converter design.

• Assuming multiple time constraints. A single time constraint is impractical for actual applications. The order of execution of the jobs and the start and end times of each job must be taken into account. We may have to consider the scheduling of jobs simultaneously with the VDD assignment.

• Establishing a technique to estimate the execution cycles of application programs. Estimating the execution cycles of application programs statically in the compiler is our future work.

VI. Conclusion

In this paper, we proposed and formally defined a system-level power optimization problem, the assignment of an optimal VDD to each job under a time constraint, and formulated it as an integer linear programming (ILP) problem. To evaluate the power reduction achieved by the proposed method, we used several example job sets and estimated the energy consumption under various time constraints. The experimental results show the following.

1. If the VDD of the processor is selectable from 5.0[V] and 3.5[V], energy consumption is at most halved when the time constraint is loosened sufficiently. Loosening the time constraint has a strong effect on energy reduction if the microprocessor is equipped with a variable VDD and clock scheme.

2. Even if the time constraint is fixed, assigning a lower VDD to the heavier jobs and a higher VDD to the lighter jobs reduces total energy consumption by up to 30%. For application programs whose load varies widely from operation to operation, the proposed method reduces energy consumption.

3. Increasing the number of variable VDD levels has only a weak impact on energy reduction. If the number of levels is increased from L=2 to L=3, total energy consumption is reduced by 10% at most. A variable voltage level of L=2 or 3 can therefore be regarded as appropriate.

Our future work will be devoted to incorporating the overhead time and the energy consumption of the DC-DC converter into the proposed optimization problem and to creating an actual compiler that estimates the execution cycles of application programs statically.

References

[1] http://www.microsoft.com/hwdev/onnow.html.
[2] Thomas D. Burd and Robert W. Brodersen, "Processor design for portable systems," Journal of International VLSI Signal Processing, 1996.
[3] A. P. Chandrakasan and R. W. Brodersen, LOW POWER DIGITAL CMOS DESIGN, Kluwer Academic Publishers, 1995.
[4] Jui-Ming Chang and Massoud Pedram, "Energy Minimization Using Multiple Supply Voltage," In Proc. of ISLPED'96, pp.157-162, 1996.
[5] Intel Corporation and Microsoft Corporation, Advanced Power Management (APM) BIOS Interface Specification Revision 1.2, February 1996, http://www.intel.com/IAL/powermgm/apmv12.pdf.
[6] T. Ishihara and H. Yasuura, "Power-Pro: Programmable Power Management Architecture (in Japanese)," Technical Report VLD96-72, IEICE, December 1996.
[7] J. Lundberg, et al., "A 15-150 MHz All-Digital Phase-Lock Loop with 50-Cycle Lock Time for High-Performance Low-Power Microprocessors," In Proc. of Symposium on VLSI Circuits, June 1994.
[8] Mark C. Johnson and Kaushik Roy, "Optimal Selection of Supply Voltages and Level Conversions During Data Path Scheduling Under Resource Constraints," In Proc. of 1996 ICCD, pp.72-77, 1996.
[9] Lars S. Nielsen, Cees Niessen, Jens Sparsø, and Kees van Berkel, "Low-Power Operation Using Self-Timed Circuits and Adaptive Scaling of the Supply Voltage," IEEE Trans. on VLSI Systems, vol.2, no.4, pp.391-397, December 1994.
[10] M. Degrauwe, P. Machen, M. Van Paemel, and M. Oguey, "A Voltage Reduction Technique for Digital Systems," In Proc. of IEEE ISSCC, pp.238-239, 1990.
[11] S. Gray, et al., "The PowerPC 603 Microprocessor: A Low-Power Design for Portable Applications," In Proc. of the 39th IEEE Computer Society International Conference, March 1994.
[12] Gu-Yeon Wei and Mark Horowitz, "A Low Power Switching Power Supply for Self-Clocked Systems," In Proc. of ISLPED'96, pp.313-317, 1996.