Power Estimation from Hierarchical Netlists

C.P. Ravikumar*
Department of EE-Systems, University of Southern California, Los Angeles, CA 90089-2562

Mukul R. Prasad†
Department of EECS, University of California, Berkeley, CA 94720

Abstract

We describe a power estimation tool that works at the register-transfer level of abstraction. Such a tool is useful when a transformation-based approach is employed to minimize the energy and/or power dissipation in a digital system. Accurate power estimation techniques available in the literature for logic circuits described at the gate level prove too expensive for data paths in terms of both memory and execution time. Flattening a hierarchical netlist to the gate level is by itself an expensive procedure. Our prediction tool uses a preprocessor called PEG (Power Expression Generator) on each module type used in the circuit to compile (a) signal probability expressions for all nodes and (b) a power expression for the entire module in terms of input signal probabilities. A tool called MOPE uses the signal probabilities of the primary inputs and the operational frequencies of the modules, along with the expressions generated by PEG, to estimate the total power. MOPE can handle both sequential and combinational circuits. We describe experimental results that bring out the effectiveness of MOPE and PEG. Although signal-probability-based estimation was employed in our implementation, more accurate techniques such as those based on transition density can also be employed.

* On leave from the Electrical Engineering Department, Indian Institute of Technology, New Delhi, India.
† This author was a student in the Electrical Engineering Department at Indian Institute of Technology, Delhi, when this work was carried out.

1 Introduction

High-level synthesis tools which use structural transformations to optimize aspects such as performance and testability have been reported in the literature [13]. Since order-of-magnitude reductions in power can result from an architectural redesign of a circuit [1], a structural-transformation-based approach holds great promise for low-power design. In this paper, we describe an efficient estimation tool that can be employed in high-level synthesis for power optimization.

Although a number of power estimators have been reported in the literature [1, 3, 4, 5, 7, 9, 12], all except [1], [5], and [12] require the circuit to be specified at the gate level. Flattening an RTL netlist into a gate-level netlist is by itself a cumbersome task, and gate-level power estimators impose excessive run-time and memory requirements. The estimators reported in [1] and [12] provide only approximate estimates of power dissipation; for example, the power-factor approximation technique of [12] can give an error of up to 80% in comparison with gate-level calculations. Existing architectural-level power estimators do not take schedule information into consideration, leading to inaccuracies in prediction. Although [5] gives more accurate estimates than [1] and [12], its main disadvantage is that it does not treat all modules in a uniform way; depending on the functionality of the module, a different model must be used to estimate the switching activity in the module.

In this paper, we describe a hierarchical power estimation tool which attempts to alleviate the above problems. A preprocessing tool called PEG (Power Expression Generator) is used on each module type to generate (a) signal probability expressions for each output line of the module and (b) an expression for power dissipation in the module type. These expressions are in terms of the signal probabilities of the primary inputs to the module, the capacitances at the nodes of the module, and the operational frequency of the module. We report a power estimator (MOPE) which uses the power and signal probability expressions of the modules in conjunction with the operational frequencies of the modules to estimate the total power dissipation in the circuit. Thus, unlike existing estimators, MOPE takes into consideration the schedule and pipeline latency information to provide a more accurate power estimate.

The next section is a brief survey of the previous work on power estimation. Section 3 describes PEG, the power expression generator. Section 4 describes the working principles of our module-level power estimator MOPE. Experimental results are discussed in Section 5 and conclusions are presented in Section 6.

2 Literature Survey

2.1 Gate-level Estimators

The problem of determining when and how often state transitions occur at a node in a circuit is difficult because the transitions depend on the input vectors and the sequence in which the vectors are applied. Hence probabilistic techniques and stochastic modeling have been used in estimating the switching activity. A summary of gate-level power estimators is given in Table 1. The third column states whether the technique is applicable to combinational and/or sequential circuits. The fourth and fifth columns state whether the technique considers the "glitch power" due to finite gate delays and the effect of signal correlations in estimating power dissipation. The common problem underlying all gate-level power estimation techniques is their large computational complexity. They do not provide a system designer with the facility for a quick and reasonably accurate estimate of the power dissipation of a circuit built using register-level components.

| Ref. | Technique | Type | Glitch Power? | Correlations? |
|------|-----------|------|---------------|---------------|
| [9] | Transition Density | Comb. Only | No | No |
| [3] | Transition Probability | Both | Yes | Yes |
| [14] | Signal Probability | Both | No | Yes |
| [7] | lag-1 Markov chain | Comb. Only | No | Yes |

Table 1: Summary of Recent Gate-level Power Estimators

2.2 Architectural-level Estimators

In the power factor approximation technique proposed by Powell [12], power estimates for modules are parametrized by their bit-widths. Proportionality constants are extracted through physical measurements or simulations using independent "white noise" inputs. Consequently, the technique does not address the dependency of power consumption on the statistics of the input data.

Rabaey et al. [5] estimate the energy dissipated by a module-level circuit as proportional to a capacitance $C_{total}$, which is the sum of the capacitances due to the execution units (EXUs), the registers, the control unit, and the interconnect. The capacitance contributed by an EXU is the product of the number of times the operation was performed per sample period and the average capacitance of the unit type. Techniques have also been presented for computing the other components of $C_{total}$. Once again, the capacitance estimates are obtained assuming a uniformly distributed set of inputs.

In their later work, Rabaey et al. [5] used stochastic modeling of input word statistics to account for the data dependency of power consumption. A piecewise linear model is derived to account for the bit probabilities of common input signals such as speech, music, and image. Such signals are assumed to follow a Gaussian distribution in order to develop a model relating word-level parameters to bit-level probabilities. The word-level parameters are propagated by using standard statistical methods. Once the word-level parameters are known for all busses, the bit probabilities are calculated and a piecewise linear model is obtained for the output of each module. Energy is then calculated on a region-by-region basis. For any region, the energy consumed by one bit, $E_{bit}$, is obtained by simulation. The energy consumed by a region is the product of $E_{bit}$ and the number of bits in the region. The use of simulation and the fact that the calculation of the energy consumed by different modules requires separate treatment are the major drawbacks of this method. The assumption of a Gaussian distribution is also difficult to justify.

Mehra and Rabaey [8] present a power estimation technique which takes as input the control data flow graph (CDFG) description of the algorithm being synthesized. Since the estimator works without knowledge of the target architecture (schedule, allocation, and binding information), the estimates obtained are coarse. The authors reported an accuracy of within 20% of architectural-level estimates in the average case, and 120% in the worst case. Najm [10] presented a method to characterize power dissipation in a combinational logic module specified at the behavioral level of abstraction through Boolean equations. Entropy measured at the inputs and outputs of the module is used to estimate power dissipation. The intention of the author is to use this technique for an RTL circuit. The limitations of the work are that sequential logic modules are not treated and that, due to the approximations made in the estimation, the results reported for the combinational logic modules are not accurate. Marculescu et al. [6] proposed entropy and informational energy as measures for estimating the average switching activity in a module. Their approach is suitable when explicit knowledge of the internal structure of the module is not available. It is not clear whether the method is applicable to circuits with sequential logic blocks and/or global feedback loops.

Unlike the approaches mentioned above, which work at a behavioral level of abstraction, our technique is useful when the structure of the RTL circuit is completely available. The techniques mentioned above are useful when a structure is being evolved from a behavioral level of abstraction, or when transformations at the behavioral level are being evaluated with the aim of minimizing power. After arriving at a structure, structural-level transformations such as altering the module selection, swapping the inputs of a module that performs a commutative binary operation, and splitting or merging of modules can bring further reductions in power dissipation. Our technique can be used to quickly evaluate the change in power dissipation after a structural transformation. It can also be used to characterize the power dissipation of an RTL design starting from its structural description.

3 Power Expressions

The average power $P_i$ consumed by a CMOS gate whose output is $i$ is given by

$$P_i = \frac{1}{2}\, C_{output}\, V_{dd}^{2}\, N_i \qquad (1)$$

where $C_{output}$ is the output capacitance of the gate, $V_{dd}$ is the supply voltage, and $N_i$ is the expected number of transitions per second at output $i$. The signal probability $p_i$ at a circuit node $i$ is the probability that the signal at $i$ takes on the logic value 1 in a given clock cycle. The transition probability $t_i$ at node $i$ is the probability of a logic transition taking place at $i$ in a given clock cycle. We rewrite $N_i$ in Equation 1 as $N_i = t_i \cdot f$, where $f$ is the clock frequency for the module. For uncorrelated successive input vectors, it can be shown [15] that $t_i = 2 \cdot p_i \cdot (1 - p_i)$. We model $C_{output}$ as a function of the fanout $K_i$ of gate $i$, i.e. $C_{output} = C_{unit} \cdot K_i$, where $C_{unit}$ is a proportionality constant. Thus

$$P_i = p_i (1 - p_i) \cdot f \cdot C_{unit} \cdot K_i \cdot V_{dd}^{2} = L \cdot K_i \cdot p_i (1 - p_i) \qquad (2)$$

where $L = C_{unit}\, V_{dd}^{2}\, f$. If there are $n$ nodes in a module, the power dissipation $P_{avg}$ for the module can be written as

$$P_{avg} = L \sum_{i=1}^{n} K_i \cdot p_i (1 - p_i) \qquad (3)$$
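As a quick numerical illustration of Equations 1 to 3, the following minimal sketch evaluates the module power for a list of nodes. The function names, the `(fanout, probability)` pair layout, and the numeric values of $C_{unit}$, $V_{dd}$, and $f$ are assumptions chosen for illustration only; they are not values used in our experiments.

```python
def transition_probability(p):
    """t = 2 p (1 - p), valid for uncorrelated successive input vectors."""
    return 2.0 * p * (1.0 - p)


def module_power(nodes, c_unit, vdd, f):
    """Equation 3: P_avg = L * sum_i K_i p_i (1 - p_i), with L = C_unit Vdd^2 f.

    nodes is a list of (fanout K_i, signal probability p_i) pairs.
    """
    L = c_unit * vdd ** 2 * f
    return L * sum(k * p * (1.0 - p) for k, p in nodes)


# Hypothetical three-node module; all numbers are illustrative assumptions.
nodes = [(2, 0.5), (1, 0.5), (1, 0.25)]
power = module_power(nodes, c_unit=1e-13, vdd=5.0, f=1e6)
print(power)  # L = 2.5e-6 and the sum is 0.9375, so about 2.34e-6 W
```

Nodes whose signal probability is close to 0 or 1 contribute little, since $p_i (1 - p_i)$ peaks at $p_i = 0.5$.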

The expression for $P_{avg}$ (excluding the proportionality constant $L$) is the power expression generated by PEG. The only parameter which needs to be calculated in the expression is the signal probability $p_i$; the fanout $K_i$ may be computed from the gate-level netlist of the module. For calculating $p_i$, we employ the method proposed in [11]. We associate a variable $p_i$ with the $i$th primary input of the module. Starting at the primary inputs of the module and proceeding in a breadth-first manner to the primary outputs, we compute the signal probability expression (SPE) for the output of each gate $g$ as a function of its input expressions, using the relations shown in Table 2. The table lists the probability relations only for two-input gates, but the extension to multiple-input gates is straightforward. We suppress all the exponents in an expression (that is, every power $p_i^k$ with $k > 1$ is replaced by $p_i$) to obtain the SPE for that signal. The algorithm for PEG is shown in Figure 1.

3.1 Representation of Expressions

Consider a module with $n$ primary inputs, with signal probabilities $p_1, p_2, \ldots, p_n$ respectively. The SPE for any node in the module is a sum-of-products expression of the form $\sum_j X_j$, where each product term $X_j$ is of the form $X_j = C_j \prod_{i=1}^{n} p_i^{\alpha_i}$, $\alpha_i \in \{0, 1\}$.

| Gate Type | Signal Probability Relation |
|-----------|-----------------------------|
| C = not(A) | $p_C = 1 - p_A$ |
| C = and(A,B) | $p_C = p_A \cdot p_B$ |
| C = or(A,B) | $p_C = p_A + p_B - p_A \cdot p_B$ |
| C = nand(A,B) | $p_C = 1 - p_A \cdot p_B$ |
| C = nor(A,B) | $p_C = 1 - p_A - p_B + p_A \cdot p_B$ |

Table 2: Signal Probability Expressions for Basic Gates.

    procedure peg (C, E, P)
    // C is the netlist for a combinational logic module.
    // E is the power expression for the module.
    // P is the set of SPEs for the outputs of the module.
    begin
        Let N be the set of all signals in C;
        Order the signals in N so that if N_i depends on N_j then j < i;
        For a signal i, let F_i be the set of all signal lines which influence the value on line i;
        for i := 1 to |N| do
            if F_i = ∅ then P(i) = p_i
            else switch (type(i))
                and : P(i) = ∏_{j ∈ F_i} P(j);
                or  : P(i) = 1 − ∏_{j ∈ F_i} (1 − P(j));
                not : P(i) = 1 − P(j), j ∈ F_i;
            endswitch
            E(i) = P(i) · (1 − P(i));
            E = E + E(i);
    end

Figure 1: Algorithm for PEG.

Higher powers of $p_i$ are suppressed in an SPE as explained before. A product term is characterized by its coefficient $C_j$ and the sequence of bits $(\alpha_1 \alpha_2 \alpha_3 \ldots \alpha_n)_j$. If the decimal value of this bit sequence is denoted by an index $j$, the ordered pair $(C_j, j)$ uniquely characterizes the product term. There can be $2^n$ possible product terms for a module with $n$ inputs. We use an array of size $2^n$ to store a probability expression: $j$ forms the index of the array, and the value of $C_j$ is stored in the memory element corresponding to index $j$ [14]. See Table 3 for an example.

| Index $j$ | 0 | 1 | 2 | 3 |
|-----------|---|---|---|---|
| Coefficient $C_j$ | 0 | 1 | 1 | -2 |

Table 3: Array Representation for $p_1 + p_2 - 2 p_1 p_2$

Power expressions can be represented as a sum of products of the form $\sum_j Y_j$, with the difference that the product terms $Y_j$ may also contain squares (Equation 3). A product term $Y_j$ in a power expression for an $n$-input circuit is of the form $Y_j = C_j \prod_{i=1}^{n} p_i^{\alpha_i}$, where $\alpha_i \in \{0, 1, 2\}$. The index is then a sequence of ternary digits $(\alpha_1 \alpha_2 \ldots \alpha_n)_j$, and the power expression array has a size of $3^n$. Refer to Table 4 for an example.

| Index $j$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-----------|---|---|---|---|---|---|---|---|---|
| Coefficient $C_j$ | 0 | 1 | -1 | 1 | 1 | 0 | -1 | 0 | -1 |

Table 4: Representing $p_1 - p_1^2 + p_2 - p_2^2 + p_1 p_2 - p_1^2 p_2^2$

Addition and multiplication are the two important operations performed on probability expressions. When we refer to multiplication of probability expressions, the implicit assumption is that exponents are suppressed in the product. Using the above representation, addition of two expressions can be performed by simply adding the corresponding terms in the two expression arrays. Multiplication of two expressions is performed by taking term-by-term products and then performing addition. If $(\alpha_i, C_i)$ and $(\alpha_j, C_j)$ are two terms to be multiplied, the resulting term $(\alpha_k, C_k)$ is given by $C_k = C_i \cdot C_j$ and $\alpha_k = \alpha_i \lor \alpha_j$, where $\lor$ denotes bitwise OR.
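The array operations described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the PEG implementation: the function names are hypothetical, and we assume that bit $i-1$ of the array index holds the exponent of $p_i$ (the text does not fix the bit order, and the examples in Tables 3 and 4 are symmetric in $p_1$ and $p_2$). The example builds an exclusive-OR from the basic gates of Table 2 and recovers the expression $p_1 + p_2 - 2 p_1 p_2$ of Table 3.

```python
def spe_input(i, n):
    """SPE array of size 2**n for primary input p_i (i is 1-based).

    Assumption: bit i-1 of the array index is the exponent of p_i.
    """
    arr = [0] * (1 << n)
    arr[1 << (i - 1)] = 1
    return arr


def spe_add(a, b):
    """Add two expressions by adding corresponding array entries."""
    return [x + y for x, y in zip(a, b)]


def spe_mul(a, b):
    """Multiply term by term; OR-ing the indices suppresses exponents."""
    out = [0] * len(a)
    for i, ci in enumerate(a):
        if ci == 0:
            continue
        for j, cj in enumerate(b):
            if cj != 0:
                out[i | j] += ci * cj
    return out


def gate_not(a):
    """p_C = 1 - p_A."""
    out = [-x for x in a]
    out[0] += 1
    return out


def gate_and(a, b):
    """p_C = p_A * p_B."""
    return spe_mul(a, b)


def gate_or(a, b):
    """p_C = p_A + p_B - p_A * p_B."""
    return spe_add(spe_add(a, b), [-x for x in spe_mul(a, b)])


def spe_eval(a, probs):
    """Evaluate an SPE array for numeric input probabilities [p_1, ..., p_n]."""
    total = 0.0
    for index, coeff in enumerate(a):
        term = coeff
        for bit, p in enumerate(probs):
            if (index >> bit) & 1:
                term *= p
        total += term
    return total


# XOR built from basic gates: (A and not B) or (not A and B).
p1, p2 = spe_input(1, 2), spe_input(2, 2)
xor = gate_or(gate_and(p1, gate_not(p2)), gate_and(gate_not(p1), p2))
print(xor)                        # [0, 1, 1, -2], i.e. p1 + p2 - 2 p1 p2
print(spe_eval(xor, [0.5, 0.5]))  # 0.5
```

In this example the product of the two AND terms vanishes once exponents are suppressed, which reflects the fact that the two terms can never be 1 at the same time. A power expression array would use the same scheme with ternary digits and an array of size $3^n$.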

4 Power Estimation

The overall operation of MOPE is illustrated in Figure 2. The user specifies a module-level netlist of the circuit, the signal probabilities for the primary inputs, and the operational frequencies for the modules (Section 4.3). To deal with circuits that contain feedback paths, we break the loops in the circuit and iteratively apply a balancing algorithm to obtain stable signal probability values for all the busses in the circuit.

4.1 Loop Detection and Breaking

We find a minimal set of edges in the graph representation of the circuit which, when removed, renders the graph acyclic. For each edge $e$ which is removed, we must balance the probabilities on both sides of the edge through iterative computation (see Section 4.2). The set of edges deleted from the graph is termed the Minimum Feedback Edge Set (MFES). Using the MFES to break loops reduces the computation for subsequent operations in our estimator and speeds up the power estimator considerably. Since the problem of finding the MFES for a directed graph is NP-complete [2], we propose a greedy heuristic for finding a near-minimum feedback edge set. The loopcount of an edge $e$ is defined to be the number of loops of which $e$ is a part. We identify all the loops in the circuit graph and compute the loopcount of each edge $e$. The greedy algorithm deletes the edge whose loopcount is maximum and updates the loopcounts of the remaining edges; this procedure is repeated until the graph becomes acyclic. The greedy algorithm was seen to give near-optimal solutions in all the examples on which it was tested.

4.2 Calculation of Signal Probabilities

The removal of an edge from the circuit graph entails the creation of a virtual primary input (VPI) $V_i$ and a virtual primary output (VPO) $V_o$, whose signal probabilities are set to $\frac{1}{2}$. In MOPE, primary inputs and outputs are treated as simple modules. The modules are topologically sorted, the primary inputs of the module netlist are identified, and their signal probability values are read from an input file. The module-level netlist is traversed in breadth-first order and the probability values at the output signal lines of each module are calculated. The signal probability values of the VPOs are now compared with those of the corresponding VPIs. This procedure is repeated until the difference between the probability values of every VPI and the corresponding VPO is less than a specified value $\epsilon$. Stable signal probability values are hence obtained for all the busses in the architecture, including the feedback busses.

    procedure MOPE (M, L)
    // M is a module-level netlist. L is a library of logic modules.
    // p_i is the signal probability of line i;
    // f_C is the operational frequency of module C.
    begin
        if M has cycles then
            Break fewest lines to make M acyclic;
            for each broken feedback line l do
                Introduce a virtual primary input line I_l with p(I_l) = 0.5
                and a virtual primary output O_l;
        Let N be the module-external signal lines in M;
        Order the lines in N such that if N_j depends on N_i then necessarily j > i;
        repeat
            converge = 0;
            for i := 1 to |N| do
                if i is not a primary input then
                    Let i be the output of a library module C;
                    Read probability expressions for C from L;
                    Read probabilities for inputs of C;
                    call Evaluate-Output-Probabilities (C);
                end
            end
            for each virtual primary input I_l
                δ_l = |p(I_l) − p(O_l)|;
                if δ_l < ε then converge++;
            end
            if (converge ≠ number of virtual primary inputs) then
                for each virtual primary input I_l do
                    p(I_l) = p(O_l);
        until converge = number of virtual primary inputs;
        Let Γ be the set of all modules in M;
        for each module C ∈ Γ do
            E(C) = EvaluatePower(C)
            TotalPower += E(C) · f_C;
        end
    end

Figure 2: Algorithm MOPE

4.3 Operational Frequency

We need the notion of the operational frequency of a module since, depending on the schedule, a module may or may not be used in every clock cycle. Hence the maximum rate at which switching can occur in a module is not necessarily the system clock frequency $f_{clock}$, but a fraction of $f_{clock}$. The operational frequency of a module $i$, denoted $opf_i$, is defined as

$$opf_i = \frac{NT_i}{TS} \cdot f_{clock} \qquad (4)$$

where $NT_i$ is the number of time steps in which module $i$ is used, $TS$ is the total number of time steps in the sample period, and $f_{clock}$ is the system clock frequency. The user specifies the operational frequency information in two parts: (i) the system clock frequency and (ii) the fraction of the sample period during which the module is used. $NT_i$ can also be computed automatically with knowledge of the schedule. When using Equation 3 to evaluate the power expression for a module $i$, we replace $f$ with $opf_i$.

4.4 Power Estimation

After the signal probability values are known for every module-external signal line, MOPE traverses the module-level netlist and estimates the power dissipation of each module $i$ by using (a) the power expression of the module type of $i$ (which can be precomputed using PEG and stored as part of the module library), (b) the signal probabilities of the input lines, and (c) the operational frequency of module $i$. The total power dissipation is then the sum of the power dissipations in the individual modules.

An important feature of MOPE is its ability to recognize the hierarchical nature of a particular module. For example, a 2-to-1 4-bit multiplexer may also be viewed as a replication of four 2-to-1 1-bit multiplexers. In order to save memory, the probability and power expressions are stored for only a 1-bit multiplexer. MOPE takes care of the hierarchy, recognizes the number of bits (say $n$) in the multiplexer, and calls the same probability and power expression from the module library $n$ times with different input signal probability values.
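The three steps of this section (greedy loop breaking, iterative balancing of feedback probabilities, and scaling by the operational frequency) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MOPE implementation: all function names are hypothetical, the loops are enumerated by a brute-force search that is only practical for small graphs, the loopcounts are recomputed from scratch after each deletion instead of being updated incrementally, and the example circuits are invented.

```python
from collections import Counter


def simple_cycles(edges):
    """Enumerate the simple cycles of a small directed graph as edge lists."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    cycles = []

    def dfs(start, node, path, visited):
        # Each cycle is found once, rooted at its smallest node.
        for nxt in succ.get(node, []):
            if nxt == start:
                cycles.append(path + [(node, nxt)])
            elif nxt not in visited and nxt > start:
                dfs(start, nxt, path + [(node, nxt)], visited | {nxt})

    for start in sorted(succ):
        dfs(start, start, [], {start})
    return cycles


def greedy_feedback_edges(edges):
    """Repeatedly delete the edge with the largest loopcount until acyclic."""
    edges, removed = list(edges), []
    while True:
        cycles = simple_cycles(edges)
        if not cycles:
            return removed
        loopcount = Counter(e for cycle in cycles for e in cycle)
        worst = max(loopcount, key=loopcount.get)
        edges.remove(worst)
        removed.append(worst)


def balance(evaluate, n_feedback, eps=1e-6, max_iter=1000):
    """Iterate VPI probabilities (initially 0.5) until each VPO matches its VPI."""
    vpi = [0.5] * n_feedback
    for _ in range(max_iter):
        vpo = evaluate(vpi)
        if all(abs(i - o) < eps for i, o in zip(vpi, vpo)):
            return vpi
        vpi = vpo
    raise RuntimeError("signal probabilities did not converge")


def operational_frequency(nt, ts, f_clock):
    """Equation 4: opf_i = (NT_i / TS) * f_clock."""
    return nt / ts * f_clock


# Hypothetical circuit graph with two loops sharing the edge B -> C.
removed = greedy_feedback_edges([("A", "B"), ("B", "C"), ("C", "A"), ("C", "B")])
print(removed)  # [('B', 'C')]

# Hypothetical feedback loop: q_next = x AND (NOT q), with p(x) = 0.5.
p_x = 0.5
stable = balance(lambda vpi: [p_x * (1.0 - vpi[0])], n_feedback=1)
print(stable[0])  # close to 1/3

# A module used in 2 of 4 time steps at a 1 MHz clock.
print(operational_frequency(2, 4, 1e6))  # 500000.0
```

In the feedback example, the balancing loop treats the virtual primary input as independent of the other input of the module, which is the same approximation the signal probability method makes elsewhere. A maximum iteration count is added here as a safeguard; Figure 2 does not specify one.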

5 Results

All the experiments were carried out on a Sun SPARCStation 10. In all examples, signal probability values of 0.5 were assumed for primary inputs. The system clock frequency was taken to be 1 MHz. Table 5 shows the results of PEG on several circuits taken from the MCNC-91 benchmarks.

| Circuit Name | # of Inputs | # of Gates | PEG Exec. Time (s) | # of Terms in Power Exp. |
|--------------|-------------|------------|--------------------|--------------------------|
| b1 | 3 | 13 | 0.062 | 10 |
| cm42a | 4 | 16 | 0.129 | 20 |
| cm82a | 5 | 24 | 0.220 | 146 |
| t | 5 | 7 | 0.096 | 21 |
| c17 | 5 | 6 | 0.095 | 20 |
| cm138a | 6 | 12 | 0.540 | 27 |
| cm152a | 11 | 24 | 98.170 | 268 |
| cm151a | 12 | 32 | 446.820 | 1580 |

Table 5: Test results for PEG

To illustrate the efficacy of MOPE in evaluating low-power design techniques, we chose two different datapaths implementing the same function, namely, a quadratic expression $ax^2 + bx + c$. Using Horner's rule, the expression can be evaluated as $(a \cdot x + b) \cdot x + c$ using two multiplications and two additions. The first realization, shown in Figure 3, implements the function using two multipliers and two adders, whereas the realization shown in Figure 4 uses one adder and one multiplier. We assume that both architectures are fully pipelined and that every module in both architectures has an operational frequency equal to $f_{clock}$. Since quadratic 1 uses more functional units, we would expect it to have a higher power dissipation than the second realization, quadratic 2, the operational frequencies of all modules being equal in both cases. This is indeed the trend shown by the power estimates of the two architectures, calculated by MOPE and shown in Table 6.

| Datapath Name | # of modules | Exec. time (s) | Power Est. (W) |
|---------------|--------------|----------------|----------------|
| quadratic 1 | 40 | 0.947 | 4.715 |
| quadratic 2 | 36 | 0.727 | 3.194 |

Table 6: Results of MOPE on two data paths.

5.1 Performance Comparison

We implemented a gate-level power estimation tool built on the estimation model proposed in [3], against which we compared the power estimates provided by MOPE. We tested MOPE on four modular circuits and compared the results with the estimates obtained using the gate-level estimator as well as a logic simulator on gate-level realizations of the same four circuits. The circuits which we selected were (i) an 8-to-1 1-bit multiplexer realized using 2-to-1 1-bit multiplexers, (ii) a 4-to-1 2-bit multiplexer realized using 2-to-1 2-bit multiplexers, (iii) a 4-bit ripple carry adder realized using full adders, and (iv) a 4-bit array multiplier realized using multiplier cells. Since the gate-level estimator predicts power at a lower level of abstraction than MOPE, the former gives more accurate estimates. Table 7 reports the execution times as well as the power estimates observed when using MOPE, the gate-level estimator, and the simulator. The MOPE estimates differ from those of the gate-level estimator by an amount ranging from 0 to 15%.

However, the speed advantage of MOPE over the gate-level estimator is remarkable. For the examples considered, MOPE gives the power estimate about 40 to 100 times faster than the gate-level estimator. It is expected that the time taken by the gate-level estimator will grow exponentially with the number of modules, while that of MOPE will grow linearly. The estimates obtained using simulation are the most accurate of the three techniques, but the time taken for simulation is heavily dependent on the type of circuit; even for a given circuit, the run time is a function of the kind of inputs, a serious drawback which is not present in a statistical technique.

| Circuit Name | MOPE exec. time (s) | MOPE Pow. (W) | GL exec. time (s) | GL Pow. (W) | Sim. exec. time (s) | Sim. Pow. (W) |
|--------------|---------------------|---------------|-------------------|-------------|---------------------|---------------|
| 8-1mux | 0.864 | 0.291 | 89.65 | 0.291 | 9.13 | 0.277 |
| 4rca | 0.644 | 0.525 | 24.46 | 0.441 | 9.45 | 0.423 |
| 4-2mux | 0.425 | 0.191 | 18.07 | 0.231 | 7.63 | 0.222 |
| parmult | 0.652 | 0.456 | 10.43 | 0.433 | 10.11 | 0.412 |

Table 7: Comparison of MOPE Performance

Figure 3: quadratic 1: Implementation 1 of the quadratic expression function

Figure 4: quadratic 2: Implementation 2 of the quadratic expression function

5.2 Extensions

The hierarchical power prediction technique described in this paper is based on the prediction of signal and transition probabilities at all nodes in the circuit, given the signal probabilities at the primary inputs. The transition probability technique enables us to characterize the power dissipation of modules in a convenient way; the signal probabilities can be easily propagated from one module to the next by simply evaluating precomputed SPEs. We believe that it is not cost effective to use a very accurate technique which will result in bulky expressions that are expensive in terms of storage and require a large computational effort to generate. We view module-level power prediction as a tradeoff between accuracy and speed. Our experimental results indicate that we do obtain reasonably accurate estimates with a significant speedup in computation time. Nevertheless, we note that the tool can be extended to use alternate measures of node switching activity. Such an extension involves modifying PEG to synthesize expressions which express the switching activity and power dissipation of the module in terms of a suitable metric of switching activity at the primary inputs. We also need a fast algorithm to propagate the metric from the primary inputs of the module to the primary outputs of the module without having to flatten the module to its gate-level representation.

6 Conclusions

We have reported a module-level power estimation tool which achieves up to two orders of magnitude speedup in comparison to a gate-level estimator. The accuracy of power estimation using MOPE is within 15% of gate-level estimates. PEG and MOPE are useful in a high-level synthesis environment as quick power estimators. In our current implementation, we have used linear arrays to store the power expressions for various gates and modules. Our experiments indicate that the expression arrays for large modules are appreciably sparse. In fact, this sparsity grows with the bit widths of modules. Hence compression of expression arrays can be employed for large modules. One way to achieve this compression is to store array indices and the corresponding coefficients only for non-zero entries. It is reasonable to assume that the higher-order product terms in the power expressions will be quite small compared to the lower-order ones, simply because the terms in the power expressions are products of probabilities, whose values necessarily lie between 0 and 1. Hence a number of higher-order terms can be safely neglected without compromising the accuracy of the power estimate. This can again lead to significant memory savings.
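A minimal sketch of the non-zero-entry storage suggested above is given below, using the power expression array of Table 4 as input. The function names and the digit order (least significant ternary digit holds the exponent of $p_1$) are assumptions made for illustration; the current implementation uses dense linear arrays.

```python
def to_sparse(arr):
    """Keep only the non-zero entries of an expression array as {index: coeff}."""
    return {j: c for j, c in enumerate(arr) if c != 0}


def eval_sparse(sparse, probs, base=3):
    """Evaluate a sparse power expression for input probabilities [p_1, ..., p_n].

    Assumption: digit i-1 of the index (least significant first, in the given
    base) is the exponent of p_i.
    """
    total = 0.0
    for index, coeff in sparse.items():
        term = coeff
        for p in probs:
            index, exponent = divmod(index, base)
            term *= p ** exponent
        total += term
    return total


# The 3**2-entry array of Table 4.
table4 = [0, 1, -1, 1, 1, 0, -1, 0, -1]
sparse = to_sparse(table4)
print(len(table4), len(sparse))         # 9 6
print(eval_sparse(sparse, [0.5, 0.5]))  # 0.6875
```

For this small example only three entries are saved, but the saving grows with the sparsity of the array. Dropping higher-order terms, as discussed above, would amount to filtering the dictionary by the digit sum of each index.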

References

[1] A.P. Chandrakasan, M. Potkonjak, J. Rabaey, and R.W. Brodersen. HYPER-LP: A system for power minimization using architectural transformations. In IEEE/ACM International Conference on Computer Aided Design, pages 300–303, 1992.
[2] S. Even. Graph Algorithms. Computer Science Press, 1979.
[3] A. Ghosh, S. Devadas, and K. Keutzer. Estimation of average switching activity in combinational and sequential circuits. In IEEE/ACM International Conference on Computer Aided Design, pages 253–259, June 1992.
[4] J. Lin, T. Liu, and W. Shen. A cell-based power estimation in CMOS combinational circuits. In IEEE/ACM International Conference on Computer Aided Design, pages 304–309, 1994.
[5] P. Landman and J. Rabaey. Power estimation for high level synthesis. In Proceedings of EDAC-EUROASIC '93, pages 361–366, 1993.
[6] D. Marculescu, R. Marculescu, and M. Pedram. Information theoretic measures of energy consumption at register transfer level. In International Symposium on Low Power Design, pages 81–86, 1995.
[7] R. Marculescu, D. Marculescu, and M. Pedram. Switching activity analysis considering spatiotemporal correlations. In IEEE/ACM International Conference on Computer Aided Design, pages 294–299, 1994.
[8] R. Mehra and J.M. Rabaey. Behavioral level power estimation and exploration. In International Workshop on Low Power Design, pages 197–202, 1994.
[9] F. Najm. Transition density: A new measure of activity in digital circuits. IEEE Transactions on CAD of Integrated Circuits and Systems, 12(2):310–323, February 1993.
[10] F. Najm. Towards a high level power estimation capability. In International Symposium on Low Power Design, pages 87–92, 1995.
[11] K.P. Parker and E.J. McCluskey. Probabilistic treatment of general combinational networks. IEEE Transactions on Computers, pages 668–670, June 1975.
[12] S.R. Powell et al. Estimating power dissipation of VLSI signal processing chips: The PFA technique. VLSI Signal Processing, pages 250–259, 1990.
[13] C.P. Ravikumar and V. Saxena. Togaps: A testability oriented genetic algorithm for pipeline synthesis. International Journal on VLSI Design, 1995. Accepted for publication.
[14] K. Roy and S.C. Prasad. Circuit-activity based logic synthesis for low power reliable operations. IEEE Trans. on Very Large Scale Integration Systems, 1(4):503–511, 1993.
[15] V. Tiwari, P. Ashar, and S. Malik. Technology mapping for low power. In 30th IEEE/ACM Design Automation Conference, pages 74–79, 1993.