Low Power, High Speed Hybrid Clock Divider Circuit - IEEE Xplore

2013 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2013]

Low Power, High Speed Hybrid Clock Divider Circuit

John Reuben, Mohammed Zackriya. V
School of Electronics Engineering, VIT University, Vellore, India
johnreuben@vit.ac.in, [email protected]

Dr. Harish M Kittur
School of Electronics Engineering, VIT University, Vellore, India
kittur@vit.ac.in

978-1-4673-4922-2/13/$31.00 ©2013 IEEE

Abstract- The Clock Divider circuit has found immense application in Multiple Clock Domain (MCD) systems like ASICs, SoC and GALS. In MCD systems, we generate many clock signals of various frequencies from a high frequency clock by frequency division. Power is an important parameter to be minimized since the nodes in a clock divider circuit will toggle at clock frequency. In this paper, we present a low power hybrid clock divider circuit which can take an input frequency up to 6 GHz and perform frequency division. The divider is hybrid because it uses two different flip flops - a Modified Extended True Single-Phase Clock flip flop (METSPC-FF) and a self blocking FF (SBFF). The METSPC-FF is fast enough to divide a GHz frequency, but consumes more power when compared to SBFF, while the SBFF is relatively slow but consumes less power compared to METSPC. We analyze the performance of these 2 FFs across PVT variations and implement them in a clock divider circuit. Our clock divider circuit consumes 149.56 µW power for 'divide by' 8 operation on a 6 GHz clock. Simulation of these flip flops in TSMC 90 nm technology using CADENCE SPECTRE simulator shows that they are very energy efficient and hence can be used for other high speed applications without compromising on the power.

Keywords: Clock Divider (CD); TSPC flip flop; Self blocking Flip flop; propagation delay; power dissipation; Hybrid Clock Divider (HCD); METSPC; ETSPC; PVT

I. INTRODUCTION

The Clock Divider circuit has found immense application in multiple clock domain (MCD) systems like ASICs, SoC (System on Chip) and GALS (Globally Asynchronous, Locally Synchronous). An SoC, which is an IC designed by stitching together multiple stand-alone VLSI designs (called IPs) to provide full functionality for an application [1], has different IP blocks operating at different clock frequencies. Clock generation and clock distribution for these MCD systems are the costliest in terms of power consumption [2]. The clock generation system generates different frequencies for the clock domains from the basic crystal oscillator (tens of MHz) using PLLs (as frequency multipliers) followed by Clock Dividers. Hence minimizing the power consumption of the clock divider circuit is a crucial step in the design of the clock generator circuit for MCD systems.

In this paper, we present a low power hybrid clock divider circuit which can take an input frequency up to 6 GHz and perform frequency division. The divider is hybrid because it uses two different flip flops - a modified ETSPC flip flop (METSPC-FF) and a self blocking FF (SBFF) [4]. The METSPC-FF is fast enough to divide a GHz frequency, but consumes more power when compared to SBFF below 1.5 GHz, while the SBFF is relatively slow but consumes less power compared to METSPC. The paper is organized as follows: Section II describes the design of METSPC-FF and its simulation results in TSMC 90 nm. Section III describes the SBFF as presented in [4] and its simulation results in TSMC 90 nm (in [4], the SBFF is simulated in SMIC 65 nm technology). The hybrid clock divider circuit is presented in Section IV with the simulation results. We conclude with possible future work in Section V.

II. MODIFIED ETSPC FLIP FLOP (METSPC-FF)

A. Basic ETSPC Flip flop

A dynamic ETSPC flip flop is presented in [3], which doesn't have the stacked MOS structure that slows the switching speed.

Fig. 1. ETSPC [3]

The ETSPC shown in Fig. 1 is a negative edge triggered flip flop. When the clock is high, N1 and N2 will be on and P3 will be off. Nodes S1 and S2 are precharged to low through N1 and N2 irrespective of the state of D. The evaluation phase starts at the negative edge of the clock.

Case 1: If D is low, P1 will turn on to make node S1 high, which in turn will turn off P2 to make node S2 stay low. Thus node S2 will turn off N3 to make node S3 (Q̄) high and hence Q will become low.

Case 2: If D is high, P1 will turn off to make node S1 stay low, which in turn will turn on P2 to make node S2 high. Thus node S2 will turn on N3 to make node S3 (Q̄) low and hence Q will become high.

B. Proposed Modified ETSPC Flip flop

The basic ETSPC [3] has a delay of 36.34 ps and consumes 141.01 µW power, resulting in a PDP of 5.12 fJ at 6 GHz (nominal process corner). We propose a Modified ETSPC FF (Fig. 2) which has a better PDP, as will be discussed below.

Fig. 2. METSPC-FF

The proposed METSPC-FF is a positive edge triggered FF. It consists of a precharge and an evaluation phase as described. The clock signal (CLK) is generated on chip from the PLL and fed to the flip flops.

Precharge Phase:

When clk is low, transistors P1 and P2 will be turned on and N3 will be turned off.

Case 1p: When D is low, N1 will be turned off, thus node A will be high through P1 and it will turn on N2. Since the width of P2 is larger than that of N2 (the resistance of P2 will be less than that of N2), node B will stay high through P2 irrespective of N2 being on or off.

Case 2p: When D is high, N1 will be turned on, thus node A will discharge through N1 and it will turn off N2; therefore node B will stay high through P2.

Since both P3 and N3 will stay turned off in the precharge phase, Q̄t and Qt will hold the previous states Q̄t-1 and Qt-1 respectively even though the present state of D (Dt) changes, as shown in Table 1.

Table 1. State of nodes at Precharge Phase

Case  Clk  Dt  A  B  Q̄t     Qt
1p    0    0   1  1  Q̄t-1   Qt-1
2p    0    1   0  1  Q̄t-1   Qt-1

Evaluation Phase:

When clk goes high, transistors P1 and P2 will be turned off, N3 will be turned on, and nodes A and B will hold the precharge state as in Table 2.

Table 2. State of nodes at Evaluation Phase

Case  Clk     Dt  A  B  Q̄t  Qt
1e    rising  0   1  0  1    0
2e    rising  1   0  1  0    1

Case 1e: If Dt is low at the rising edge of the clock, A will stay high but B will discharge to GND through N2 since P2 is off in the evaluation phase. Thus node B will turn on P3. The node Q̄t will stay high as the width of P3 is larger than that of N3 (the resistance of P3 will be less than that of N3). Therefore Qt will become low.

Case 2e: If Dt is high at the rising edge of the clock, A will stay low and B will also stay high since N2 is off. Thus node B will turn off P3. The node Q̄t will be discharged to GND through N3. Therefore Qt will become high.

The simulation results showed that the METSPC flip flop has a delay of 20.03 ps and consumes 88.74 µW power, resulting in a PDP of 1.77 fJ at 6 GHz (nominal process corner), which is better than the ETSPC flip flop. Hence it is more suitable for high frequency (GHz) operation. The cut-off frequency of a MOSFET in TSMC 90 nm is around 120 GHz [9].
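The PDP figures quoted for the two flip flops follow from multiplying the reported delay by the reported power. A minimal sketch of that arithmetic, using only the nominal-corner numbers given in the text (the helper name `pdp_fj` is ours and does not come from the paper):

```python
def pdp_fj(delay_ps: float, power_uw: float) -> float:
    """Power-delay product in femtojoules.

    1 ps * 1 uW = 1e-12 s * 1e-6 W = 1e-18 J = 1e-3 fJ.
    """
    return delay_ps * power_uw * 1e-3


# Nominal-corner values at 6 GHz as reported in the text.
etspc_pdp = pdp_fj(delay_ps=36.34, power_uw=141.01)
metspc_pdp = pdp_fj(delay_ps=20.03, power_uw=88.74)

print(f"ETSPC  PDP: {etspc_pdp:.4f} fJ")
print(f"METSPC PDP: {metspc_pdp:.4f} fJ")
print(f"PDP reduction: {100 * (1 - metspc_pdp / etspc_pdp):.1f} %")
```

The METSPC product evaluates to slightly above 1.777 fJ, so the 1.77 fJ quoted in the text appears to be truncated rather than rounded.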


Fig. 3. Clk to Q propagation delay of METSPC-FF Vs. frequency

Fig. 3 shows the variation of propagation delay and power dissipation w.r.t. frequency, which are as expected. To make sure that our FF design is insensitive to process variations, we did extensive simulation of our design across all process corners for various frequencies. We have plotted the average propagation delay of our FF across all corners in Fig. 4. The propagation delay of our FF averaged over all corners turns out to be 23.48 ps (NN - Nominal; FF - Fast NMOS, Fast PMOS; FS - Fast NMOS, Slow PMOS; SF - Slow NMOS, Fast PMOS; SS - Slow NMOS, Slow PMOS).

Fig. 4. Average propagation delay of METSPC-FF across all corners

Similarly, we have plotted the average power dissipation of our FF across all corners in Fig. 5. The average power of our FF averaged over all corners turns out to be 76.44 µW. The average PDP averaged over all corners is 1.794 fJ.

Fig. 5. Average power dissipation of METSPC-FF across all corners

III. SELF BLOCKING FLIP FLOP (SBFF)

A. Basic Self blocking Flip flop

A single phase clocked flip-flop, the SBFF, is proposed in [4], where the authors claim that their FF has a better power-delay product than even the most recent Sense Amplifier FF. To verify the claims of SBFF, we simulated it in TSMC 90 nm and optimized the W/L ratios for proper functionality.

Fig. 6. SBFF [4]

The SBFF consists of a dynamic XOR gate in the first stage and a differential storage latch in the second stage. The slave latch is controlled by the X and clk signals from the XOR gate. When clk is low, the node X is precharged and N7 is turned off, thus the slave latch is opaque to changes in D. At the positive edge of clk, signal X is evaluated to D ⊕ Q.

Case 1: When the present state D and previous state Q are the same (Dt ⊕ Qt-1 = 0), node X discharges through N1, which in turn holds N6 off; this prevents data from entering the storage latch, since the stored data does not need to change.

Case 2: When the present state D and previous state Q are different (Dt ⊕ Qt-1 = 1), node X will hold the VDD state. Thus both N6 and N7 will be turned on. When the present state is high (Dt = 1), N8 will be turned on, which will pull Q high (Qt = 1) through P5. Since Qt is high, N10 is turned on, making Q̄t zero. If the present state is low (Dt = 0), N9 will be turned on (since D̄t = 1), which will pull Q̄t high (Qt = 0) through P4. Since Q̄t is high, N11 is turned on, making Qt zero. Thus the SBFF will hold the values Qt and Q̄t through the differential latch setup in the evaluation stage. Once Q and D become the same, signal X will discharge through N1 to GND. Hence changes in D after the positive edge of the clock will not affect Q.

After extensive simulation of the SBFF with the optimized W/L values shown in Fig. 6, we realized that this SBFF has lower PDP than METSPC at frequencies below 1.5 GHz. But the SBFF fails to function as a flip flop above 1.5 GHz due to setup time constraints. Hence, we concluded that SBFF is a better option for sub 1.5 GHz operation.

Fig. 7. Clk to Q propagation delay of SBFF Vs. frequency

Fig. 7 shows the variation of propagation delay and power dissipation w.r.t. frequency of SBFF, which are as expected. Since the authors in [4] didn't check their FF for process insensitivity, we verified the functionality of the SBFF across all process corners for frequencies below 1.5 GHz. We have plotted the average propagation delay of SBFF across all corners in Fig. 8. The average propagation delay of SBFF averaged over all corners turns out to be 89.80 ps.

Fig. 8. Average propagation delay of SBFF across all corners

Similarly, we have plotted the average power dissipation of our SBFF across all corners in Fig. 9. The average power dissipation of SBFF averaged over all corners turns out to be 76.44 µW. The average PDP averaged over all corners is 0.577 fJ.

Fig. 9. Average power dissipation of SBFF across all corners

B. Comparison of the two Flip Flops

Compared to the average PDP of METSPC, which is 1.794 fJ, the average PDP of SBFF is 68% less, except that it can function only at frequencies less than 1.5 GHz. The PDP of METSPC-FF is higher because its power consumption is high due to short circuit current flowing in the direct path between VDD and GND during the precharge phase.

IV. HYBRID CLOCK DIVIDER (HCD)

To divide a high frequency signal we propose a hybrid CD structure as shown in Fig. 10. We have used METSPC-FF in the first stage since it has a lower propagation delay than SBFF at higher frequencies. The second stage of the CD uses SBFF because it has a lower PDP than METSPC for frequencies less than 1.5 GHz. The first stage of the CD is used to convert the high input clock frequency to a sub-1.5 GHz frequency. Then the second stage can implement the required 'divide-by-n' operation using SBFF.

Fig. 10. Hybrid Clock Divider Structure (first stage: CD using METSPC-FF, F1 = F/x; second stage: CD using SBFF, F2 = F1/x)

For illustration, consider the HCD shown in Fig. 12, which does a "divide by 8" operation on a 6 GHz clock. We use two METSPC-FFs to obtain a 1.5 GHz clock ("/4") in the first stage and one SBFF in the second stage to obtain a 750 MHz clock ("/8").
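The stage assignment described for the hybrid divider can be sketched as a small planning routine. This is only an illustration under stated assumptions: the divider is modelled as a ripple chain of divide-by-2 stages, the ratio is a power of two, and each stage's flip flop is picked from the clock frequency it receives, with the 1.5 GHz SBFF limit taken from the text. The function name `plan_hybrid_divider` and its parameters are hypothetical.

```python
def plan_hybrid_divider(f_in_ghz: float, ratio: int, sbff_limit_ghz: float = 1.5):
    """Pick a flip-flop type for each divide-by-2 stage of a ripple divider.

    Returns a list of (flip_flop_type, output_frequency_ghz), one per stage.
    """
    if ratio < 2 or ratio & (ratio - 1):
        raise ValueError("ratio must be a power of two, at least 2")

    stages = []
    f_clk = f_in_ghz
    while ratio > 1:
        # The flip-flop is chosen by the clock it receives, not by its output.
        ff_type = "SBFF" if f_clk <= sbff_limit_ghz else "METSPC-FF"
        f_clk /= 2
        stages.append((ff_type, f_clk))
        ratio //= 2
    return stages


# The divide-by-8 example on a 6 GHz clock.
for ff_type, f_out in plan_hybrid_divider(6.0, 8):
    print(f"{ff_type:10s} -> {f_out:g} GHz")
```

For the 6 GHz, divide-by-8 case this yields two METSPC-FF stages followed by one SBFF stage, which matches the arrangement in the text.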


Fig. 11. Simulation result of Hybrid CD for 6 GHz in TSMC 90 nm technology using CADENCE SPECTRE simulator

Fig. 12. Hybrid Clock Divider Circuit for 'divide by 8' (first stage: two METSPC-FFs; second stage: one SBFF)

Performance of CD under PVT variation

Since we verified the functionality of METSPC (up to 6 GHz) and SBFF (up to 1.5 GHz) for process variations (at all 4 corners), we conclude that the CD implemented using these two flip flops will be insensitive to process variations. To evaluate our CD circuit under PVT variation, we continue to evaluate our CD across the other 2 parameters, viz. supply voltage (VDD) and temperature.

The nominal supply voltage for our flip flop and CD was 1.1 V. To make sure that our CD is insensitive to variation in VDD, we varied VDD by ± 0.2 V. Our CD was insensitive to VDD variation of ± 0.2 V, and the behavior of propagation delay and power dissipation for VDD variation at 6 GHz is plotted in Fig. 13.

Fig. 13. Plot of propagation delay and power dissipation over range of VDD

The nominal temperature at which we simulated our flip flops was 27 °C. To make sure that our CD is insensitive to variation in temperature, we varied temperature from 0 to 70 °C. Our CD was insensitive to temperature variation. The behavior of propagation delay and power dissipation for temperature variation at 6 GHz is plotted in Fig. 14.

Fig. 14. Plot of propagation delay and power dissipation over range of temperature
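To make the PVT space concrete, the sketch below enumerates a small grid of simulation points from the corner names, supply range and temperature range mentioned in the text. It is a hypothetical illustration: taking only three points per axis and forming the full cross product are our assumptions, since the text describes sweeping each parameter around its nominal value and does not state how the sweeps were combined.

```python
from itertools import product

CORNERS = ["NN", "FF", "FS", "SF", "SS"]  # process corners named in the text
VDD_V = [0.9, 1.1, 1.3]                    # nominal 1.1 V with +/- 0.2 V (assumed 3 points)
TEMP_C = [0, 27, 70]                       # nominal 27 C, range 0 to 70 C (assumed 3 points)


def pvt_grid(corners=CORNERS, vdds=VDD_V, temps=TEMP_C):
    """Return every (corner, vdd, temperature) combination as a dict."""
    return [
        {"corner": c, "vdd_v": v, "temp_c": t}
        for c, v, t in product(corners, vdds, temps)
    ]


grid = pvt_grid()
print(len(grid), "simulation points; first:", grid[0])
```

Each dictionary could then be handed to whatever simulation flow is in use; no simulator interface is implied here.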


Fig. 15. Layout of divide by 2 using METSPC-FF

Fig. 16. Post-Layout simulation of divide by 2 using METSPC-FF for 6 GHz

Simulation of this hybrid CD in TSMC 90 nm technology using CADENCE SPECTRE simulator is shown in Fig. 11. For a divide by 8 operation on 6 GHz clock, our hybrid CD consumes average power of 149.56 µW considering all PVT variation. The average propagation delay through the entire divider is 191.34 ps across all PVT variation. Table 3 shows the comparison of our clock divider with the recent clock dividers in literature. The layout of a divide-by-2 circuit using METSPC-FF and the post layout simulation waveforms for a divide-by-2 operation on a 6 GHz clock are shown in Fig. 15 and Fig. 16 respectively.

Table 3. Performance comparison of various clock dividers

Ref.       Tech (µm)  VDD (V)  Operating Frequency (GHz)  Power (mW/GHz)
[7]        0.09       1.1      2.4                        0.190
[8]        0.18       1.8      5                          2.4
This work  0.09       1.1      6                          0.024

V. CONCLUSION

The proposed METSPC-FF is designed to reduce power consumption and propagation delay at high frequency operation. The self blocking FF presented in [4] is indeed verified to be energy efficient since it has lowest power delay product till 1.5 GHz. Both the flip flops and the resulting CD circuit are simulated across PVT variations to ensure the reliability of the design. These two flip flops can be judiciously used to design a high speed, low power clock divider circuit. We have verified this by being able to "divide by 8" a 6 GHz clock, the entire operation consuming only 149.56 µW. Except for the propagation delay, the CD circuit does not alter the clock waveform and maintains the 50% duty cycle of the input clock. Hence they can be used directly by the IP blocks in a SoC. We have also done the post-layout simulation of CD using METSPC-FF and verified that they match closely with the transistor level simulation results.
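The 149.56 µW figure for a 6 GHz input can be put on a per-GHz basis with a one-line normalization. The sketch assumes that power per GHz means average power divided by the input clock frequency; that definition and the helper name `power_per_ghz_mw` are our assumptions, not something the text states.

```python
def power_per_ghz_mw(power_uw: float, f_in_ghz: float) -> float:
    """Normalize average power (in microwatts) to mW per GHz of input clock."""
    return (power_uw / 1000.0) / f_in_ghz


# Hybrid divider: 149.56 uW average power on a 6 GHz input clock.
this_work = power_per_ghz_mw(power_uw=149.56, f_in_ghz=6.0)
print(f"This work: {this_work:.4f} mW/GHz")
```

Under that assumption the result is just under 0.025 mW/GHz, in line with the 0.024 mW/GHz listed for this work.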


REFERENCES

[1] R. Rajsuman, "System-on-a-chip: Design and Test", Artech House Inc. Publishers, 2000, pp. 3-7.

[2] R. Y. Chen, N. Vijaykrishnan, and M. J. Irwin, "Clock power issues in system-on-a-chip designs," in Proc. IEEE Workshop on VLSI, 1999, pp. 48-53.

[3] Xiao Peng Yu, Manh Anh Do, Wei Meng Lim, Kiat Seng Yeo, and Jian-Guo Ma, "Design and Optimization of the Extended True Single-Phase Clock-Based Prescaler", IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 11, pp. 3828-3835, November 2006.

[4] X. Li, S. Jia, X. Liang and Y. Wang, "Self-blocking flip-flop design", Electronics Letters, vol. 48, no. 2, 19th January 2012.

[5] J. N. Soares, Jr. and W. A. M. Van Noije, "A 1.6-GHz dual modulus prescaler using the extended true-single-phase-clock CMOS circuit technique (E-TSPC)," IEEE J. Solid-State Circuits, vol. 34, no. 1, pp. 97-102, Jan. 1999.

[6] Wu-Hsin Chen and Byunghoo Jung, "High-Speed Low-Power True Single-Phase Clock Dual-Modulus Prescalers", IEEE Transactions on Circuits and Systems-II: Express Briefs, vol. 58, no. 3, pp. 144-148, March 2011.

[7] Stephan Henzler and Siegmar Koeppe, "Design and Application of Power Optimized High-Speed CMOS Frequency Dividers," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 11, pp. 1513-1520, November 2008.

[8] Chih-Wei Chang and Yi-Jan Emery Chen, "A CMOS True Single-Phase-Clock Divider With Differential Outputs," IEEE Microwave and Wireless Components Letters, vol. 19, no. 12, pp. 813-815, December 2009.

[9] Kuang-Yu Cheng et al, "Development of 90nm InGaAs HEMTs and Benchmarking Logic Performance with Si CMOS," Compound Semiconductor Integrated Circuit Symposium (CSICS), 2010 IEEE, pp. 1-4, 3-6 Oct. 2010.

John Reuben holds a B.E. (Honors) degree in Electrical and Electronics Engineering from BITS, Pilani and an M.Tech in Communication Engineering from VIT University, Vellore. He is currently Assistant Professor with VIT University, Vellore, where he is pursuing his PhD.

Mohammed Zackriya. V was born in Vellore, India. He received the B.E. degree in Electronics and Communication Engineering from Anna University, Chennai. He is pursuing M.S (By Research) in the School of Electronics Engineering, VIT University, Vellore.

Dr. Harish M Kittur (M'10) was born in Gadag, India. He received the B.Sc. degree in Physics, Mathematics and Electronics from the Karnataka University, India, in 1994, the M.Sc. in Physics from the Indian Institute of Technology, Mumbai, in 1996, the M.Tech. in Solid State Technology from the Indian Institute of Technology, Madras, in 1999, and the Ph.D. in Physics from the RWTH Aachen, Germany, in 2004. He is currently Professor with VIT University, Vellore, India and heads the VLSI Division. He has published 8 papers in International Journals and 8 papers in International Conferences. His research interests are Low Power VLSI Design, Memory Design and Nanoelectronics. He is a life member of IETE and member of IEEE.
