A 2.24GHz Wide Range Low Jitter DLL-Based Frequency Multiplier ...

A 2.24GHz Wide Range Low Jitter DLL-Based Frequency Multiplier using PMOS Active Load for Communication Applications

Chih-Hsing Lin
Department of Communications Engineering, National Tsing Hua University, Hsin-Chu, Taiwan, R.O.C.
E-mail: [email protected]

Ching-Te Chiu
Department of Computer Science, National Tsing Hua University, Hsin-Chu, Taiwan, R.O.C.
E-mail: [email protected]

Abstract: In this paper, a wide-range DLL-based frequency multiplier with PMOS active load for communication applications is proposed. Adding the PMOS active load to the delay cells produces an inductive-peaking effect that widens the operating frequency range. The DLL-based frequency multiplier uses simple exclusive-or (XOR) gates and a phase blending technique for frequency multiplication. The frequency multiplier can generate N times the frequency of the input clock when the number of delay cells (N) in the VCDL is even. The output frequency of the proposed frequency multiplier ranges from 80MHz to 2.24GHz in a TSMC 0.18µm CMOS process. The lock time is 0.96µs at 400MHz. The peak-to-peak jitter is 46ps at 80MHz and 95.3ps at 2.24GHz. The power consumption of the proposed frequency multiplier is 25.79mW at 400MHz.

I. INTRODUCTION

With the advance of CMOS technologies, IC performance has evolved very rapidly. Many high-speed applications, such as microprocessors, memory ICs, and communication ICs, require high-frequency clocks. To maximize performance, clocks must be generated by on-chip frequency multipliers, and the generated clock must be closely synchronized with the master clock. Synchronization is a critical design issue for wired communication systems using Serializer/Deserializer (SERDES) interfaces [1] as well as for wireless communication systems such as CDMA [2]. Clock generation and synchronization can be handled by a phase-locked loop (PLL) or a delay-locked loop (DLL). DLLs are preferred in these applications because of their stability and better jitter performance.

DLLs are considered more stable than PLLs because they employ a first-order loop filter rather than the second-order filter used in PLLs. In addition, noise is not accumulated over cycles in the voltage controlled delay line (VCDL), whereas it is re-circulated in the oscillator of a PLL. Conventional DLLs, however, suffer from false locking or harmonic locking over wide operating ranges, and their operation range is limited by the minimum delay of the delay line.

In recent years, various DLL architectures [3-5] have been proposed to solve these problems. In [3], a wide-range analog DLL uses an unorthodox voltage controlled phase shifter for wide-range operation. However, this architecture is strongly affected by process variation. In [4], a digital self-correcting loop is used to overcome false locking and to shorten the locking time. However, digital DLLs suffer from large skew and large jitter because of the quantized delay time and the control-bit updates during operation. Increasing the delay range and resolution requires a large delay cell array, at the cost of large chip area and power consumption. To overcome these problems, dual-loop architectures have been proposed. In [5], an analog DLL incorporating a digital-controlled half-replica delay line (DHDL) is proposed. However, this dual-loop architecture still occupies a large chip area and consumes considerable power.

DLL-based frequency multipliers are increasingly employed in high-speed applications [6], and various DLL-based frequency multipliers have been proposed for clock generation [4][8][9]. In [4], a digital self-correcting DLL-based clock synthesizer uses AND-OR gates for frequency multiplication. This design suffers from large skews. In [8], a DLL-based local oscillator architecture uses an edge combiner for frequency multiplication. To enhance the load impedance at the output, this architecture requires an LC tank, which consumes a large chip area. Moreover, once the LC tank values are chosen, the frequency multiplication ratio is fixed. In [9], a frequency multiplier based on P-type and N-type CMOS inverters is proposed. This design requires 2N delay cells to generate an Nx output frequency.

In this paper, we propose a DLL-based frequency multiplier architecture. The delay cell in the VCDL includes PMOS active loads, and the frequency multiplier contains a phase blender and XOR gates. The PMOS active load has an inductive-peaking feature that increases the frequency bandwidth. The DLL provides eight phase signals to the frequency multiplier, which produces an output at eight times the input clock frequency. The simulation results show that the proposed DLL operates over a wide frequency range without false locking while achieving low jitter and low power consumption.

II. CONVENTIONAL DLL ARCHITECTURE AND LOCK RANGE PROBLEM

Fig. 1 shows the simplified block diagram of a conventional DLL architecture, which comprises a phase frequency detector (PFD), a charge pump (CP), a voltage controlled delay line (VCDL) [6], and a loop filter (LF) that is usually a first-order filter formed by a capacitor.

Figure 1. Conventional DLL architecture.

Conventional DLLs may suffer from harmonic locking or false locking over wide operating ranges. The main design issue is that the maximum and minimum delays of the VCDL must fall within a certain range. If TVCDL.max (maximum delay of the VCDL) and TVCDL.min (minimum delay of the VCDL) do not satisfy the following inequalities, false locking will occur:

$$0.5\,T_{\mathrm{REF\_CLK}} < T_{\mathrm{VCDL.min}} < T_{\mathrm{REF\_CLK}}, \qquad T_{\mathrm{REF\_CLK}} < T_{\mathrm{VCDL.max}} < 1.5\,T_{\mathrm{REF\_CLK}}. \tag{1}$$

We can reformulate these inequalities in terms of TREF_CLK as

$$\max\left[\tfrac{2}{3}\,T_{\mathrm{VCDL.max}},\; T_{\mathrm{VCDL.min}}\right] < T_{\mathrm{REF\_CLK}} < \min\left[2\,T_{\mathrm{VCDL.min}},\; T_{\mathrm{VCDL.max}}\right]. \tag{2}$$

When the conditions of Eqs. (1) and (2) hold, the harmonic locking problem is avoided. Examples are shown in Fig. 2. In Fig. 2(a), when the control voltage (vctrl) is at its minimum, TVCDL.max must lie between TREF_CLK and 1.5 TREF_CLK. Conversely, when vctrl is at its maximum, TVCDL.min must lie between 0.5 TREF_CLK and TREF_CLK, as shown in Fig. 2(b).

In addition, the initial value of TVCDL is unknown at loop start-up, so a false lock can still occur. To solve this problem, an extra start-up loop sets the initial voltage of the loop filter to VDD. The delay time (TVCDL) of the VCDL is therefore initially at its minimum, and the down signal of the PFD becomes active before the up signal. The control voltage of the loop filter then decreases and the delay time of the VCDL increases until the VCDL output clock and the reference clock are in phase.
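The window in Eq. (2) can be evaluated numerically for a given delay line. The following is a minimal sketch, not taken from the paper: the function name `ref_period_window` and the example delay values are hypothetical, and all times are assumed to be in the same unit (here nanoseconds).

```python
def ref_period_window(t_vcdl_min, t_vcdl_max):
    """Return the open interval (lower, upper) of reference periods allowed
    by Eq. (2), or None when no reference period satisfies it."""
    lower = max(2.0 * t_vcdl_max / 3.0, t_vcdl_min)
    upper = min(2.0 * t_vcdl_min, t_vcdl_max)
    # Eq. (2) uses strict inequalities, so an empty or degenerate window
    # means the delay line cannot lock safely at any reference period.
    return (lower, upper) if lower < upper else None


# Hypothetical delay line: 2 ns minimum delay, 5 ns maximum delay.
print(ref_period_window(2.0, 5.0))
# Hypothetical delay line whose range is too wide for any safe window.
print(ref_period_window(1.0, 4.0))
```

A delay line whose maximum delay exceeds three times its minimum delay never yields a valid window, because (2/3)·TVCDL.max then exceeds 2·TVCDL.min.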

Figure 3. Block diagram of the proposed DLL-based frequency multiplier.

III. WIDE RANGE DLL-BASED FREQUENCY MULTIPLIER

The architecture of the proposed DLL-based frequency multiplier is shown in Fig. 3. The DLL core and the frequency multiplier are the two major blocks in this design. The DLL core consists of a VCDL core, a start-up circuit, a PFD, a CP, and an LF. The PMOS active load is added to the delay cell in the VCDL to raise the operating frequency. The start-up circuit sets the initial control voltage of the loop filter to VDD so that harmonic locking and false locking are avoided. The VCDL core generates eight-phase low-frequency clocks that are mixed in the frequency multiplier to produce the desired high-frequency clock. The functionality of each block is described below.
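The role of the start-up circuit can be illustrated with a behavioral model. This is a minimal sketch under strong assumptions that are not from the paper: the loop is modeled as a discrete first-order update, the detector is idealized as a wrapped phase error that pulls the delay toward the nearest integer multiple of the reference period, and the function names, loop gain, and delay values are all hypothetical.

```python
def wrapped_phase_error(delay, t_ref):
    """Signed distance from `delay` to the nearest integer multiple of t_ref."""
    return (t_ref / 2.0 - delay) % t_ref - t_ref / 2.0


def simulate_lock(t_ref, start_delay, t_min, t_max, gain=0.2, steps=200):
    """Iterate an idealized first-order loop and return the settled VCDL delay."""
    delay = start_delay
    for _ in range(steps):
        delay += gain * wrapped_phase_error(delay, t_ref)
        # The VCDL cannot leave its physical delay range.
        delay = min(max(delay, t_min), t_max)
    return delay


T_REF = 2.5                 # ns, i.e. a 400MHz reference (assumed example)
T_MIN, T_MAX = 1.5, 6.0     # ns, deliberately wide range to expose the problem

# Start-up at VDD: the delay begins at its minimum and settles at one period.
print(simulate_lock(T_REF, T_MIN, T_MIN, T_MAX))
# Arbitrary start-up value: the loop settles at two periods (harmonic lock).
print(simulate_lock(T_REF, 4.0, T_MIN, T_MAX))
```

In this model the outcome depends only on where the delay starts, which is why forcing the initial control voltage to VDD (minimum delay) steers the loop to the correct lock point.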

A. Phase Frequency Detector (PFD)

The phase frequency detector detects the phase difference between the reference clock (Ref_clk) and the VCDL feedback clock (VCDL_clk), as shown in Fig. 1. Many flip-flop-based PFDs suffer from a dead zone, which causes large phase jitter. Minimizing the dead zone is therefore indispensable for good jitter performance. In this paper, we use the Half Transparent (HT) register to reduce the dead zone [12].

B. Charge Pump (CP) and VCDL with PMOS active load

The charge pump circuit responds to the phase difference between the up and down signals from the PFD. The charge pump circuit proposed in [9] is employed in this work. The VCDL comprises delay cells with replica bias. Based on the control voltage, the VCDL generates an appropriate delay time in each delay cell and feeds the delayed clock back to the PFD. The VCDL with PMOS active load [10] is shown in Fig. 4. It includes an active inductor formed by PMOS transistors that behave like active resistors connected to the NMOS transistor loads. They act as on-chip inductors and provide an inductive-peaking effect, which increases the operation speed. Therefore, we use the PMOS active load to replace the symmetric load in the delay cell of the VCDL [11].

Figure 2. DLL locking operation range for lock-correct cases.

C. Phase Blending and Frequency Multiplier

A phase blender circuit is shown in Fig. 5 [14]. It takes two adjacent phases, ΦA and ΦB, as inputs. The phase blender receives these two signals with some delay, and a pair of phase-blending inverters generates the phase ΦAB that lies between ΦA and ΦB.

The DLL-based frequency multiplier uses simple exclusive-or (XOR) gates and the phase blending technique to achieve frequency multiplication. It generates an output frequency that is N times the input frequency, where N is the number of phases of the input clock, an even number with N ≥ 2. The output clock (OUTCLK) can be expressed as

$$\mathrm{OUTCLK} = N \times \mathrm{REFCLK}, \tag{3}$$

where REFCLK is the reference input clock and OUTCLK is the multiplied output clock. The frequency multiplier generates four sets of 2x REFCLK, two sets of 4x REFCLK, and one set of 8x REFCLK, as shown in Fig. 6 and Fig. 7.

We choose two pairs of phases, (Φ1, Φ3) and (Φ2, Φ4), as inputs to the XOR gates, which yields two 2x output clocks (Φ13 and Φ24). Feeding this pair to a further XOR gate (Φ13 ⊕ Φ24) yields a 4x output frequency. As shown in Fig. 7, a similar method achieves 2x and 4x output frequencies using phase blending and XOR gates. The multiplication factor N of the proposed frequency multiplier can be chosen according to the number of delay stages in the VCDL. The overall architecture for multiplying the input frequency by 8 is shown in Fig. 8.
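The XOR-based multiplication can be checked with an idealized logic-level model. The sketch below is an illustration under assumptions that are not from the paper: the clocks are ideal 50% duty-cycle square waves on an integer time grid, the M phases combined for an Mx output are spaced by T/(2M), and the function names and the 1600-sample period are hypothetical. Because XOR is associative, combining all M phases in one loop is equivalent to the pairwise XOR tree described in the text.

```python
def square_wave(t, period, offset):
    """Ideal 50% duty-cycle clock sampled at integer time t, delayed by offset."""
    return 1 if (t - offset) % period < period // 2 else 0


def xor_multiply(period, factor):
    """XOR `factor` phases spaced period/(2*factor) apart over one period."""
    step = period // (2 * factor)
    out = []
    for t in range(period):
        bit = 0
        for k in range(factor):
            bit ^= square_wave(t, period, k * step)
        out.append(bit)
    return out


def count_rising_edges(wave):
    """Count 0->1 transitions, treating the waveform as periodic."""
    return sum(1 for i in range(len(wave)) if wave[i - 1] == 0 and wave[i] == 1)


PERIOD = 1600  # samples per reference period (assumed grid)
for factor in (2, 4, 8):
    edges = count_rising_edges(xor_multiply(PERIOD, factor))
    print(factor, edges)
```

In this model each input phase contributes two toggles per reference period at distinct instants, so the XOR output toggles 2M times and completes M cycles per reference period. The 8x case needs phases spaced by T/16, which is finer than the T/8 spacing of an eight-stage delay line and corresponds to the intermediate phases that phase blending supplies.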


Figure 8. Block Diagram of the proposed Frequency Multiplier.

IV. SIMULATION RESULTS


Figure 4. VCDL with PMOS active load.

Figure 5. Phase Blender.

This work has been implemented in TSMC 0.18µm CMOS technology. The simulation results show that the DLL-based frequency multiplier operates from 80MHz to 2.24GHz, with a maximum input clock of 280MHz. The DLL core operates from 80MHz to 400MHz with the PMOS active load and from 70MHz to 350MHz without it. At 350MHz, the jitter is 11ps with the PMOS active load and 14.8ps without it. The peak-to-peak jitter is 46ps at 80MHz and 95.3ps at 2.24GHz. Fig. 9 shows the timing diagram of the DLL in the locked state at 80MHz. The eye diagram at 350MHz with jitter information is shown in Fig. 10. Fig. 11 depicts the peak-to-peak jitter characteristics of the DLL at different operating frequencies. Fig. 12(a) shows the waveforms of the 2, 4, and 8 times output clocks produced by the proposed circuit with a 280MHz input reference clock. The output waveform versus the input reference waveform is shown in Fig. 12(b). Table I summarizes the performance of the proposed DLL, and Table II compares it with related methods. From Table II, we observe that the proposed DLL exhibits better jitter performance than the related works.
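The reported numbers can be put in proportion with a short calculation. This is only an arithmetic sketch: the helper names are hypothetical, and expressing jitter as a fraction of the clock period is an added point of view, not a metric reported in the paper.

```python
def multiplied_frequency_hz(f_ref_hz, factor):
    """Output frequency of an ideal multiplier, following Eq. (3)."""
    return f_ref_hz * factor


def jitter_fraction_of_period(jitter_s, f_hz):
    """Peak-to-peak jitter divided by the clock period (jitter * frequency)."""
    return jitter_s * f_hz


# Maximum input clock of 280MHz multiplied by 8.
f_out = multiplied_frequency_hz(280e6, 8)
print(f"{f_out / 1e9:.2f} GHz")

# Reported peak-to-peak jitter at both ends of the multiplier range.
for jitter_ps, f_hz in [(46.0, 80e6), (95.3, 2.24e9)]:
    fraction = jitter_fraction_of_period(jitter_ps * 1e-12, f_hz)
    print(f"{jitter_ps} ps at {f_hz / 1e6:.0f} MHz = {100 * fraction:.2f}% of the period")
```

The first result confirms that the 2.24GHz upper limit corresponds to the 8x output at the 280MHz maximum input. The second shows that the same order of absolute jitter occupies a much larger share of the period at the top of the range than at 80MHz.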

Figure 6. The 2x and 4x output clocks with XOR gates.

Figure 7. The 2x, 4x, and 8x output clocks with phase blending and XOR gates.

Figure 9. DLL in locking state at 80MHz.

Figure 10. Eye diagram at 350MHz.

Figure 11. Jitter characteristics of the DLL for different operating frequencies (horizontal axis: frequency in MHz; vertical axis: peak-to-peak jitter in ps).

V. CONCLUSION

In this paper, a VCDL delay cell with PMOS active load has been proposed. It extends the operational range from 80MHz to 2.24GHz. The proposed DLL-based frequency multiplier can be used in high-speed serial links and personal communication services. The simulation results show that the critical problem of false locking over a wide frequency range has been overcome, and that the design achieves better jitter performance and lower power than previous DLL methods. The power consumption is 25.79mW at 400MHz.

Figure 12. Output and input waveforms: (a) 2, 4, and 8 times frequency output clocks and (b) output waveform versus input reference waveform.

TABLE I. PERFORMANCE SUMMARY

| Parameter | Value |
| --- | --- |
| Process | 0.18-um TSMC CMOS process |
| Operation voltage | 1.8V |
| DLL operation frequency range | 80MHz to 400MHz |
| Frequency multiplier operation range | 80MHz to 2.24GHz |
| Multiplication factor | 2X, 4X, 8X |
| Peak-to-peak jitter | 46ps @ 80MHz; 17.1ps @ 400MHz; [email protected] |
| Locking time | 0.96us @ 400MHz |
| Power consumption (DLL core) | 25.79mW @ 400MHz |

TABLE II. PERFORMANCE EVALUATION

| | [2] | [10] | [11] | [13] | Ours |
| --- | --- | --- | --- | --- | --- |
| Process | 0.35-um | 0.18-um | 0.35-um | 0.35-um | 0.18-um |
| Operating voltage | 3.3V | 1.8V | 3.0~3.6V | 3.3V | 1.8V |
| Operating frequency range | 62.5~250MHz | 31.25~125MHz | 100~140MHz | 6~130MHz | 80~400MHz |
| Peak-to-peak jitter | 44ps @ 250MHz | 30.67 @ 125MHz | 51.6ps @ 125MHz | 24.3ps @ 130MHz | 46ps @ 80MHz; 15ps @ 150MHz; 10.8ps @ 250MHz |
| Locking time | X | 0.432us | X | X | 0.96us @ 400MHz |
| Power | X | 32mW @ 125MHz | X | 132mW @ 130MHz | 25.79mW @ 400MHz (simulated) |

REFERENCES

[1] M. V. Lau, S. Shieh, P.-F. Wang, et al., "Gigabit Ethernet switches using a shared buffer architecture," IEEE Communications Magazine, vol. 41, issue 12, pp. 76-84, Dec. 2003.

[2] J. J. Olmos and R. Agusti, "Performance analysis of a second order delay-lock loop with application to a CDMA system with multipath propagation," IEEE ICUPC Proceedings, pp. 08.04/1-08.04/5, Sept. 1992.

[3] T. H. Lee et al., "A 2.5-V CMOS delay-locked loop for an 18-Mbit 500-Mbyte/s DRAM," IEEE J. Solid-State Circuits, vol. 29, pp. 1491-1496, Dec. 1994.

[4] T. Liu and C. Wang, "A 1-4 GHz DLL based low-jitter multi-phase clock generator for low-band ultra-wideband application," IEEE Asia-Pacific Conf. on Adv. Sys. Integrated Circuits, pp. 330-333, Aug. 2004.

[5] H. H. Chang, J. W. Lin, and S. I. Liu, "A fast locking and low jitter delay-locked loop using DHDL," IEEE J. Solid-State Circuits, vol. 38, pp. 343-346, Feb. 2003.

[6] Y. Moon, J. Choi, K. Lee, D. K. Jeong, and M. K. Kim, "An All-Analog Multiphase Delay-Locked Loop Using a Replica Delay Line for Wide-Range Operation and Low-Jitter Performance," IEEE J. Solid-State Circuits, vol. 35, no. 3, pp. 377-384, Mar. 2000.

[7] H. H. Chang, R. J. Yang, and S. I. Liu, "Low jitter and multirate clock and data recovery circuit using a MSADLL for chip-to-chip interconnection," IEEE Trans. Circuits and Systems-I: Regular Papers, vol. 51, pp. 2356-2364, Dec. 2004.

[8] G. Chien and P. R. Gray, "A 900-MHz local oscillator using a DLL-based frequency multiplier technique for PCS applications," IEEE J. of Solid-State Circuits, vol. 35, pp. 1996-1999, 2000.

[9] R. M. Weng, T. H. Su, C. Y. Liu, and Y. F. Kuo, "A CMOS Delay-Locked Loop Based Frequency Multiplier for Wide-range Operation," IEEE Conf. on Electron Devices and Solid-State Circuits, pp. 419-422, 2005.

[10] M. S. Kao, C. H. Jen, C. T. Chiu, et al., "A 10 Gb/s Wide-Band Current-Mode Logic I/O Interface for High-Speed Interconnect in 0.18µm CMOS Technology," IEEE ISSOC, pp. 257-260, Sept. 2005.

[11] A. Chandrakasan, W. J. Bowhill, and F. Fox, Design of High-Performance Microprocessor Circuit. New York, IEEE Press, pp. 240, 2001.

[12] P. Heydari and M. Pedram, "Jitter-induced power/ground noise in CMOS PLLs: a design perspective," IEEE Computer Design, pp. 209-213, Sept. 2001.

[13] H. Chang, J. Lin, C. Yang, and S. Liu, "A wide-range delay-locked loop with a fixed latency of one clock cycle," IEEE J. Solid-State Circuits, vol. 37, no. 8, pp. 1021-1027, Aug. 2002.

[14] W. Garlepp, K. S. Donnelly, J. Kim, et al., "A portable digital DLL for high-speed CMOS interface circuits," IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 632-644, May 1999.