A CMOS PWM Transceiver Using Self-Referenced Edge Detection

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 6, JUNE 2015


Transactions Briefs

Kiichi Niitsu, Yusuke Osawa, Naohiro Harigai, Daiki Hirabayashi, Osamu Kobayashi, Takahiro J. Yamaguchi, and Haruo Kobayashi

Abstract— A CMOS pulsewidth modulation (PWM) transceiver circuit that exploits the self-referenced edge detection technique is presented. By comparing the rising edge that is self-delayed by about 0.5T with the modulated falling edge in one carrier clock cycle, area-efficient edge detection that is highly robust against timing fluctuations is achieved, enabling PWM communication without elaborate phase-locked loops. Since the proposed self-referenced edge detection circuit can measure timing error by changing the length of the self-delay element, adaptive data-rate optimization and delay-line calibration are realized. Measured results with a 65-nm CMOS prototype demonstrate 2-bit PWM communication, a high data rate (3.2 Gb/s), and high reliability (BER < 10⁻¹²) with a small area (540 µm²). For reliability improvement, error check and correction associated with intercycle edge detection is introduced, and its effectiveness is verified by 1-bit PWM measurement.

Index Terms— CMOS, jitter, pulsewidth modulation (PWM), self-referenced, transceiver.

Fig. 1. Conceptual diagram of the conventional PWM receiver in 2-bit PWM.

Fig. 2. Conceptual diagram of the proposed PWM receiver in 2-bit PWM.

I. INTRODUCTION

The requirement of decreased power supply voltage for CMOS device scaling motivates the development of time-domain circuits. For such circuits, the pulsewidth modulation (PWM) scheme is a promising approach and is widely used in many applications, such as wireline transceivers [1]–[3], CMOS imagers [5], and biosensor arrays [4]. However, conventional PWM transceiver designs [1]–[3] require large-area and power-hungry phase-locked loops (PLLs) for multiphase sampling clock generation.

This brief describes the design of an area-efficient and highly robust PWM transceiver using the newly proposed self-referenced edge detection. The proposed PWM transceiver exploits a timing comparison between the rising edge that is self-delayed by about 0.5T and the data-modulated falling edge in one carrier clock cycle. Adaptive data-rate optimization and an error check and correction (ECC) technique are also introduced.

This brief is organized as follows. The proposed PWM transceiver using self-referenced edge detection is introduced in Section II. Section III introduces adaptive data-rate optimization. Section IV presents ECC. The test chip design and the measurement results are summarized in Sections V and VI. A discussion is given in Section VII. Section VIII concludes this brief.

Manuscript received October 15, 2013; revised February 26, 2014; accepted April 8, 2014. Date of publication May 29, 2014; date of current version May 20, 2015. This work was supported in part by Japan Science and Technology, in part by the Ministry of Economy, Trade and Industry, in part by STARC and VLSI Design and Education Center, in part by the Grant-in-Aid for Young Scientist (B) under Grant 24760266, in part by the Grant-in-Aid for Scientific Research (S) under Grant 20226009 and Grant 25220906, and in part by the University of Tokyo in collaboration with Cadence Design Systems, Inc.
K. Niitsu is with the Department of Electronic Engineering and Computer Science, Graduate School of Engineering, Nagoya University, Nagoya 464-8603, Japan (e-mail: [email protected]).
Y. Osawa, N. Harigai, D. Hirabayashi, and H. Kobayashi are with the Division of Electronics and Informatics, Faculty of Science and Technology, Gunma University, Kiryu 376-8515, Japan (e-mail: t09306014@gunma-u.ac.jp; [email protected]; [email protected]).
O. Kobayashi is with STARC, Yokohama 222-0033, Japan.
T. J. Yamaguchi is with Advantest Laboratories Ltd., Sendai 989-3124, Japan (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2014.2321393

Fig. 3. Circuit implementation of the proposed PWM transceiver architecture in 2-bit PWM.

II. PWM TRANSCEIVER USING SELF-REFERENCED EDGE DETECTION

Figs. 1 and 2 show conceptual diagrams of the conventional and proposed receiver designs for 2-bit PWM. A conventional PWM receiver [1] exploits a PLL for generating time-shifted sampling clocks. Our newly proposed PWM receiver using a self-referenced edge detector employs latches as timing comparators and a thermometer-binary decoder. The proposed self-referenced edge detection utilizes the self-delayed rising edge instead of the PLL's sampling clock. By comparing this self-delayed rising edge with the data-modulated falling edge in one carrier clock cycle, edge detection can be realized. The proposed structure removes the elaborate PLL; hence, high area efficiency is obtained.

Fig. 3 shows the schematic diagram of the proposed transceiver architecture. The transmitter consists of a duty controller, four delay elements whose lengths are multiples of the modulation factor, T, a selector

1063-8210 © 2014 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 4. Conceptual diagram of the adaptive data-rate optimization and delay-line calibration associated with timing jitter measurement using self-referenced edge detection.

Fig. 6. Schematic diagram of the proposed receiver with ECC capability for 1-bit PWM.

Fig. 5. Conceptual image of the proposed ECC circuit.

Fig. 7. Schematic diagram of the proposed ECC circuit containing latches and error detecting logic for 1-bit PWM.

controlled by the PWM modulator, and OR logic. The receiver consists of three delay elements with various lengths of about half the clock period, 0.5T, latches, and a thermometer-binary decoder. The outputs of the three latches form a thermometer code that is converted to binary code by the thermometer-binary decoder.

III. ADAPTIVE DATA-RATE OPTIMIZATION

Fig. 4 shows a conceptual diagram of the adaptive data-rate optimization and delay-line calibration. The receiver has the same configuration as the reference-clock-free timing jitter measurement circuit using a self-referenced clock with nT delay [6]. In [6], the nT delay generates two uncorrelated edges. By comparing the two uncorrelated edges, √2 times the timing jitter can be obtained. Thus, timing error measurement is realized simply by changing the length of the delay. Based on the obtained timing error information of the carrier clock, the modulation factor, T, and the carrier clock frequency can be optimized. Since the timing error of the carrier clock primarily determines the bit error rate, smaller jitter enables both higher modulation and a higher carrier clock frequency, which results in a higher data rate. Moreover, the delay lines for communication can be calibrated while the timing error measurement is being processed.

IV. ERROR CHECK AND CORRECTION

This section introduces the ECC circuit for improving the performance of the proposed PWM transceiver using self-referenced edge detection. Fig. 5 shows the conceptual image of the proposed ECC technique. The auxiliary receiver detects the intercycle time difference of the adjacent falling edges, as shown in Fig. 5. The delay length of the auxiliary receiver can be determined from the autocorrelation function of the jitter of the carrier clock [7], [8]. In a typical situation, the correlation decreases as the number of cycles between the preceding and succeeding edges increases. Thus, a larger number of interreference cycles is expected to be effective for error detection. However, a large number of interreference cycles incurs an area penalty. Therefore, the tradeoff between ECC effectiveness and area overhead has to be considered in designing the ECC circuits.

Fig. 6 shows the conceptual image of the ECC function in 1-bit PWM, which is enabled by comparing the received data from the main receiver with that from the auxiliary receiver. Since the implemented ECC uses only 1T-delayed edges for error correction, successive errors cannot be recovered. As stated above, instead of the 1T delay in Fig. 6, a 0.5T delay (comparing with the latter rising edges instead of the earlier rising edges) or a longer delay is possible. Thus, the proposed technique is flexible and can be optimized according to the characteristics of the timing error.

Fig. 7 shows the schematic diagram of the ECC circuit for 1-bit PWM. The ECC circuit consists of latches and an error detector. ECC utilizes the successive received data from the main receivers, D1 and D2, and data from the auxiliary receivers, A+ and A−. An error code is generated when a discrepancy between the received data from the main and auxiliary receivers occurs, and the original RX bit is inverted.
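The receiver operation described in Section II (three self-delayed rising edges compared against the modulated falling edge, followed by thermometer-to-binary decoding) can be illustrated with a small behavioral sketch. It is not the authors' circuit: the function names are hypothetical, and the exact edge placement is an assumption, namely that the four falling-edge positions are centered on half the period and the three receiver delays sit midway between them. The 625-ps period corresponds to the 1.6-GHz carrier and the 15-ps modulation factor to the 2-bit test chip.

```python
from typing import List

PERIOD_PS = 625.0  # period of the 1.6-GHz carrier clock used for 2-bit PWM
STEP_PS = 15.0     # modulation factor of the 2-bit test chip


def falling_edge_ps(symbol: int, jitter_ps: float = 0.0) -> float:
    """Falling-edge time after the rising edge for a 2-bit symbol.

    Assumption: the four edge positions are centered on half the period.
    """
    if not 0 <= symbol <= 3:
        raise ValueError("2-bit PWM symbols are 0..3")
    return 0.5 * PERIOD_PS + (symbol - 1.5) * STEP_PS + jitter_ps


def latch_outputs(fall_ps: float) -> List[int]:
    """Three timing comparators (latches).

    Each latch outputs 1 if the falling edge arrives after the rising edge
    delayed by that branch. Assumption: the branch delays are 0.5T - step,
    0.5T, and 0.5T + step.
    """
    delays_ps = [0.5 * PERIOD_PS + (k - 1) * STEP_PS for k in range(3)]
    return [1 if fall_ps > d else 0 for d in delays_ps]


def thermometer_to_binary(code: List[int]) -> int:
    """A valid thermometer code (e.g. [1, 1, 0]) decodes to its count of ones."""
    return sum(code)


def receive(symbol: int, jitter_ps: float = 0.0) -> int:
    """Transmit one symbol with optional edge jitter and decode it."""
    return thermometer_to_binary(latch_outputs(falling_edge_ps(symbol, jitter_ps)))
```

Under these assumptions the decision margin is half the modulation factor: a timing error below 7.5 ps leaves the decoded symbol unchanged, while a larger error moves it to the neighboring symbol, which is why the jitter of the carrier clock bounds the usable modulation factor.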


Fig. 8. Microphotograph of the test chips that are fabricated in 65-nm CMOS.


Fig. 10. Measured waveform of the carrier clock, transmitted and received data, and error code under nonsuccessive 1-bit error. ECC operation was successfully verified. Data rate is 3.2 Gb/s, and carrier clock frequency is 3.2 GHz.

Fig. 9. Measured waveform of the transmitted and received data with 2-bit PWM. A shifted PRBS was utilized, and RX data are inverted due to the test chip design. Data rate is 3.2 Gb/s, and carrier clock frequency is 1.6 GHz.

Fig. 11. Measured waveform of the carrier clock, transmitted and received data, and error code under 2-bit successive error. Toggle pattern is utilized for verification. The proposed ECC cannot recover 2-bit successive error as designed. Data rate is 3.2 Gb/s, and carrier clock frequency is 3.2 GHz.

V. TEST CHIP DESIGN AND MEASUREMENT SETUP

To verify the effectiveness of the proposed technique, a test chip was fabricated using 65-nm CMOS technology, as shown in Fig. 8. The footprints of the circuits are 20 μm × 27 μm (2-bit PWM, T = 15 ps, without ECC) and 20 μm × 39 μm (1-bit PWM, T = 30 ps, with ECC). Two versions were implemented and tested to confirm multibit operation and ECC feasibility separately. The common offset of the inverter-based delay line was shared by all delay lines for a compact layout. An on-chip interconnect with a length of 3.2 μm was implemented as the communication channel. The target applications include mobile applications [3] and large-scale sensor arrays [4], [5], where the communication channel is short. The input carrier clock and TX data were fed into the circuit by a BERTS (Agilent 81250; timing jitter of 1.6 ps RMS). Input and output signals were captured with a sampling oscilloscope (Tektronix DSA71254B).

VI. MEASUREMENT RESULT

Fig. 9 shows the measured waveforms of the input and output signals. A carrier clock and a shifted 2-bit pseudorandom bit sequence (PRBS) were fed into the test chip. Data communication was successfully verified at a data rate of 3.2 Gb/s (= 1.6 Gb/s/bit × 2 bit). The data rate is limited by the measurement setup: the maximum data rate of the BERTS (Agilent 81250) is 3.2 Gb/s. High reliability with a BER < 10⁻¹² was also verified with the BERTS.

Fig. 10 shows the measured waveform for verifying ECC operation. To create a nonsuccessive 1-bit error situation, a carrier clock with a large timing jitter occurring with a probability of 1/16 was utilized. The large timing jitter was generated by changing the duty cycle. The ECC successfully recovered the received data and generated an error code with a probability of 1/16.

Fig. 11 shows the measured waveform for demonstrating the limitations of ECC operation. To create a 2-bit successive error situation, a carrier clock with a large timing jitter occurring with a probability of 2/16 (two successive times every 16 cycles) was utilized. As designed, the ECC could not recover the received data.

The performance summary is shown in Table I. Since the proposed PWM transceiver requires neither PLL nor delay-locked loop (DLL) circuits, the circuit area can be minimized. Even when technology scaling from 0.18-μm CMOS to 65-nm CMOS is taken into account, the proposed circuit is competitive in terms of area efficiency. The BER measurement was performed with a carrier clock having lower jitter than those in other related works. However, state-of-the-art PLLs [9], [10] can generate clocks with subpicosecond jitter. Thus, when combined with a low-jitter clock generator in the TX, the proposed technique is feasible without a PLL or DLL in the RX.
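The headline figures in Sections V and VI follow from simple arithmetic, which the following minimal sketch reproduces. The helper names are hypothetical; the input numbers are the footprints, carrier frequencies, and bits per cycle reported in the text and figure captions. The 780-µm² value for the 1-bit version with ECC is computed here from its stated footprint and is not quoted in the brief.

```python
def footprint_um2(width_um: float, height_um: float) -> float:
    """Rectangular layout area in square micrometers."""
    return width_um * height_um


def data_rate_gbps(bits_per_cycle: int, carrier_ghz: float) -> float:
    """Each carrier clock cycle carries one PWM symbol of `bits_per_cycle` bits."""
    return bits_per_cycle * carrier_ghz


# 2-bit PWM without ECC: 20 um x 27 um, 1.6-GHz carrier
two_bit_area = footprint_um2(20, 27)
two_bit_rate = data_rate_gbps(2, 1.6)

# 1-bit PWM with ECC: 20 um x 39 um, 3.2-GHz carrier
one_bit_area = footprint_um2(20, 39)
one_bit_rate = data_rate_gbps(1, 3.2)

print(f"2-bit: {two_bit_area:.0f} um^2, {two_bit_rate:.1f} Gb/s")
print(f"1-bit + ECC: {one_bit_area:.0f} um^2, {one_bit_rate:.1f} Gb/s")
```

Both configurations reach the same 3.2-Gb/s rate, the 2-bit version at half the carrier frequency of the 1-bit version.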

VII. DISCUSSION

A. Another Topology of the Proposed CMOS PWM Transceiver Using Self-Referenced Edge Detection

This section introduces another topology of the proposed CMOS PWM transceiver using self-referenced edge detection. Instead of using both rising and falling edges, this topology utilizes only rising or only falling edges. Since this topology can remove the limitation of implementation, it expands the application of the proposed technique. Fig. 12 shows the conceptual image of another topology of the proposed CMOS PWM transceiver using self-referenced edge detection. It is well known that timing jitter can be expressed as the accumulation of period jitter [11]. By applying this characteristic to PWM, PWM edges can be expressed by the accumulation of the period. For instance, when the previous bit is 0 and the period, M−M+, is 00, the current data become 1, as shown in the bottom part of Fig. 12.

Fig. 12. Conceptual image of another topology of the proposed CMOS PWM transceiver using self-referenced edge detection.

B. Compatibility With Spread Spectrum Clocking

Since the proposed technique detects edges within the same clock cycle, it has good compatibility with the spread spectrum clocking technique [12]. The conventional techniques with a PLL or DLL [1]–[3] have difficulty in recovering the clock signal when both the rising and falling edges are modulated. On the other hand, the proposed technique can achieve edge detection because it utilizes intracycle clock edges. Therefore, the proposed technique has better compatibility with the spread spectrum clocking technique than the conventional techniques.

TABLE I PERFORMANCE SUMMARY

C. Dynamic Range of the Proposed Transceiver

This section analyzes the dynamic range of the proposed CMOS PWM transceiver using self-referenced edge detection. In this brief, we implemented a 3.2-μm on-chip interconnect as the communication channel and employed a carrier clock with 1.6-ps jitter to check the function of the proposed transceiver. Measurement of the test chip fabricated in 65-nm CMOS technology showed that the proposed technique is feasible with a 3.2-μm on-chip interconnect and a 1.6-ps-jitter carrier clock. However, more realistic communication channels and carrier clock signals have to be considered to make the proposed transceiver practical for commercial applications.

First, the dynamic range for the length of the communication channel is analyzed. The dynamic range for jitter measurement is determined by the combination of T and T in Fig. 4. To measure the timing jitter for guaranteeing a BER of less than 10⁻¹², the probability density function within 3σ of the timing jitter has to be measured [6]. Thus, 2T (the jitter measurement range is from −2T to 2T) must be designed to be greater than 3σ. The communication channel model in the literature [13] indicates that the additional jitter is approximately 8 ps for a 12-in interconnect. Thus, the jitter increases from 1.6 to 9.6 ps when the interconnect is lengthened. Since a BER of less than 10⁻¹² requires T to be greater than approximately 1.5σ, T must be 14.4 ps. In 2-bit PWM, the maximum clock frequency is determined by the inverse of 8T. Therefore, the proposed technique can maintain a data rate of 8.68 Gb/s while guaranteeing a BER of less than 10⁻¹² even with a 12-in interconnect.

Second, the dynamic range for the data rate is analyzed. The proposed technique exploits the fixed T, as shown in Fig. 3. The test chip embedded a T of 15 ps for 2-bit PWM. The maximum clock frequency is 8.33 GHz, and the maximum data rate is 16.66 Gb/s. In this brief, we have implemented the fixed T and T to check the function of the proposed technique.
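The jitter budget in this analysis can be retraced numerically. The sketch below is only an illustration with hypothetical function names: it follows the text in adding the clock and channel jitter linearly (1.6 ps + 8 ps), applies the stated 1.5σ rule for the modulation factor, and takes the maximum clock frequency of 2-bit PWM as the inverse of eight modulation factors. The two-bits-per-cycle data rate is an assumption carried over from the 2-bit measurement.

```python
def total_jitter_ps(clock_jitter_ps: float, channel_jitter_ps: float) -> float:
    """Jitter seen by the receiver; the text sums the two terms linearly."""
    return clock_jitter_ps + channel_jitter_ps


def min_modulation_factor_ps(sigma_ps: float, margin: float = 1.5) -> float:
    """Smallest modulation factor for BER < 1e-12 under the 1.5-sigma rule."""
    return margin * sigma_ps


def max_clock_ghz(modulation_factor_ps: float, slots: int = 8) -> float:
    """Maximum carrier frequency: inverse of 8 modulation factors (2-bit PWM)."""
    return 1e3 / (slots * modulation_factor_ps)


def data_rate_gbps(clock_ghz: float, bits_per_cycle: int = 2) -> float:
    """Assumption: one 2-bit symbol per carrier clock cycle."""
    return bits_per_cycle * clock_ghz


# Case 1: 12-in interconnect
sigma = total_jitter_ps(1.6, 8.0)             # about 9.6 ps
step_12in = min_modulation_factor_ps(sigma)   # about 14.4 ps
f_12in = max_clock_ghz(step_12in)             # about 8.68 (GHz)

# Case 2: test chip with a fixed 15-ps modulation factor
f_chip = max_clock_ghz(15.0)                  # about 8.33 GHz
rate_chip = data_rate_gbps(f_chip)            # about 16.67 Gb/s

print(f"12-in channel: sigma={sigma:.1f} ps, step={step_12in:.1f} ps, f_max={f_12in:.2f} GHz")
print(f"test chip: f_max={f_chip:.2f} GHz, rate={rate_chip:.2f} Gb/s")
```

For the 12-in case, the inverse of 8 × 14.4 ps evaluates to about 8.68, the same figure quoted in the text; for the test chip the sketch gives 8.33 GHz and about 16.67 Gb/s, which the text reports as 16.66 Gb/s.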
However, if redundancy in T and T in Fig. 4 is implemented, the proposed technique can be applied to various data rates and communication channels. For example, when M kinds of T are implemented, M kinds of data rate become feasible.

VIII. CONCLUSION

A CMOS PWM transceiver circuit using the self-referenced edge detection technique has been demonstrated for the first time. By comparing the self-delayed rising edge with the modulated falling edge, edge detection was realized. This edge detection enables area-efficient and highly robust PWM communication without PLLs. Data-rate optimization and ECC with intercycle edge detection were introduced. A test chip was fabricated in 65-nm CMOS. The measured results have demonstrated 2-bit PWM communication, a high data rate (3.2 Gb/s), and high reliability (BER < 10⁻¹²) with a small area (540 μm²).

REFERENCES

[1] W.-H. Chen, G.-K. Dehang, J.-W. Chen, and S.-I. Liu, “A CMOS 400-Mb/s serial link for AS-memory systems using a PWM scheme,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1498–1505, Oct. 2001.
[2] W.-J. Choe, B.-J. Lee, J. Kim, D.-K. Jeong, and G. Kim, “A single-pair serial link for mobile displays with clock edge modulation scheme,” IEEE J. Solid-State Circuits, vol. 42, no. 9, pp. 2012–2020, Sep. 2007.


[3] C.-Y. Yang and Y. Lee, “A PWM and PAM signaling hybrid technology for serial-link transceivers,” IEEE Trans. Instrum. Meas., vol. 57, no. 5, pp. 1058–1070, May 2008.
[4] M. Takihi, K. Niitsu, and K. Nakazato, “Charge-conserved analog-to-time converter for a large-scale CMOS biosensor array,” in Proc. IEEE Int. Symp. Circuits and Syst., Jun. 2014.
[5] M.-T. Chung and C.-C. Hsieh, “A 0.5 V 4.95 μW 11.8 fps PWM CMOS imager with 82 dB dynamic range and 0.055% fixed-pattern noise,” in Proc. IEEE ISSCC, Feb. 2012, pp. 114–116.
[6] K. Niitsu, M. Sakurai, N. Harigai, T. J. Yamaguchi, and H. Kobayashi, “CMOS circuits to measure timing jitter using a self-referenced clock and a cascaded time difference amplifier with duty-cycle compensation,” IEEE J. Solid-State Circuits, vol. 47, no. 11, pp. 2701–2710, Nov. 2012.
[7] J. A. McNeill, “Jitter in ring oscillators,” IEEE J. Solid-State Circuits, vol. 32, no. 6, pp. 870–879, Jun. 1997.
[8] K. Niitsu et al., “A clock jitter reduction circuit using gated phase blending between self-delayed clock edges,” in Proc. IEEE Symp. VLSI Circuits, Jun. 2012, pp. 142–143.


[9] A. Sai, Y. Kobayashi, S. Saigusa, O. Watanabe, and T. Itakura, “A digitally stabilized type-III PLL using ring VCO with 1.01 ps rms integrated jitter in 65 nm CMOS,” in Proc. IEEE Int. Solid-State Circuits Conf., Feb. 2011, pp. 98–100.
[10] B. Shen, G. Unruh, M. Lugthart, C.-H. Lee, M. Chambers, and C. Parrella, “An 8.5 mW, 0.07 mm² ADPLL in 28 nm CMOS with sub-ps resolution TDC and < 230 fs RMS jitter,” in Proc. IEEE Symp. VLSI Circuits, Jun. 2013, pp. 192–193.
[11] M. Ishida et al., “A programmable on-chip picosecond jitter-measurement circuit without a reference-clock input,” in Proc. IEEE ISSCC, Feb. 2005, pp. 512–513.
[12] D. De Caro, C. A. Romani, N. Petra, A. G. M. Strollo, and C. Parrella, “A 1.27 GHz, all-digital spread spectrum clock generator/synthesizer in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 5, pp. 1048–1060, May 2010.
[13] K.-J. Sham, M. R. Ahmadi, S. B. G. Talbot, and R. Harjani, “FEXT crosstalk cancellation for high-speed serial link design,” in Proc. IEEE Custom Integr. Circuits Conf., Sep. 2006, pp. 405–408.