A 5.4-Gb/s Clock and Data Recovery Circuit Using Seamless Loop Transition Scheme With Minimal Phase Noise Degradation

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 59, NO. 11, NOVEMBER 2012, p. 2518

Won-Young Lee, Student Member, IEEE, and Lee-Sup Kim, Senior Member, IEEE

Abstract: This paper presents a 5.4-Gb/s clock and data recovery (CDR) circuit using a seamless loop transition scheme with minimal phase noise degradation. The proposed scheme enables the CDR circuit to change its operation mode without output phase noise degradation or stability problems. A modified half-rate linear phase detector reduces the phase error between the data and the clock. A test chip was fabricated in 0.13-µm CMOS technology. The rms jitter of the proposed CDR circuit is 5.98 ps, which is 2.61 ps lower than that of the CDR circuit with the conventional scheme. The measured power dissipation is 138 mW, including output drivers and an embedded 2:1 MUX, at a 5.4-Gb/s data rate.

Index Terms: Dual-loop architecture, clock and data recovery (CDR), phase noise.

I. INTRODUCTION

Recently, as the color depth and resolution of display panels have increased to meet demand for 3-D displays and high-resolution TVs and monitors, the data rate of the display interface has also increased. Display interfaces now support resolutions above 2560 × 1600 with maximum data rates of tens of Gb/s through cables, as shown in Fig. 1. In the latest display interfaces such as DisplayPort and HDMI, a high-speed CDR circuit is necessary to achieve a low bit-error rate (BER), since the sampling time margin between the clock and data signals decreases.

In communication systems, there are various CDR architectures [1]-[8]. Among those, the single-loop architecture is simple to design [1], [2]. However, the initial clock frequency must be near the baud rate for correct operation unless the CDR circuit has a frequency acquisition loop or a frequency detection circuit [8].
Although a frequency-locked loop using a frequency detector can operate simultaneously with a phase-locked loop, the detection range of a commonly used frequency detector is limited, and the control strengths of the two loops must be adjusted to avoid a conflict between them. A CDR circuit using an additional PLL has also been reported [3]. With an additional PLL and an external reference, the CDR circuit has a reliable clock source, but it consumes more power and area. As shown in Fig. 2, a dual-loop architecture reduces power and area by sharing resources such as the loop filter and the VCO [4]. This architecture is also suitable for multi-rate operation with a clock training sequence, since the output frequency of the VCO is set to the input data rate by a frequency acquisition loop before data transmission. However, the dual-loop architecture requires careful design of the loop filter and of the current-injection circuits, such as the charge pump and the V/I converter, to avoid a stability problem [9]. Conventional designs have avoided this problem by making the output current of the V/I converter smaller than the charge-pump current.

Fig. 1. System interfaces for display devices.

Manuscript received August 05, 2011; revised December 28, 2011; accepted February 09, 2012. Date of publication April 16, 2012; date of current version October 24, 2012. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012-0000116). The fabrication of the chip was supported by the IC Design Education Center (IDEC). This paper was recommended by Associate Editor Ali Sheikholeslami.
The authors are with the Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Korea (e-mail: wylee@mvlsi.kaist.ac.kr; lskim@ee.kaist.ac.kr).
Digital Object Identifier 10.1109/TCSI.2012.2190678

In the next section, we discuss the disadvantages of this current reduction method. In Section III, the CDR circuit using the seamless loop transition scheme with minimal phase noise degradation is presented, and the circuit implementation of its building blocks is described in detail. The experimental results are shown in Section IV, and conclusions are given in Section V.

1549-8328/$31.00 © 2012 IEEE

II. DUAL-LOOP CDR CIRCUIT

Fig. 2. Dual-loop CDR circuit without the reference clock.

Fig. 3. The change of phase margin due to absence of the division ratio.

In Fig. 2, assume that the charge pump and the V/I converter supply the same current, or that the CDR circuit has only one charge-injection circuit. When a dual-loop CDR circuit moves from the frequency acquisition mode (F.A) to the phase tracking mode (P.T), the loop bandwidth increases and the phase margin of the CDR circuit decreases, as shown in Fig. 3. If the phase margin is insufficient, the frequency acquisition loop of the CDR circuit diverges from the steady state and cannot lock at the target output frequency. The variations of loop bandwidth and phase margin are caused by the clock frequency divider, because it is removed from the feedback loop in the phase tracking mode, as follows:

[Equation (1)]

where the terms are the VCO gain, the output current of the current-injection circuit, the elements of the loop filter, and the product of the data transition density and the linear phase detector gain. This product is about 1, since the transition density of PRBS-7 is about 0.5 and the gain of a half-rate phase detector is 2. Although the frequency acquisition loop is designed to have a bandwidth of 20 MHz and a phase margin of 55°, the bandwidth increases to 91 MHz and the phase margin decreases to 31°.

The phase margin in the phase tracking mode also decreases as the ratio of the output clock frequency to the data rate decreases. Fig. 4 shows the phase margins of full-rate, half-rate, and quarter-rate dual-loop CDR circuits with clock division ratios of 4, 8, 16, 32, and 64. The phase margin of the frequency acquisition loop is set to 55°. Basically, an increase in the division ratio of the clock divider causes a decrease in the phase margin of the phase tracking loop. In addition, as the ratio of the output clock frequency to the data rate decreases, for instance in a half-rate or quarter-rate dual-loop architecture, the phase detector gain increases, since the number of charge injections per UI is increased by multi-phase operation. This raises the loop gain and reduces the phase margin in the phase tracking mode unless the injected current is reduced to 1/2 or 1/4 of the injected current of a full-rate CDR circuit. Therefore, half-rate and quarter-rate architectures require more attention in the loop design to secure the phase margin.

Fig. 4. Phase margins of a full-rate, a half-rate, and a quarter-rate dual-loop CDR (D-CDR) circuit with various clock division ratios after the loop transition.

A dual-loop CDR circuit can instead be designed to have enough phase margin in the phase tracking mode. In that case, however, the frequency acquisition loop inevitably suffers a drop in loop gain and bandwidth due to the appearance of 1/N. This drop decreases the phase margin and increases loop instability, and the larger the division ratio, the more likely instability becomes. To maintain loop bandwidth and loop stability, the current reduction scheme has been widely used. The open-loop transfer function of a CDR circuit with the current reduction scheme is expressed as follows:

[Equation (2)]

where the added factor is the current reduction ratio, which includes the data transition density and the linear phase detector gain. By reducing the output current of the charge-injection circuit, the increase in loop gain due to the disappearance of 1/N is suppressed. Thus, the phase margin and loop bandwidth can be kept at their expected values.

However, this solution causes a new problem. In a common frequency synthesizer, various noise sources affect the output phase noise, as shown in Fig. 5. Since the closed loop acts as a low-pass filter for reference and charge-pump noise and as a high-pass filter for VCO noise, some portion of each noise is filtered out. The filtered noises appear in the output clock as the total phase noise.

[Equation (3)]
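
The bandwidth and phase margin shift described in this section can be illustrated with a small numerical sketch. It is not the paper's equation (1): it assumes a generic type-II loop with one zero and one higher-frequency pole, places them symmetrically around the 20-MHz crossover so that the 55° phase margin is the maximum (an assumption), and then multiplies the loop gain by N = 8 to mimic removal of the divider with the injected current unchanged. All function and variable names are hypothetical.

```python
import math

def gain_mag(f, fz, fp, k):
    """|H| of a type-II loop with one zero (fz) and one extra pole (fp)."""
    return k / f**2 * math.hypot(1.0, f / fz) / math.hypot(1.0, f / fp)

def phase_margin_deg(f, fz, fp):
    """Phase margin at frequency f: atan(f/fz) - atan(f/fp), in degrees."""
    return math.degrees(math.atan(f / fz) - math.atan(f / fp))

def crossover_hz(fz, fp, k, lo=1e3, hi=1e12):
    """Unity-gain frequency by bisection on a log scale (|H| falls monotonically)."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if gain_mag(mid, fz, fp, k) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Frequency acquisition loop targets quoted in the text.
F_FA = 20e6                    # bandwidth in Hz
PM_FA = math.radians(55.0)     # phase margin
N = 8                          # divider ratio in the frequency acquisition loop
K = 0.5 * 2.0                  # PRBS-7 transition density times half-rate PD gain

# Assumption: zero and pole sit a factor x below and above the crossover,
# which makes the phase margin maximal there.
x = math.tan(PM_FA) + 1.0 / math.cos(PM_FA)
fz, fp = F_FA / x, F_FA * x
k_fa = 1.0 / gain_mag(F_FA, fz, fp, 1.0)   # normalizes |H(F_FA)| to 1

# Phase tracking loop: the 1/N divider disappears, the current is unchanged.
k_pt = k_fa * N * K
f_pt = crossover_hz(fz, fp, k_pt)
pm_pt = phase_margin_deg(f_pt, fz, fp)

print(f"phase tracking bandwidth = {f_pt / 1e6:.1f} MHz")
print(f"phase tracking margin    = {pm_pt:.1f} degrees")
```

Under these assumptions the sketch lands close to the 91 MHz and 31° quoted above, which suggests the reported shift is what a plain eightfold gain increase produces in a loop of this type.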

Fig. 5. Noise sources in a frequency synthesizer.

Fig. 6. Simulated output phase noise of a frequency synthesizer with 500-µA and 62.5-µA charge injection.

The phase noise due to a charge-injection circuit is a function of the injected current, which is given by

[Equation (4)]

where the left-hand side is the output phase noise and H(s) is the open-loop transfer function, which depends on the injected current. When the loop mode of the dual-loop CDR circuit changes from frequency acquisition to phase tracking, the division ratio N in (4) becomes 1, since the phase tracking loop does not have a frequency divider. If the injected current is then reduced by the current reduction scheme, the output phase noise due to the charge-injection circuit becomes

[Equation (5)]

Therefore, this noise contribution increases when the current reduction is used to maintain the loop characteristics.

Fig. 7. VCO noise transfer functions when a frequency divider is removed without the current reduction.

Fig. 6 depicts the simulated output phase noise of the frequency synthesizer in Fig. 5 for the two charge-injection currents. Each noise contribution is extracted from noise simulation using transistor-level schematics and SpectreRF. The VCO is a ring type, and the symmetric charge pump in [10] is used as the charge-injection circuit. The output phase noise increases by approximately 30 dB when the charge-pump current is reduced by a factor of 8. The difference between the output phase noises is mainly due to the VCO phase noise. Fig. 7 shows the VCO noise transfer functions when the frequency divider is removed without the current reduction. In this case, the output phase noise decreases, since more of the VCO phase noise is filtered out. Therefore, when the CDR circuit changes to the phase tracking loop without the current reduction scheme, a large reduction of the output phase noise occurs, owing to strong VCO noise filtering by the loop and the relatively low phase noise from the charge-injection circuit, as shown in Fig. 6.
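
The high-pass filtering of VCO noise mentioned above can be sketched numerically. The model below is an assumption, not the paper's simulation: it uses a generic type-II open-loop function H with a 20-MHz crossover and 55° phase margin, evaluates the VCO noise transfer 1/(1 + H) at an illustrative 1-MHz offset, and compares the loop whose gain is held down by the 1/8 current reduction with the loop whose current is kept. It covers only the VCO contribution and does not attempt to reproduce the figures read from Fig. 6. All names are hypothetical.

```python
import math

def open_loop(f, fz, fp, k):
    """H(j*f) of a type-II loop; frequencies in Hz, k sets the gain."""
    jf = 1j * f
    return k / (jf * jf) * (1 + jf / fz) / (1 + jf / fp)

def vco_noise_gain_db(f, fz, fp, k):
    """High-pass transfer 1/(1 + H) seen by VCO phase noise, in dB."""
    return 20.0 * math.log10(abs(1.0 / (1.0 + open_loop(f, fz, fp, k))))

# Assumed loop: 20 MHz crossover, zero and pole a factor x either side of it.
F_C = 20e6
x = math.tan(math.radians(55.0)) + 1.0 / math.cos(math.radians(55.0))
fz, fp = F_C / x, F_C * x

k_reduced = F_C**2 / x       # |H(F_C)| = 1: gain held by the 1/8 current reduction
k_full = 8.0 * k_reduced     # divider removed, current not reduced

offset = 1e6  # Hz, an in-band offset chosen only for illustration
db_reduced = vco_noise_gain_db(offset, fz, fp, k_reduced)
db_full = vco_noise_gain_db(offset, fz, fp, k_full)

print(f"VCO noise gain at 1 MHz, current reduced: {db_reduced:.1f} dB")
print(f"VCO noise gain at 1 MHz, current kept   : {db_full:.1f} dB")
```

In this simplified picture the loop with the unreduced current suppresses in-band VCO noise by roughly 20·log10(8) dB more, which is the qualitative trend the text attributes to Fig. 7.
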
Removing the divider without reducing the current is not free, however: the loop bandwidth increases from 20 MHz to 91 MHz and the phase margin decreases from 55° to 31°, as shown in Fig. 3. This means that a dual-loop CDR circuit with a divide-by-8 divider that uses the current reduction scheme avoids the stability problem but suffers increased phase noise in the recovered clock.

To prevent an increase in the output phase noise due to the current reduction scheme, the charge-injection circuit can be designed to supply a large current in the frequency acquisition mode. According to (5), this approach can reduce the output phase noise in spite of the current reduction, because the charge-injection circuit still produces a reasonable current after the reduction. However, if the current reduction ratio is large, that is, if the clock divider has a large division ratio, high currents flow into the loop filter in the frequency acquisition mode. The power consumption then grows in proportion to the frequency acquisition time and to the number of times the frequency acquisition mode is restarted after frequency information is lost in the phase tracking mode.

The proposed dual-loop CDR circuit addresses both problems: loop stability and phase noise degradation. Instead of reducing the current, the coefficients of the loop filter function are changed at the loop transition to prevent the increase of loop gain and to maintain loop bandwidth and stability.

Fig. 8. Architecture of the dual-loop CDR circuit.

Fig. 9. Schematic of the loop filter.

Fig. 10. Voltage difference between the charged voltages of the two loop capacitors when SW is opened.

Fig. 11. Transient simulation results of the two capacitor voltages for the entire process.

III. CIRCUIT DESIGN

Fig. 8 shows the architecture of the proposed dual-loop CDR circuit. According to the DisplayPort version 1.2 specification [11], the data rates are 5.4 Gb/s, 2.7 Gb/s, and 1.62 Gb/s. However, since the proposed CDR circuit was designed before publication of the technical specification, the data rates of the prototype CDR circuit were selected to be 5.4 Gb/s and 3.24 Gb/s. In the DisplayPort main link, before transmitting video data, a source device transmits a D10.2 pattern (101010...), which is used as the reference clock in the CDR circuit. During the clock training sequence, the frequency acquisition loop of the CDR circuit is turned on to generate the half-rate clock. Since the phase-frequency detector (PFD) and the lock detector are built from CMOS logic gates, the received 2.7-GHz D10.2 pattern is divided by 8. When the half-rate clock has been generated, the lock detector sets Lock high and the loop transition is performed. The lock detector is restarted every 379 ns until frequency lock is detected. It consists of a comparator and two 7-bit counters, which provide an accuracy of 99.22%. The power consumption of the frequency acquisition loop is similar to that of the phase tracking loop, because most circuits are designed using current-mode logic. The settling time of frequency acquisition is less than 250 ns in post-layout simulation. After the loop transition from frequency acquisition to phase tracking, Lock changes the resistance and capacitance of the loop filter to prevent loop instability and an increase in the output phase noise. A half-rate linear phase detector (HRLPD) is employed and produces 1:2 de-multiplexed data with the recovered clock. In the proposed CDR circuit, two VCOs are implemented, since it is difficult for an LC-VCO to cover 1.62 GHz to 2.7 GHz with low noise.
One of the two VCOs is selected by MAX_LINK_RATE according to the link speed, and the unselected VCO is turned off to save power. If the CDR circuit in the phase tracking mode loses the frequency information stored in the loop filter, a restoring signal resets the lock detector and restarts the entire process. This is effective because the frame period of a monitor or a TV is much longer than the frequency acquisition time, and human eyes cannot detect a few wrong bits owing to persistence of vision.

Fig. 12. (a) Open-loop characteristic and (b) output phase noise of the proposed scheme.
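
The lock detector figures quoted in this section (a divide-by-8 of the 2.7-GHz pattern, two 7-bit counters, a 379-ns restart interval, and 99.22% accuracy) are mutually consistent under one simple reading, sketched below. The reading is an assumption, not a statement from the paper: the counters are taken to run for a full 2^7 cycles of the divided clock, and the accuracy is taken to be one count out of 128. The function name is hypothetical.

```python
def lock_detector_figures(pattern_hz=2.7e9, divide_by=8, counter_bits=7):
    """Return (divided clock in Hz, counting window in s, accuracy as a fraction).

    Assumes the counters span 2**counter_bits cycles of the divided clock
    and that the resolution is a single count.
    """
    ref_hz = pattern_hz / divide_by
    counts = 2 ** counter_bits
    window_s = counts / ref_hz
    accuracy = 1.0 - 1.0 / counts
    return ref_hz, window_s, accuracy

ref_hz, window_s, accuracy = lock_detector_figures()
print(f"divided clock   = {ref_hz / 1e6:.1f} MHz")
print(f"counting window = {window_s * 1e9:.2f} ns")
print(f"accuracy        = {accuracy * 100:.2f} %")
```

With these assumptions the counting window comes out near 379 ns and the one-count resolution near 99.22%, matching the values in the text.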

Fig. 13. (a) Half-rate linear phase detector in [12]. (b) Modified half-rate linear phase detector.

A. Loop Filter for Seamless Loop Transition

The detailed schematic of the controllable loop filter is depicted in Fig. 9. This circuit consists of an amplifier, switches, resistors, and capacitors. When the CDR circuit is in the frequency acquisition mode, in other words when Lock is 0 V, the loop filter uses one resistor and capacitor set, and the frequency acquisition loop stores the frequency information in that loop capacitor. When frequency acquisition has been accomplished, the phase tracking mode starts and additional resistor and capacitor elements are switched in to maintain the loop characteristics. At this time, the charge stored in the first capacitor is redistributed to the newly connected capacitor, and the loop would lose its frequency information. To prevent this, the amplifier minimizes the voltage error between the two capacitors during the frequency acquisition mode. A transistor and a capacitor are used for dominant-pole compensation, which realizes a left-half-plane zero at frequencies around the unity-gain frequency. The 3-dB frequency of the unity-gain feedback buffer when Lock is 0 V is 97 MHz, obtained from the simulated closed-loop frequency response. This bandwidth is sufficient for voltage tracking, since the frequency acquisition time is 100 µs according to the specification [11]. In the phase tracking mode, the amplifier turns off to save power.

Fig. 10 shows the difference between the charged voltages of the two capacitors when SW is open, that is, when the CDR circuit is in the frequency acquisition mode. The voltage difference causes a voltage drop when the two capacitors are connected in parallel. For example, assuming that the two capacitors are equal, that one is at 0 V, and that the loop has locked with 800 mV on the other, the voltage of the merged capacitor becomes 400 mV. To merge two capacitors without the voltage drop, the difference between the voltages charged on each capacitor should be minimized.
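
The 400-mV example follows from charge conservation when two capacitors are connected in parallel. The short sketch below reproduces it and then shows how small the drop becomes when the second capacitor is pre-charged close to the first. The 1-pF capacitor values and the 5-mV residual are illustrative assumptions, and the function name is hypothetical.

```python
def merged_voltage(c1, v1, c2, v2):
    """Voltage after connecting two charged capacitors in parallel.

    Total charge c1*v1 + c2*v2 is conserved and shared by c1 + c2.
    """
    return (c1 * v1 + c2 * v2) / (c1 + c2)

C = 1e-12  # farads, illustrative value; only the ratio of the capacitors matters

# Example from the text: equal capacitors, one at 800 mV and one at 0 V.
v_unbuffered = merged_voltage(C, 0.800, C, 0.0)

# With the second capacitor pre-charged to within 5 mV of the first (assumed residual).
v_buffered = merged_voltage(C, 0.800, C, 0.795)

print(f"merged voltage without pre-charge = {v_unbuffered * 1e3:.1f} mV")
print(f"merged voltage with pre-charge    = {v_buffered * 1e3:.1f} mV")
```

The control voltage drop shrinks from 400 mV to 2.5 mV in this illustration, which is why the amplifier tracks the capacitor voltage before the switch closes.
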
If the capacitor voltages differ when they are merged, the phase tracking loop loses the frequency information stored in the loop capacitor after the loop transition. If the resulting frequency offset exceeds the detection range of the phase detector, the loop fails to lock. The simulation result shows that the buffered capacitor voltage tracks the loop-filter voltage to within 10 mV, excluding the range below 0.2 V, which is not used.

Fig. 14. (a) Error signal of the linear phase detector. (b) Narrow pulse between error signals.

Fig. 15. Gain graphs of the conventional and the modified linear phase detectors.

Fig. 11 shows transient post-layout simulation results of the two capacitor voltages for the entire process. In the frequency acquisition loop, one voltage follows the other through the amplifier in Fig. 9. The voltage difference between them is less than 5 mV. When the lock signal goes high, the two capacitors are connected in parallel, and the two voltages become equal 60 ns later. Through these operations, the total resistance decreases and the total capacitance increases, which suppresses the increase in DC gain and maintains loop stability and bandwidth, as expressed in (5).

[Equation (6)]

where the gain term is the product of the data transition density (PRBS7) and the HRLPD gain, about 0.5 and 2, respectively. The effect of the second pole can be neglected, since it is located far away from the dominant pole. Fig. 12 shows that the proposed scheme can be an alternative solution that maintains the loop bandwidth and avoids the loop stability problem with minimal phase noise degradation of the output clock. The noise and circuit parameters, except for the loop filter, are the same as those used for Fig. 6, so that the proposed scheme can be verified and compared with the current reduction scheme. According to the simulated results, the loop bandwidth does not vary even though the output current of the V/I converter is not reduced. A phase margin of 56.3° is obtained. The output phase noise of the proposed scheme is 22.5 dB lower than that of the current reduction scheme at 10-MHz offset.

Fig. 16. Schematic of the V/I converter.

B. Half-Rate Linear Phase Detector

The proposed CDR circuit employs a modified version of the half-rate parallel linear phase detector in [12], which performs well by extending the width of the error signals beyond 1 UI. However, this phase detector has a problem of unbalanced output loads between two cascaded latches. In [12], the input ports of the two XOR gates are connected to nodes A, D and nodes B, C, respectively, as shown in Fig. 13(a). The total loads at nodes A and D, which are the input ports of one XOR gate, are different combinations of the gate input capacitance and the intrinsic output capacitance. In the proposed CDR circuit, the half-rate linear phase detector has been modified to distribute the output loads of the latches, as shown in Fig. 13(b), so that the XOR input ports are nodes A and E. Assuming that the output driving strength of all latches is equal, the propagation delay of a latch is proportional to its total output load.
If the output loads of the latches are not equal, their propagation delays differ. This means that unnecessary phase differences appear at the inputs of the XOR gates. These differences stretch the pulse widths of the phase detector outputs beyond their ideal widths. Because the phase detector then generates clock-data phase information that includes the unnecessary phase differences, static phase errors arise between the incoming data and the sampling clock. The delay differences of the conventional circuit and the modified circuit are given by

[Equation (7)]

[Equation (8)]

Since the outputs of the phase detector are connected to a DEMUX, their load is the input capacitance of the DEMUX latch that aligns the data. Under these assumptions, (7) and (8) indicate that the delay difference of the conventional circuit is longer than that of the modified circuit.

Fig. 17. Output current of the V/I converter in case of the maximum data transition.

As shown in Fig. 14(a), the phase detector produces error signals that are wider than 1 UI. The wide pulse width is an advantage of the phase detector in [12]. The other side of it, however, is that the interval between one error signal and the next is short: the narrowest pulse width is 0.25 UI at worst. An XOR gate with narrow bandwidth cannot reproduce this short interval, as shown in Fig. 14(b), which causes phase errors through unnecessary charge injection into the loop filter. Therefore, the XOR gates in the phase detector need enough bandwidth, and to achieve it their power consumption and transistor size must be large compared with other components. This means that the input capacitance of the XOR gate is larger than that of the latch, even if the parasitic capacitances of the metal interconnects are not counted.

Fig. 18. LC-VCO block composed of 2.7-GHz and 1.62-GHz VCOs.

Gain curves of the linear parallel phase detector and its modified version are shown in Fig. 15. The same latch and XOR circuits are used in both phase detectors for comparison. The value of 277.5 ps on the y-axis is the ideal pulse width when the CDR circuit is in lock, that is, when the ratio of the error signal to the reference signal is 3/2. The conventional phase detector has an offset of 0.11 UI, which comes from the propagation delay difference between the inputs of the XOR gate. In [12], an up/down current ratio of 5/4 is used for offset correction instead of the ideal ratio of 3/2. However, since the ratio of 5/4 was determined from the circuit constants of an XOR gate [12], it may change with the simulated performance of the XOR gates. With the modified phase detector, the phase error is reduced and the ideal ratio of 3/2 becomes usable.

Fig. 16 shows the schematic of the V/I converter connected to the half-rate linear phase detector. The conventional phase detector in Fig. 13(a) divides the reference signal into two reference signals using the clock. An unbalanced charge pump uses two error signals and two reference signals [12].
In the modified half-rate linear phase detector, the DEMUX block composed of AND gates is removed, so the V/I converter uses two error signals and one reference signal. Since the error signal is 3/2 times as wide as the reference signal when the clock samples the center of the received data, the up-current and the down-current must be different. The down-current drawn by each error signal is scaled relative to the up-current so that, although the maximum down-current flows when the two error signals are simultaneously high, the average current flowing into the loop filter is zero, as shown in Fig. 17. Since the output current toggles between two levels for the same time duration, the average voltage variation drops to zero.

C. LC-VCO

As shown in Fig. 18, the LC-VCO block is composed of two individual LC-VCOs. Since the CDR circuit has been designed to support 5.4-Gb/s and 3.24-Gb/s data modes, the VCO block must cover 1.62 GHz to 2.7 GHz in a half-rate architecture. However, it is difficult for a single LC-VCO to have a tuning range as wide as 1.62 GHz to 2.7 GHz. If too many capacitors are used in a capacitor array to widen the tuning range with margin for variations, the VCO gain may decrease so much that the VCO does not oscillate in the low frequency band. Therefore, two individual LC-VCOs with different oscillation frequencies are implemented to provide high reliability, and one of the two is selected by MAX_LINK_RATE according to the link speed. To save power, the unselected VCO is turned off by a supply voltage regulator. Fig. 19 shows the VCO gain curves of the implemented LC-VCOs. Each VCO is controlled by a 3-bit binary code to be robust to variations. The gains of the 2.7-GHz and 1.62-GHz LC-VCOs are on average 1 GHz/V and 500 MHz/V, respectively. The decrease in VCO gain is compensated by increasing the current injected into the loop filter as the operating frequency changes.

Fig. 19. VCO gain curves of (a) 2.7-GHz and (b) 1.62-GHz LC-VCOs.

Fig. 20. Die photograph of the CDR circuit.

Fig. 21. The eye diagrams of (a) 2.7-Gb/s retimed output data, and (b) 1.62-Gb/s retimed output data.

IV. EXPERIMENTAL RESULTS

The CDR chip has been fabricated in 0.13-µm CMOS technology. It occupies an area of 1.1 mm × 1 mm including decoupling capacitors, as shown in Fig. 20, and consumes 138 mW with off-chip drivers at a 5.4-Gb/s data rate. The eye diagrams of the recovered half-rate data are shown in Fig. 21. The retimed data exhibit peak-to-peak jitter of 55.4 ps and 92.2 ps for the recovered 2.7-Gb/s and 1.62-Gb/s data, respectively. The 1:2 de-multiplexed recovered data are multiplexed by the embedded 2:1 MUX for the loopback-mode BER test. Fig. 22 shows the eye diagram and jitter tolerance of the 5.4-Gb/s output data. The measured jitter tolerance shows that the CDR circuit tolerates the maximum jitter amplitude that the tester can generate.
Measured tolerances at 1 kHz, 100 kHz, 4 MHz, and 40 MHz are over 440 UIpp, 4.4 UIpp, 0.22 UIpp, and 0.22 UIpp, respectively. Since the tester cannot generate large jitter amplitudes at low jitter frequencies, a comparison with the DisplayPort jitter tolerance mask is possible only for jitter frequencies from 4 MHz to 40 MHz. The measured result shows the CDR circuit meets the DisplayPort mask with UI.

Fig. 22. (a) The eye diagram and (b) jitter tolerance of 5.4-Gb/s 2:1 multiplexed output data.

Fig. 23 shows the jitter transfer characteristic. We have transmitted 5.4-Gb/s clock pattern data (2.7 GHz) with various jitter frequencies and measured the peak powers of the input data jitter and the recovered clock jitter. The jitter peaking is 1.87 dB and the 3-dB frequency is 26 MHz. The jitter peaking can be reduced to negligible levels by overdamping the loop.

Fig. 23. Measured jitter transfer characteristic.

The phase noise graph of the recovered clock is shown in Fig. 24. Both schemes have similar output phase noise at higher frequency offsets. It is hard to make the loop bandwidth of the current reduction scheme equal to that of the proposed scheme because of PVT variations. Moreover, the proposed scheme has additional noise, which comes from the active elements (switches and an amplifier) in the proposed loop filter, but it is quite small. The phase noise of the recovered clock with the proposed scheme is −89.75 dBc/Hz, −96.37 dBc/Hz, and −92.10 dBc/Hz at 10 kHz, 1 MHz, and 10 MHz, respectively, as shown in Fig. 24(a). Fig. 24(b) shows that the phase noise of the recovered clock with the current reduction scheme is −77.60 dBc/Hz, −97.52 dBc/Hz, and −91.49 dBc/Hz at 10 kHz, 1 MHz, and 10 MHz, respectively. The rms jitter of the recovered clock is 5.98 ps and 8.59 ps with the proposed scheme and the current reduction scheme, respectively. As confirmed by the simulations, the measured result shows that the in-band noise of the current reduction scheme is larger than that of the proposed scheme. The performance summary and comparisons with the previous works are presented in Table I.

TABLE I. PERFORMANCE SUMMARY

V. CONCLUSION

A 5.4-Gb/s clock and data recovery circuit using the seamless loop transition scheme has been presented as a precedent study on DisplayPort version 1.2. The proposed scheme enables the CDR circuit to change the operation mode without phase noise degradation or stability problems. The half-rate linear phase detector has been modified to reduce the phase error between the incoming data and the recovered clock. The rms jitter of the proposed CDR circuit is 5.98 ps, which is 2.61 ps lower than that of the conventional CDR circuit.
The measured jitter tolerance shows the CDR circuit meets the DisplayPort mask with UI from 4-MHz to 40-MHz jitter frequencies. A 0.13-µm CMOS technology has been used for the chip fabrication. The chip occupies an active area of 1.1 mm × 1 mm, and consumes 138 mW with output drivers.

Fig. 24. Phase noises of the recovered 2.7-GHz half-rate clock signals (a) with the proposed loop transition scheme and (b) the current reduction scheme.

REFERENCES

[1] J. Savoj and B. Razavi, "A 10-Gb/s CMOS clock and data recovery circuit with a half-rate linear phase detector," IEEE J. Solid-State Circuits, vol. 36, no. 5, pp. 761–767, May 2001.
[2] D. Rennie and M. Sachdev, "A 5-Gb/s CDR circuit with automatically calibrated linear phase detector," IEEE Trans. Circuits Syst. I: Reg. Papers, vol. 55, no. 3, pp. 796–803, Apr. 2008.
[3] Y. Seo, J. W. Lee, H. J. Kim, C. Yoo, J. J. Lee, and C. S. Jeong, "A 5-Gb/s clock- and data-recovery circuit with 1/8-rate linear phase detector in 0.18-µm CMOS technology," IEEE Trans. Circuits Syst. II: Express Briefs, vol. 56, no. 1, pp. 6–10, Jan. 2009.
[4] S. Byun, J. C. Lee, J. H. Shim, K. Kim, and H. K. Yu, "A 10-Gb/s CMOS CDR and DEMUX IC with a quarter-rate linear phase detector," IEEE J. Solid-State Circuits, vol. 41, no. 11, pp. 2566–2576, Nov. 2006.
[5] S. K. Lee, Y. S. Kim, H. Ha, Y. Seo, H. J. Park, and J. Y. Sim, "A 650 Mb/s-to-8 Gb/s referenceless CDR circuit with automatic acquisition of data rate," in IEEE ISSCC Dig. Tech. Papers, 2009, pp. 184–185.
[6] O. Tyshchenko, A. Sheikholeslami, H. Tamura, M. Kibune, H. Yamaguchi, and J. Ogawa, "A 5 Gb/s ADC-based feed-forward CDR in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 45, no. 6, pp. 1091–1098, Jun. 2010.
[7] W. Y. Lee and L. S. Kim, "A 5.4 Gb/s clock and data recovery circuit using the seamless loop transition scheme without phase noise degradation," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), 2011, pp. 430–433.
[8] R. Inti, W. Yin, A. Elshazly, N. Sasidhar, and P. K. Hanumolu, "A 0.5-to-2.5 Gb/s reference-less half-rate digital CDR with unlimited frequency acquisition range and improved input duty-cycle error tolerance," IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 3150–3162, Dec. 2011.
[9] B. Razavi, Design of Integrated Circuits for Optical Communications, 1st ed. New York: McGraw-Hill, 2003, pp. 318–322.
[10] J. G. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," IEEE J. Solid-State Circuits, vol. 31, no. 11, pp. 1723–1732, Nov. 1996.
[11] Video Electronics Standard Association (VESA), DisplayPort Standard Version 1.2, Jan. 2010.
[12] Y. Ohtomo, K. Nishimura, and M. Nogawa, "A 12.5-Gb/s parallel phase detection clock and data recovery in 0.13-µm CMOS," IEEE J. Solid-State Circuits, vol. 41, no. 9, pp. 2052–2057, Sep. 2006.

Won-Young Lee (S'08) received the B.S. and M.S. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2006 and 2008, respectively. He is currently working toward the Ph.D. degree at the same university. His research interests include PLL, CDR, and equalizer designs for high-speed interfaces.

Lee-Sup Kim (M'89–SM'05) received the B.S. degree in electronics engineering from Seoul National University, Seoul, Korea, in 1982, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1986 and 1990, respectively. He was a postdoctoral fellow at Toshiba Corporation, Kawasaki, Japan, during 1990–1993, where he was involved in the design of the high-performance DSP and single-chip MPEG2 decoder. Since March 1993, he has been with the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, where he became a Professor in September 2002 and is currently with the MVLSI Laboratory, Department of Electrical Engineering. During 1998, he was on sabbatical leave with Chromatic Research and SandCraft Inc., Sunnyvale, CA.
His research interests include 3D graphics processing unit design and high-speed/low-power mixed-mode integrated circuit design.