Lecture 15: Clock Recovery

Computer Systems Laboratory, Stanford University
horowitz@stanford.edu
Copyright 2001 by Mark Horowitz

Overview

Reading: Chapter 19, High Speed Link Design, by Ken Yang and Stefanos Sidiropoulos

Introduction: One of the critical tasks in building high-speed I/O is getting the receive clock properly aligned to the incoming data. This means you need to control the phase (and sometimes the frequency) of the receive clock. Clock alignment is usually done with a feedback system that controls the phase, called a phase-locked loop or PLL. There are two ways to build this kind of system: one uses a voltage controlled oscillator, the other a delay line.

Timing

The timing (clocking) discipline dictates the transmission and sampling of the signals on the channel, i.e. it determines how we generate the clocks that drive the transmitter and receiver ends of the link.

[Figure: Tx -> Channel -> Rx, with clocks T-clk and R-clk]

Clocking circuit design is tightly coupled with signal encoding for timing recovery:
- High-bandwidth serial links recover timing based on the transitions of the data signals (need encoded data to guarantee spectral characteristics)
- Low-latency/parallel systems use a source synchronous discipline (the transmitter clock is sent along with the data)

The basic circuit block is a Phase Locked Loop.

Outline

- Clock-recovery/phase-alignment approaches
  - Traditional CDRs
  - Oversampled CDRs
  - Source synchronous links
- Timing Loop Design
  - Delay Locked Loops
  - Phase Locked Loops
- Circuit Components
  - Variable delay/frequency generation
  - Phase Detectors
  - Filters

Classic Clock/Data Recovery

[Figure: classic CDR block diagram with D_IN, a decision circuit, D_OUT, and a PLL made of PhDet, Filter, and VCO]

- Many different implementations ([1]-[5])
- Data stream must guarantee transitions (i.e. PSD content)
- State of the system is stored in the analog filter

Oversampled Clock/Data Recovery

Oversample the data and perform phase alignment digitally.

[Figure: a PLL/DLL driven by ref CLK generates multi-phase clocks clk0-n for a multi-phase data receiver on D_IN; a data recovery block (PhDet, Filter, Delay sel) produces D_OUT. Timing diagram: data bits D0, D1, D2 against clock phases clk0, clk1, clk2, clk3]

- Alternatives range from closed digital loop systems to feedforward systems ([6]-[9])
- De-couples the clock generator from the tracking of the data
- The data must still guarantee transitions to ensure proper tracking
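The digital phase alignment described on this slide can be illustrated with a small sketch. It assumes 3x oversampling and a simple transition-voting rule; the function names, the oversampling ratio, and the voting rule are illustrative assumptions, not the method of any of the cited designs.

```python
def pick_sample_phase(samples, osr=3):
    """Return the sample phase (0..osr-1) farthest from the observed transitions.

    samples: flat list of 0/1 values, osr evenly spaced samples per bit time.
    """
    # votes[k] counts transitions seen between sample k-1 and sample k (mod osr).
    votes = [0] * osr
    for n in range(1, len(samples)):
        if samples[n] != samples[n - 1]:
            votes[n % osr] += 1
    if sum(votes) == 0:
        raise ValueError("no transitions: the data must guarantee transitions")
    # The bit boundary sits just before phase `edge`; sample about half a bit later.
    edge = max(range(osr), key=lambda k: votes[k])
    return (edge + osr // 2) % osr


def recover_bits(samples, osr=3):
    """Decimate the oversampled stream at the chosen phase."""
    phase = pick_sample_phase(samples, osr)
    return samples[phase::osr]


# Example (hypothetical data): each bit is held for 3 samples, with the
# stream starting 2 samples before the first bit boundary.
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stream = [0, 0] + [b for b in bits for _ in range(3)]
recovered = recover_bits(stream)
```

Without transitions the vote is empty and no phase can be chosen, which is why the data must still guarantee transitions.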

Phase Alignment in Source Synchronous Systems

[Figure: a DLL driven by ref CLK generates CLK for sampling the data; timing diagram of ref CLK, data bits D0-D3, and CLK]

- Timing information is carried by an explicit clock signal ([10]-[13])
- State can be stored either in an analog filter or in digital logic

Timing Loop Performance Parameters

[Figure: clock without jitter and clock with jitter, shown in the time domain and as a phase histogram]

- Phase error:
  - AC (jitter): the uncertainty of the output phase
  - DC (phase offset): undesired difference of the average output phase relative to the input phase
- Bandwidth: rate at which the output phase tracks the reference phase
- Lock time, frequency range
- Duty cycle (in classic CDRs and most source synchronous systems)
- Spacing uniformity of multiple edges (in oversampled CDRs)

Loop Architectures: DLL vs PLL

[Figure: a DLL (PD -> Filter -> VCDL, control voltage V_CTL, input ref clk, output clk, with the note "may also be a local clock") next to a PLL (PD -> Filter -> VCO, control voltage V_CTL, input ref clk, output clk)]

DLL (first order loop):
- Easily stabilizable
- Frequency synthesis is a problem
- ref clk jitter passes through
- No phase error accumulation

PLL (second/third order loop):
- Stability is an issue
- Frequency synthesis is easy
- Filtering of ref clk jitter
- Phase error accumulation

Delay Locked Loop

Controlled variable is the delay through the VCDL.

[Figure: DLL loop from ref clk to clk: phase detector (K_pd) producing the delay error Dly_err, filter K_f/s, control voltage V_CTL, and VCDL (K_dly)]

Gains: $K_{dly}$ in sec/V; $K_{pd} K_f$ in V/sec$^2$.

Open-loop TF:
\[ T(s) = \frac{K_{pd} K_f K_{dly}}{s} \]

Closed-loop TF:
\[ H(s) = \frac{K_{pd} K_f K_{dly}}{s + K_{pd} K_f K_{dly}} \]
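Since the closed-loop transfer function has a single pole at K_pd K_f K_dly, the loop bandwidth and step response follow directly. A minimal sketch; the function names and the numeric gains are made-up values for illustration, not taken from the lecture.

```python
import math


def dll_pole_rad_per_s(k_pd_k_f, k_dly):
    """Closed-loop pole of H(s) = K / (s + K), with K = K_pd*K_f*K_dly.

    k_pd_k_f: phase detector times filter gain, in V/sec^2
    k_dly:    delay line gain, in sec/V
    """
    return k_pd_k_f * k_dly


def dll_step_response(k_pd_k_f, k_dly, t):
    """Fraction of a reference delay step tracked after time t: 1 - exp(-K t)."""
    return 1.0 - math.exp(-dll_pole_rad_per_s(k_pd_k_f, k_dly) * t)


# Hypothetical gains: K = 1e16 V/s^2 * 1e-9 s/V = 1e7 rad/s, i.e. a 100 ns
# time constant.
K_PD_K_F = 1e16
K_DLY = 1e-9
tracked_after_100ns = dll_step_response(K_PD_K_F, K_DLY, 100e-9)
```

After one time constant the loop has tracked 1 - 1/e of the step, as for any first-order system.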

DLL Dynamics

Single pole system.

[Figure: |H(s)| versus ω, with magnitude 1 at low frequency and the pole at K_pd K_f K_dly]

- Stable as long as the feedback delay is not excessive
- Jitter sources:
  - Device noise: usually negligible
  - Noise sensitivity of the delay line
  - Noise sensitivity of the subsequent clock buffer
- System issues:
  - Phase noise of the input signal -> systems with DLLs require low-jitter differential clocks
  - Limited locking range -> need to ensure adequate VCDL range and employ special reset

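Interpolating DLLs

Pick two successive coarse edges and then interpolate between them to generate the desired output phase [13], [22], [23].

[Figure: a core DLL driven by in CLK generates the coarse phases 0θ, 1θ, ..., 5θ (θ = π/6); a peripheral DLL performs phase selection, selective phase inversion, and phase interpolation under control of an FSM fed by a phase detector that compares against ref CLK. Phase circle labeled 0, π/2, π, 3π/2]

- Phase selection: $\phi = i\theta$ ($i = 0, 2, 4$), $\psi = j\theta$ ($j = 1, 3, 5$)
- Selective phase inversion: $\phi' \in \{\phi,\ \phi + \pi\}$, $\psi' \in \{\psi,\ \psi + \pi\}$
- Phase interpolation: $\Theta = \phi' + (1 - \alpha/16)(\psi' - \phi')$, $\alpha = 0..16$
- No range boundaries on the generated delay
- Can use digital control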

VCO-based Phase Locked Loop

Controlled variable is the phase of the output clock.

[Figure: PLL loop from ref clk to clk: phase detector (K_pd) producing φ_err, filter F(s), and VCO (K_VCO)]

Gains: $K_{VCO}$ in Hz/V; $K_{pd} F(s)$ in V/rad.

Main difference from the DLL is the VCO transfer function:
\[ H_{VCO}(s) = \frac{K_{VCO}}{s} \]

The extra VCO pole needs to be compensated by a zero in the loop filter:
\[ F(s) = \frac{K_f (1 + s/z_1)}{s} \]

PLL Dynamics

Open-loop TF:
\[ T(s) = \frac{K_{pd} K_f K_{VCO} (1 + s/z_1)}{s^2} \]

Closed-loop TF:
\[ H(s) = \frac{K_{pd} K_f K_{VCO} (1 + s/z_1)}{s^2 + K_{pd} K_f K_{VCO} (1 + s/z_1)} \]

[Figure: Bode plots versus ω. Open-loop T(s): magnitude scaled by K_pd*K_f*K_VCO and falling at 40 dB/decade up to the zero z_1; phase axis labeled 90° and 180°, with the phase margin marked. Closed-loop H(s): magnitude 1 at low frequency, with peaking]

i.e. we are adding proportional control (the zero z_1) to adjust the output phase, while the filter integrator (the 1/s pole) holds the frequency information.
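The closed-loop denominator s^2 + (K/z_1) s + K, with K = K_pd K_f K_VCO, is a standard second-order form, so natural frequency, damping, and phase margin can be computed in closed form. A minimal sketch; the function name is hypothetical, and K is assumed to be expressed in rad-consistent units (a K_VCO given in Hz/V would need a factor of 2π first).

```python
import math


def pll_second_order(k_loop, z1):
    """Second-order PLL figures for T(s) = k_loop * (1 + s/z1) / s^2.

    k_loop: K_pd*K_f*K_VCO in 1/s^2 (assumed rad-consistent units)
    z1:     loop filter zero frequency in rad/s
    Returns (natural frequency, damping factor, crossover frequency,
    phase margin in degrees).
    """
    w_n = math.sqrt(k_loop)      # from s^2 + (k_loop/z1) s + k_loop
    zeta = w_n / (2.0 * z1)      # 2*zeta*w_n = k_loop/z1
    # |T(jw)| = 1  =>  w^4 - (k_loop/z1)^2 w^2 - k_loop^2 = 0
    a = (k_loop / z1) ** 2
    w_c = math.sqrt(0.5 * (a + math.sqrt(a * a + 4.0 * k_loop ** 2)))
    # The double integrator contributes -180 degrees; the zero adds atan(w/z1).
    pm_deg = math.degrees(math.atan(w_c / z1))
    return w_n, zeta, w_c, pm_deg


# Normalized example (hypothetical values): k_loop = 1, zero at 0.5 rad/s,
# which gives critical damping (zeta = 1).
w_n, zeta, w_c, pm_deg = pll_second_order(1.0, 0.5)
```

Moving the zero lower in frequency raises both the damping and the phase margin, at the cost of a wider loop bandwidth.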

PLL Dynamics (cont'd)

Other effects that reduce PLL stability/performance [14]:
- Higher order poles: suppress ripple but may compromise phase margin
- Sampled nature of the feedback system:
  - Keep $\omega_{bw} < \omega_{ref}/10$
  - Ultimately limits the lock range of the loop
- Phase error accumulation (the VCO is an integrator, i.e. $\theta = \int \omega \, dt$)

[Figure: effect of V_supply on a VCDL (delay/phase) and on a VCO (frequency and phase)]
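Phase error accumulation can be made concrete with an open-loop sketch: a delay error in a delay line shows up once per edge, while the same error in a ring oscillator is added again every cycle. The function names are hypothetical; the model ignores the loop's correction and assumes a ring oscillator period of 2 x stages x stage delay.

```python
def delay_line_phase_error(dt_per_stage, n_stages, n_cycles):
    """VCDL: every edge crosses the line once, so the error stays constant."""
    return [dt_per_stage * n_stages for _ in range(n_cycles)]


def oscillator_phase_error(dt_per_stage, n_stages, n_cycles):
    """Ring VCO: the edge recirculates, so the period error adds up each cycle.

    Assumes the period is two trips around the ring (2 * n_stages delays).
    """
    period_error = 2 * n_stages * dt_per_stage
    return [period_error * (k + 1) for k in range(n_cycles)]


# Example: 20 ps/element/V sensitivity and a 300 mV supply step give
# 6 ps per element; both loops have 6 stages.
DT = 20e-12 * 0.3
dll_err = delay_line_phase_error(DT, 6, 3)
vco_err = oscillator_phase_error(DT, 6, 3)
```

In a real PLL the feedback eventually removes the accumulated error, which is why a higher loop bandwidth reduces the peak phase error.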

PLL vs DLL: Phase Error Accumulation

[Figure: simulated phase error versus time (ns), 0 to 1500 ns, for a DLL, a PLL with 20 MHz bandwidth, and a PLL with 5 MHz bandwidth; peaks marked DLL-pk and PLL-pk]

Simulated data for a 6-stage PLL vs a 6-stage DLL @ 250 MHz: supply sensitivity 20 ps/element/V, supply step 300 mV @ 200 ns.

This would suggest that if no clock multiplication is needed and the input clock is quiet, the obvious choice is a DLL. However:
- Multiplication is often necessary from a system standpoint (EMI, clock generator chips not fast enough)
- Jitter really matters on the pins...

System Jitter

[Figure: DLL/PLL driving a clock buffer, annotated "jitter matters here"]

A lot of energy is usually spent optimizing half of the problem:
- A state-of-the-art inverter has a supply sensitivity of ~1% delay per 1% supply
- An average PLL/DLL has a supply sensitivity of < 1% delay per 1% supply

-> If the clock buffer delay approaches a cycle, more than half of the system jitter comes from that buffer...
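A back-of-the-envelope version of this argument, as a sketch. It assumes the loop's sensitive delay is about one clock cycle and that the loop and buffer contributions add linearly; the function names and the numbers are illustrative assumptions.

```python
def delay_jitter(delay_s, sensitivity, supply_noise_fraction):
    """Delay change = delay * (%-delay per %-supply) * fractional supply noise."""
    return delay_s * sensitivity * supply_noise_fraction


def buffer_share(t_cycle, buffer_delay, loop_sens, buffer_sens, noise):
    """Fraction of the total (linearly summed) jitter that comes from the buffer."""
    loop_j = delay_jitter(t_cycle, loop_sens, noise)
    buf_j = delay_jitter(buffer_delay, buffer_sens, noise)
    return buf_j / (loop_j + buf_j)


# Hypothetical numbers: 4 ns cycle, buffer delay of one cycle, loop
# sensitivity 0.5 %/%, buffer sensitivity 1 %/%, 5% supply noise.
share = buffer_share(4e-9, 4e-9, 0.5, 1.0, 0.05)
```

With these numbers the buffer contributes two thirds of the total, which matches the slide's point that optimizing only the PLL/DLL addresses half of the problem.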

Loop Components

- Variable delay/frequency generators
  - Mainly built as voltage controlled delay elements
  - Main issue is supply/substrate voltage sensitivity
- Phase detectors
  - Linear and non-linear designs depending on the system
  - Main goal is to achieve low offset
- Loop filters
  - Almost always constructed around a charge pump
  - Main issue is to minimize offset and ripple
- Other: signal amplifiers, supply de-coupling

Variable delay elements

Delays in CMOS are usually generated by RC elements, e.g.:

[Figure: RC delay stage with resistance R, capacitance C, and inverter threshold V_inv]

Delay can be controlled by varying R (or I), C, or V_inv. All of the above can be changed easily, but the problem is that they also change with varying process, temperature, and supply voltage:
- Process: usually not a problem if the total variation is reasonable
- Temperature: slowly varying -> well below the loop bandwidth
- Voltage: both supply and substrate change rapidly

Design goals: high supply and substrate noise rejection; adequate range.

Simple delay elements

- Current starved inverter [15]: V_C controls delay by limiting the maximum current through a standard inverter
- Shunt capacitor inverter [16]: V_C controls delay by changing the effective capacitance at the output node

[Figure: schematic of each element and its delay t_d versus V_C]

Both have poor supply rejection (>= 1% delay per 1% supply).

Improving simple delay elements

Make a high impedance current source to completely isolate the inverter supply.

[Figure: current source with output resistance r_O feeding the internal supply node V_s, bypassed by capacitance C]

- V_s tracks V_ss for frequencies higher than $1/(r_O C)$
- The current source can be implemented as a cascode or active cascode [18]
- Or you can use a source follower to achieve a similar effect [17]

What about substrate noise?
- DEC makes ground and substrate the same node by supplying current to the inverters through the p-epi [18]
- OK, as long as the max drop through the epi resistor is small and constant

Differential delay elements

1. Isolate the supply with a high impedance current source
2. Increase CMRR by making the signals self-referenced

Ideally we need load elements that look like perfect resistors whose value is adjusted by the control voltage/current:

[Figure: differential pair with inputs i+/i-, outputs o+/o-, load elements controlled by V_ctrl, and tail current I_bias]

Main problem: create a resistor that is both variable AND linear.

Differential delay elements with linear loads

Change the current through a linear-FET-loaded diff pair and adjust the gate voltage of the loads to change the resistance. A replica-feedback circuit keeps the swing constant and the loads linear [6].

[Figure: diff pair with inputs i+/i-, outputs o+/o-, control V_CTRL, and a replica-feedback amplifier referenced to V_REF]

- Swing must be quite small to get a real resistive behavior
- Even then, transients might slightly saturate the loads and decrease CMRR
- Try to increase the impedance of the current source by cascoding -> small head-room
- Use a load with larger dynamic range [20]

A variable load with high CMRR

FETs are non-linear, but what we really need is to clamp the swing. Also, if the load transfer function is symmetric, CMRR is improved [19].

[Figure: load I_ds versus V_ds at various V_ctrl: the sum of a transistor I-V curve and a diode-connected I-V curve]

Use replica feedback biasing to cancel substrate and supply noise.

[Figure: replica bias amplifier comparing against V_ctrl and generating V_bias for the buffer stage with inputs i+/i- and outputs o+/o-]

- The replica loop keeps the swing of the buffers equal to V_ctrl
- In VCOs the replica loop bandwidth should be high enough to stop the VCO from accumulating phase error for many cycles
- Watch out for loop stability

Interpolative Delay Generation

Generate an edge based on the weighted sum of two other edges:

\[ \Theta = \frac{(N - \mathit{weight})\,\phi + \mathit{weight}\,\psi}{N}, \qquad \mathit{weight} = 0 \ldots N \]

[Figure: two differential pairs driven by φ+/φ- and ψ+/ψ-, with tail currents I_1 and I_tot - I_1, sharing a load that produces Θ. Courtesy D. Foty]

Useful for:
- DLLs with unlimited phase shift
- Fine edge placement for oversampled CRCs

Some designs use this technique even for their main VCO [4].

Non-linearity might be large if the input edges are spaced far apart relative to the time constant at the output node.
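The ideal weighted-sum relation can be written as a one-line function. This sketch models only the ideal interpolation; it does not capture the non-linearity noted above, and the function name and example phases are illustrative.

```python
def interpolate_edge(phi, psi, weight, n):
    """Ideal interpolated edge: ((n - weight) * phi + weight * psi) / n.

    phi, psi: positions of the two input edges (any consistent unit)
    weight:   integer control code, 0..n
    """
    if not 0 <= weight <= n:
        raise ValueError("weight must be in 0..n")
    return ((n - weight) * phi + weight * psi) / n


# Example: input edges 30 degrees apart, 16 interpolation steps.
quarter = interpolate_edge(0.0, 30.0, 4, 16)
```

A weight of 0 returns the first edge and a weight of N returns the second, so stepping the weight moves the output edge in N equal increments between them.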

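Phase Detectors

Goal: align the clock to the right place (e.g. the center of the data eye).

[Figure: data-in eye with the ideal sampling point at 90° and the set-up time T_setup marked]

Two approaches:
- Use a perfect (linear) PD and compensate the set-up time
- Use a replica PD to auto-compensate [13]

[Figure: two loops, each a PLL/DLL followed by a clock buffer (Clk Buf) producing clk, with φ_err fed back from a phase detector comparing clk and ref: one uses a linear PD plus compensation, the other a replica PD]

Other requirements:
- Fast acquisition
- Minimum dead-band (i.e. the region in which the PD is blind to its inputs)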

Linear Phase Detectors

XOR phase detector (90° lock):
- Sensitive to input duty cycle

[Figure: XOR of clk and ref producing up/dn; output versus φ_err over 0, π/2, π]

SR phase detector (180° lock):
- 1-shots remove duty cycle sensitivity

[Figure: clk and ref each pass through a 1-shot into an SR latch producing up/dn; output versus φ_err over 0, π, 2π]
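The XOR detector's transfer characteristic and its duty cycle sensitivity can be checked numerically. A minimal sketch that averages the XOR of two ideal square waves over one period; the function name, the parameter names, and the sampling resolution are illustrative assumptions.

```python
import math


def xor_pd_average(phase_err_rad, clk_duty=0.5, ref_duty=0.5, steps=3600):
    """Average XOR output over one period for a given phase error.

    clk is high for the first clk_duty of the period; ref is the same kind
    of square wave with duty ref_duty, delayed by phase_err_rad.
    """
    shift = phase_err_rad / (2.0 * math.pi)  # delay as a fraction of a period
    high = 0
    for k in range(steps):
        t = k / steps
        clk = t < clk_duty
        ref = ((t - shift) % 1.0) < ref_duty
        if clk != ref:
            high += 1
    return high / steps


# With 50% duty cycles the average is 0.5 at a 90 degree phase error,
# which is where equal up and dn activity locks the loop.
at_lock = xor_pd_average(math.pi / 2)
# A 25% reference duty cycle changes the average at the same phase error,
# so the lock point moves.
skewed = xor_pd_average(math.pi / 2, ref_duty=0.25)
```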

Phase/Frequency Detector

Aids in frequency acquisition.

[Figure: two D flip-flops, one clocked by clk and one by ref, with a shared reset (rst), producing up and dn; waveforms of clk, ref, up, dn; output versus φ_err over 0 to 2π]

- Overlap up/dn pulses to eliminate dead-band
- Can implement flip-flops in various ways to maximize speed/operating frequency [18]-[20]
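A behavioral sketch of a three-state phase/frequency detector shows why it aids frequency acquisition. The polarity (a ref edge asserts up, a clk edge asserts dn) and the function name are assumptions for illustration; the reset pulse overlap is not modeled.

```python
def pfd_states(edges):
    """Three-state PFD driven by a time-ordered list of rising edges.

    edges: sequence of "ref" or "clk"
    Returns the state after each edge: +1 = up asserted, -1 = dn asserted,
    0 = both flip-flops reset.
    """
    state = 0
    history = []
    for edge in edges:
        if edge == "ref":
            state = min(state + 1, 1)    # set up, or reset a pending dn
        elif edge == "clk":
            state = max(state - 1, -1)   # set dn, or reset a pending up
        else:
            raise ValueError("edge must be 'ref' or 'clk'")
        history.append(state)
    return history


# Equal frequencies, ref leading: up pulses once per cycle.
phase_only = pfd_states(["ref", "clk", "ref", "clk"])
# ref at twice the clk frequency: the output never goes negative, so the
# average output keeps pushing the loop in one direction.
freq_error = pfd_states(["ref", "ref", "clk", "ref", "ref", "clk"])
```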

Non-linear Phase Detector

An ideal flip-flop should force a loop to lock at 0°.

[Figure: flop/sampler with inputs clk and ref producing up/dn; output versus (φ_r - φ_i) over 0 to π, with the offset marked]

- The set-up time of the flip-flop will introduce phase offset
- Symmetric structures can eliminate this problem [16]
- Can be used to cancel the set-up time of an input sampler [13]

The loop dynamics change:
- The loop is now a bang-bang system which dithers around a locking point: risky for a PLL, routinely done for DLLs
- The dither magnitude depends on the delay through the loop and the loop gain
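The dependence of the dither on loop delay and loop gain can be seen in a very simplified discrete-time model of a bang-bang DLL: each cycle the delay moves by a fixed step, in the direction decided from a phase error observed some cycles earlier. The model, the function name, and the parameters are illustrative assumptions, not the lecture's analysis.

```python
def bang_bang_dither(step, latency, n_cycles=200, e0=0.5):
    """Peak-to-peak phase error of a bang-bang loop after settling.

    step:    delay change applied per cycle (stands in for loop gain)
    latency: number of cycles between sampling the error and acting on it
    e0:      initial phase error
    """
    err = [e0]
    for n in range(n_cycles):
        seen = err[max(n - latency, 0)]      # the decision uses an old sample
        decision = 1.0 if seen >= 0 else -1.0
        err.append(err[-1] - step * decision)
    tail = err[n_cycles // 2:]               # discard the start-up portion
    return max(tail) - min(tail)


no_delay = bang_bang_dither(1.0, 0)
one_cycle = bang_bang_dither(1.0, 1)
bigger_step = bang_bang_dither(2.0, 1)
```

In this model the dither grows with both the step size and the loop latency, which is consistent with the statement on the slide.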

Typical DLL Loop Filter

A charge pump acting as a perfect integrator, i.e. a filter transfer function $K_f/s$ with $K_f = I_{pmp}/C_{filt}$.

[Figure: charge pump with current sources I_up and I_dn switched by up and dn onto the capacitor C_FILT]

We could have a static phase offset if I_up is not equal to I_dn:
- Does not matter much for a bang-bang loop
- Can be an important issue in linear PD-based loops

Ways to reduce the mismatch:
- Current sources with high output impedance (cascoding)
- Differential charge pumps [23]
- Replica-feedback biased charge pumps [19]

Typical 2nd/3rd order PLL filter

For a VCO-based PLL, insert a resistor in series with the integrating capacitor C. An explicit C_2 is often used to suppress ripple.

[Figure: charge pump (I_up, I_dn, switched by up and dn) driving V_ctrl into a filter made of a series R-C branch with C_2 in parallel]

- Implement the resistor in a high resistivity layer (poly, diffusion, well)
- Difficult: the layer might not exist, or ρ might not be well controlled, or it might have a high TC

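PLL filter without real resistors

Sum a proportional current to an integrated current [21], i.e.:
\[ \frac{I_p}{sC} + K_p I_p R_o \]

[Figure: one charge pump (I_p, switched by up and dn) integrating onto C, and a second pump of K_p*I_p driving the resistance R_o with C_2, summed in the filter]

If R_o and I_p are scaled appropriately, we can achieve scaling of the loop bandwidth with operating frequency [19].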

Conclusion

Timing/phase-alignment circuits are crucial in system interconnect performance.

- The good news: no black magic required! With many architectures you only have to use basic control theory
- The bad news: a lot of opportunities to make a mistake. Noise is your worst enemy, but lots of techniques exist to alleviate it
- The challenge: make the right trade-offs early and optimize what matters for your system

Clocking References

1. S. Y. Sun, "An Analog PLL-Based Clock and Data Recovery Circuit with High Input Jitter Tolerance," JSSC, April 1989.
2. T. Lee and J. Bulzacchelli, "A 155-MHz Clock Recovery Delay and Phase Locked Loop," JSSC, December 1992.
3. L. DeVito, "A Versatile Clock Recovery Architecture and its Monolithic Implementation," in Monolithic PLL and CRC Circuits, Behzad Razavi, Ed.
4. B. Lai and R. Walker, "A Monolithic 622Mb/s Clock Extraction Data Retiming Circuit," ISSCC, February 1991.
5. M. Banu and A. Dunlop, "A 660Mb/s CMOS Clock CRC with Instantaneous Locking for NRZ Data and Burst-Mode Transmission," ISSCC, February 1993.
6. B. Kim et al., "A 30-MHz Hybrid Analog/Digital CRC in 2-um CMOS," JSSC, December 1990.
7. T. Hu, "A Monolithic 480 Mb/s Parallel AGC/Decision/Clock-Recovery Circuits in 1.2um CMOS," JSSC, Dec. 1993.
8. C.K. Yang et al., "A 0.5-um CMOS 4Gb/s transceiver with data recovery using oversampling," JSSC, May 1998.
9. K. Lee, "A CMOS serial link for fully duplexed data communication," JSSC, April 1995.
10. R. Mooney et al., "A 900 Mb/s bidirectional signalling interface," JSSC, Dec. 1995.
11. T. Takahashi, "A CMOS gate array with 600 Mb/s simultaneous bidirectional I/O circuits," JSSC, Dec. 1995.

12. S. Sidiropoulos and M. Horowitz, "A 700 Mbps/pin CMOS signalling interface using current integrating receivers," JSSC, May 1997.
13. M. Horowitz et al., "PLL Design for a 500 MB/s Interface," ISSCC, 1993.
14. F. Gardner, Phaselock Techniques, 2nd ed., John Wiley, 1979.
15. D.K. Jeong, "Design of PLL-based clock generation circuits," JSSC, April 1987.
16. M. Johnson and E. Hudson, "A Variable Delay Line PLL for CPU-Coprocessor Synchronization," JSSC, October 1988.
17. D. Draper, "Circuit techniques in a 266-MHz MMX-enabled processor," JSSC, Nov. 1997.
18. V. von Kaenel, "A 320 MHz, 1.5 mW @ 1.35 V CMOS PLL for microprocessor clock generation," JSSC, Nov. 1996.
19. J. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," JSSC, Nov. 1996.
20. I. Young, "A PLL Clock Generator with 5 to 110 MHz locking range for Microprocessors," JSSC, November 1992.
21. I. Novoff, "Fully integrated CMOS PLL with 15-200 MHz range and +/- 50-ps jitter," JSSC, Nov. 1995.
22. S. Sidiropoulos, "A semi-digital Delay Locked Loop," JSSC, December 1997.
23. T. Lee et al., "A 2.5V CMOS Delay-Locked Loop for an 18Mbit, 500 MByte/s DRAM," JSSC, Dec. 1994.