High Performance Digital Fractional-N Frequency Synthesizers
IEEE Distinguished Lecture, Lehigh Valley SSCS Chapter
Michael H. Perrott
October 2013
Copyright 2013 by Michael H. Perrott. All rights reserved.

Why Are Digital Phase-Locked Loops Interesting?
PLLs are needed for a wide range of applications
- Communication systems, digital processors
Performance is important
- Phase noise/jitter is often a limiting factor
Standard analog PLL implementations present issues
- Analog blocks pose design and verification challenges
- The cost of implementation is becoming too high
Can digital phase-locked loops offer excellent performance with a lower cost of implementation?

Just Enough PLL Background
What is a Phase-Locked Loop (PLL)?
[Figure: PLL block diagram with phase detector, analog loop filter, and VCO; signals e(t), v(t), out(t). de Bellescize, Onde Electr., 1932]
A Voltage Controlled Oscillator (VCO) efficiently provides an oscillating waveform with variable frequency
The PLL synchronizes the VCO frequency to the input reference frequency through feedback
- Key block is the phase detector, realized as digital gates that create pulsed signals

Integer-N Frequency Synthesizers
[Figure: PLL with a divider in the feedback path; signals div(t), e(t), v(t), out(t); F_out = N · F_ref. Sepe and Johnston, US Patent (1968)]
Use a digital counter structure to divide the VCO frequency
- Constraint: must divide by integer values
Use the PLL to synchronize the reference and divider output
Output frequency is digitally controlled through N

Fractional-N Frequency Synthesizers
[Figure: PLL whose divider is driven by a ΣΔ modulator with input M.F; signals N_sd[k], N[k], div(t), e(t), v(t), out(t); F_out = M.F · F_ref. Kingsford-Smith, US Patent (1974); Wells, US Patent (1984); Riley, US Patent (1989), JSSC 93]
Dither the divide value to achieve fractional divide values
- PLL loop filter smooths the resulting variations
Very high frequency resolution is achieved

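The dithering idea can be illustrated with a minimal sketch. It assumes a first-order accumulator as the modulator, which is a simplification chosen for clarity and not the modulator order used later in the talk; the function name and the example values (64 and 1/4) are hypothetical.

```python
from fractions import Fraction


def dithered_divide_values(integer_part, frac, n_cycles):
    """Emit integer_part or integer_part + 1 each reference cycle.

    A first-order accumulator adds the fractional part every cycle and
    bumps the divide value by one whenever the accumulator overflows.
    """
    acc = Fraction(0)
    values = []
    for _ in range(n_cycles):
        acc += frac
        carry = 1 if acc >= 1 else 0
        acc -= carry
        values.append(integer_part + carry)
    return values


# Hypothetical target divide value M.F = 64.25
values = dithered_divide_values(64, Fraction(1, 4), 8)
average = Fraction(sum(values), len(values))
print(values, float(average))
```

Each individual divide value is an integer, but the average over the sequence equals the fractional target; the loop filter is what performs that averaging in the PLL.
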
The Issue of Quantization Noise
[Figure: fractional-N synthesizer block diagram (F_out = M.F · F_ref) with the ΣΔ quantization noise spectrum plotted versus f]
ΣΔ quantization noise:
- Limits PLL bandwidth
- Increases linearity requirements of the phase detector

Striving for a Better PLL Implementation
Analog Phase Detection
[Figure: phase detector built from two D registers with a shared reset; its output error(t) feeds the analog loop filter, VCO, and divider loop]
Pulse width is formed according to the phase difference between two signals
Average of the pulsed waveform is applied to the VCO input

Tradeoffs of Analog Approach
[Figure: phase detector signals div(t) and error(t); phase detector characteristic plotting the average of error(t) versus phase error]
Benefit: average of the pulsed output is a continuous, linear function of phase error
Issue: analog loop filter implementation is undesirable

Issues with Analog Loop Filter
[Figure: charge pump with current I_cp driving integrating capacitor C_int to produce V_out, within the phase detector, loop filter, VCO, and divider loop]
Charge pump: output resistance, mismatch
Filter caps: leakage current, large area

Going Digital
[Figure: analog PLL (phase detector, analog loop filter, VCO, divider) compared with digital PLL (time-to-digital converter, digital loop filter, DCO, divider). Staszewski et al., TCAS II, Nov 2003]
Digital loop filter: compact area, insensitive to leakage
Challenges:
- Time-to-Digital Converter (TDC)
- Digitally-Controlled Oscillator (DCO)

Outline of Talk
Basics
- Time-to-Digital Converters (TDCs)
- Digitally-Controlled Oscillators (DCOs)
Modeling
- Transfer function modeling
- Noise analysis
Improving the TDC
- Background
- Gated-Ring Oscillator (GRO) structure
A high performance digital PLL example
- GRO TDC for low noise
- Quantization noise cancellation
- Low jitter divider design

Classical Time-to-Digital Converter
[Figure: delay chain with a register at each tap; example thermometer-code output e[k] = 1 1 1 0 0; the TDC replaces the phase detector in the digital PLL]
Single delay chain structure, with resolution set by the delay of one element
- Phase error is measured with delays and registers
Corresponds to a flash architecture

Modeling of TDC
[Figure: phase detector characteristic (detector output versus time error, a staircase with average slope 1/Δt_del and quantization error); model with gain T/2π from phase error[k] to time error, added t_q[k], and TDC gain 1/Δt_del producing e[k]; T is the reference period]
Phase error is converted to time error by the scale factor T/2π
TDC introduces quantization error: t_q[k]
TDC gain, 1/Δt_del, is set by the average delay per step, Δt_del

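The model above can be sketched numerically. This is a minimal illustration that assumes an ideal delay chain with identical delays and a floor-type quantizer; the function name and the example numbers (20 ns reference period, 20 ps delay) are hypothetical.

```python
import math


def tdc_model(phase_error_rad, t_ref, dt_del):
    """Return (code, t_q) for an idealized single delay chain TDC.

    The phase error is scaled to a time error by T/(2*pi), then counted
    in whole delay steps. t_q is the quantization error referred to the
    TDC input, so that code = (time_error + t_q) / dt_del.
    """
    time_error = phase_error_rad * t_ref / (2 * math.pi)
    code = math.floor(time_error / dt_del)
    t_q = code * dt_del - time_error
    return code, t_q


# Hypothetical numbers: a phase error of 0.25% of a 20 ns period is 50 ps
code, t_q = tdc_model(2 * math.pi * 0.0025, 20e-9, 20e-12)
print(code, t_q)
```

With these numbers the 50 ps time error spans two full 20 ps steps, and the remaining 10 ps appears as quantization error.
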
Impact of Limited Resolution and Delay Mismatch
[Figure: delay chain outputs where the delay varies due to mismatch; phase detector characteristic of detector output versus phase error]
Integer-N PLL
- Limit cycles due to limited resolution (unless reference noise is high)
Fractional-N PLL
- Fractional spurs due to non-linearity from delay mismatch

A Straightforward Approach for Achieving a DCO
[Figure: LC oscillator whose varactors are driven by a DAC (analog control), used as the DCO in the digital PLL. Ferriss, ISSCC 2007; Hsu, ISSCC 2008]
Use a DAC to control a conventional LC oscillator
- Allows the use of an existing VCO within a digital PLL
- Can be applied across a broad range of IC processes

A Much More Digital Implementation
[Figure: LC oscillator with varactors under digital control, used as the DCO in the digital PLL. Staszewski et al., TCAS II, Nov 2003]
Adjust frequency in an LC oscillator by switching in a variable number of small capacitors
- Most effective for CMOS processes of 0.13 µm and below

Leveraging Segmentation in Switched Capacitor DCO
[Figure: coarse control via a binary array (1x, 2x, 4x, ..., 2^n x) and fine control via a unit element array (1x, 1x, 1x, 1x)]
Similar tradeoffs as segmented capacitor DAC structures
- Binary array: efficient control, but may lack monotonicity
- Unit element array: monotonic, but complex control
Coarse and fine control segmentation of DCO
- Coarse control: active only during initial frequency tuning; binary array provides efficient control implementation
- Fine control: controlled by PLL feedback; unit element array minimizes dynamic charge transfer

Leveraging Dithering for Fine Control of DCO
[Figure: DCO with coarse control (initial frequency tuning) and fine control driven by a digital ΣΔ modulator clocked from a divide-by-K of out(t); the digital loop filter runs at period T and the modulator at period T_c = T/M]
Increase resolution by dithering of the fine cap array
Reduce noise from dithering by
- Using small unit caps in the fine cap array
- Increasing the dithering frequency (defined as 1/T_c)
Assume 1/T_c = M/T (i.e., M times the reference frequency)

Noise Spectrum of a Switched Cap DCO
[Figure: model in which raw quantization noise q_raw[k] is shaped by H_ntf(z), z = e^{j2πfT_c}, to give q[k], then scaled by T_c and integrated by 2πK_v/s, s = j2πf, adding to phase noise at Φ_out(t)]
Phase noise
- Same as for a conventional VCO (tank Q, etc.)
Quantization noise from dithering
- Designed to be lower than phase noise

Modeling
Overall Digital PLL Model
[Figure: Φ_ref[k] and Φ_div[k] are compared, scaled by T/2π, summed with TDC-referred noise t_q[k] (spectrum S_tq(e^{j2πfT})), and passed through TDC gain 1/Δt_del to form e[k]; e[k] drives loop filter H(z), z = e^{j2πfT}, a DT-CT block T, and the DCO 2πK_v/s, s = j2πf, where DCO-referred noise Φ_n(t) (spectrum S_Φn(f), -20 dB/dec) adds to give Φ_out(t); feedback is through divider 1/N and a CT-DT block 1/T]
TDC- and DCO-referred noise influence overall phase noise according to their associated transfer functions to the output
Calculations involve both discrete and continuous time

Key Transfer Functions
[Figure: digital PLL model, annotated with the transfer functions from TDC-referred noise t_q[k] and from DCO-referred noise Φ_n(t) to Φ_out(t)]

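Utilize G(f) as a Parameterizing Function
[Figure: digital PLL model with blocks T/2π, 1/Δt_del, H(z), T, 2πK_v/s, 1/N, 1/T]
Define open loop transfer function A(f) as the product of the gains around the loop:
$A(f) = \frac{T}{2\pi} \cdot \frac{1}{\Delta t_{del}} \cdot H(e^{j2\pi fT}) \cdot T \cdot \frac{2\pi K_v}{j2\pi f} \cdot \frac{1}{N} \cdot \frac{1}{T}$
Define closed loop parameterizing function G(f) as:
$G(f) = \frac{A(f)}{1 + A(f)}$
- Note: G(f) is a lowpass filter with DC gain = 1
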
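Transfer Function Parameterization Calculations
TDC-referred noise:
$\frac{\Phi_{out}}{t_q} = \frac{\frac{1}{\Delta t_{del}} H(e^{j2\pi fT}) \, T \, \frac{2\pi K_v}{j2\pi f}}{1 + A(f)} = 2\pi N \frac{A(f)}{1 + A(f)} = 2\pi N \, G(f)$
DCO-referred noise:
$\frac{\Phi_{out}}{\Phi_n} = \frac{1}{1 + A(f)} = 1 - G(f)$
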
Key Observations
[Figure: digital PLL model, repeated]
TDC-referred noise
- Lowpass with a DC gain of 2πN
DCO-referred noise
- Highpass with a high frequency gain of 1
How do we calculate the output phase noise?

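Phase Noise Calculation
[Figure: TDC-referred noise S_tq(e^{j2πfT}) passes through 2πN G(f) and DCO-referred noise S_Φn(f) (-20 dB/dec) passes through 1 - G(f), with the filter corner marked at f_o, and the two sum at Φ_out(t); output phase noise in dBc/Hz versus f]
Output phase noise components:
$\frac{1}{T} \left| 2\pi N \, G(f) \right|^2 S_{t_q}(e^{j2\pi fT})$ (TDC noise)
$\left| 1 - G(f) \right|^2 S_{\Phi_n}(f)$ (DCO noise)
TDC noise
- Dominates PLL phase noise at low frequency offsets
DCO noise
- Dominates PLL phase noise at high frequency offsets
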
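The TDC term can be evaluated in band, where |G(f)| is close to 1. The sketch below assumes a classical TDC whose quantization error is white and uniformly distributed, so that S_tq = Δt_del²/12; that assumption, the function name, and the example numbers (50 MHz reference, N = 72, 20 ps delay) are illustrative and not taken from the slides.

```python
import math


def tdc_inband_noise_dbc_hz(dt_del, n_div, t_ref):
    """In-band output phase noise due to TDC quantization, in dBc/Hz.

    Uses (1/T) * |2*pi*N*G(f)|^2 * S_tq with |G(f)| = 1 (in band) and
    S_tq = dt_del^2 / 12 (assumed white, uniform quantization error).
    """
    s_tq = dt_del ** 2 / 12.0
    s_out = (1.0 / t_ref) * (2 * math.pi * n_div) ** 2 * s_tq
    return 10 * math.log10(s_out)


# Hypothetical example: 3.6 GHz output from a 50 MHz reference (N = 72)
noise_20ps = tdc_inband_noise_dbc_hz(20e-12, 72, 1 / 50e6)
noise_10ps = tdc_inband_noise_dbc_hz(10e-12, 72, 1 / 50e6)
print(round(noise_20ps, 1), round(noise_20ps - noise_10ps, 2))
```

Under these assumptions a 20 ps delay gives about -94.7 dBc/Hz in band, and each halving of the delay buys roughly 6 dB, which is why TDC resolution limits how wide the PLL bandwidth can be made.
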
A Closer Look at the Influence of TDC Noise
[Figure: output phase noise (dBc/Hz versus f) for low PLL bandwidth and for high PLL bandwidth, showing the TDC noise and DCO noise contributions around f_o in each case]
PLL bandwidth dramatically influences the relative impact of TDC and DCO noise
Want high PLL bandwidth? Need low TDC noise

How Do We Improve TDC Performance?
Two Key Issues:
- TDC resolution
- Mismatch

Improve Resolution with Vernier Delay Technique
[Figure: single delay chain TDC compared with a Vernier TDC that adds a second chain of Delay2 elements; example output e[k] = 1 1 1 0 0]
Effective resolution: Delay - Delay2

Issues with Vernier Approach
[Figure: Vernier TDC with Delay and Delay2 chains; effective resolution: Delay - Delay2]
Mismatch issues are more severe than in the single delay chain TDC
- Reduced delay is formed as the difference of two delays
Large measurement range requires large area
- Initial PLL frequency acquisition may require a large range

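Both points, the finer resolution and the area cost, show up in a small behavioral sketch. It assumes ideal, matched delays expressed as integers in picoseconds; the function names, the assignment of the leading edge to the slower chain, and the example delays (30 ps and 25 ps) are hypothetical.

```python
def vernier_code(t_in, d_slow, d_fast, n_stages):
    """Count the stages at which the leading edge still arrives first.

    The leading edge travels down the slower chain (d_slow per stage) and
    the lagging edge, t_in later, travels down the faster chain (d_fast
    per stage), so the gap closes by d_slow - d_fast at every stage.
    """
    count = 0
    for i in range(1, n_stages + 1):
        lead_arrival = i * d_slow
        lag_arrival = t_in + i * d_fast
        if lead_arrival < lag_arrival:
            count += 1
    return count


def stages_for_range(t_range, resolution):
    """Number of stages needed to cover t_range (ceiling division)."""
    return -(-t_range // resolution)


# Hypothetical delays of 30 ps and 25 ps give a 5 ps effective resolution
print(vernier_code(47, 30, 25, 16))        # 47 ps input measured in 5 ps steps
print(stages_for_range(2000, 30 - 25))     # stages for a 2 ns range, Vernier
print(stages_for_range(2000, 30))          # stages for a 2 ns range, single chain
```

The Vernier chain needs several times more stages than a single 30 ps chain to cover the same range, which is the area penalty the slide refers to.
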
Two-Step TDC Architecture Allows Area Reduction
[Figure: single delay chain producing coarse e[k], followed by a mux, logic, and a Vernier stage (Delay and Delay2 chains, resolution Delay - Delay2) producing fine e[k]. Ramakrishnan, Balsara, VLSID 06]
Single delay chain provides coarse resolution
(Folded) Vernier provides fine resolution

Two-Step TDC Using Time Amplification
[Figure: single delay chain producing coarse e[k], then a mux and time amplifier feeding a second single delay chain producing fine e[k]. Simplified view of: Lee, Abidi, VLSI 2007]
Single delay chain provides coarse and fine resolution
Time amplification is used to improve resolution

Leveraging Metastability to Create a Time Amplifier
[Figure: D latch used as a time amplifier; input time difference Δt_in produces a larger output time difference Δt_out. Simplified view of: Abas, et al., Electronics Letters, Nov 2002 (note that the actual implementation uses an SR latch)]
Metastability leads to progressively slower output transitions as the setup time on the latch is encroached upon
- Time difference at the input is amplified at the output

Interpolating time-to-digital converter
[Figure: Start(t) propagates through a delay chain, and registers clocked by Stop(t) capture the interpolated edges; T_in is the measured interval and T_q the quantization step. Henzler et al., ISSCC 2008]
Interpolate between edges to achieve fine resolution
Cyclic approach can also be used for large range

An Oscillator-Based TDC
[Figure: ring oscillator output Osc(t) drives a counter that is read into a register and reset by logic on each div(t) measurement; example with Phase Error[1] and Phase Error[2] giving e[k] = 3, 3]
Output e[k] corresponds to the number of oscillator edges that occur during the measurement time window
Advantages
- Extremely large range can be achieved with compact area
- Quantization noise is scrambled across measurements

A Closer Look at Quantization Noise Scrambling
[Figure: oscillator-based TDC timing diagram showing Count[k], e[k] = 3, 3, and quantization errors -q[0], q[1], -q[2], q[3] at the edges of the measurement intervals]
Quantization error occurs at the beginning and end of each measurement interval
As a rough approximation, assume error is uncorrelated between measurements
- Averaging of measurements improves effective resolution

Deterministic quantizer error vs. scrambled error
Deterministic TDCs do not provide inherent scrambling
For oversampling benefit, TDC error must be scrambled!
Some systems provide input scrambling (fractional-N PLL), while others do not (integer-N PLL)

Proposed GRO TDC Structure
A Gated Ring Oscillator (GRO) TDC
[Figure: ring oscillator with an enable input; timing diagram showing Count[k], e[k] = 3, 4, and quantization errors -q[0], q[1], -q[1], q[2], where the error that ends one measurement is the error that starts the next]
Enable the ring oscillator only during measurement intervals
- Hold the state of the oscillator between measurements
Quantization error becomes first order noise shaped!
- e[k] = Phase Error[k] + q[k] - q[k-1]
- Averaging dramatically improves resolution!

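The effect of holding the oscillator state can be seen in a small behavioral sketch. It assumes an ideal oscillator with a fixed delay per edge and perfect gating, uses integer picoseconds to avoid rounding issues, and compares against a hypothetical quantizer that restarts from zero phase on every measurement; all names and numbers are illustrative.

```python
def gro_counts(intervals, delay):
    """Edges counted in each enable window when the phase is held between windows."""
    elapsed = 0        # total enabled time so far; the oscillator is frozen otherwise
    edges_seen = 0
    counts = []
    for t in intervals:
        elapsed += t
        total_edges = elapsed // delay
        counts.append(total_edges - edges_seen)   # residue carries into the next window
        edges_seen = total_edges
    return counts


def restarted_counts(intervals, delay):
    """Edges counted when the quantizer restarts from zero phase every window."""
    return [t // delay for t in intervals]


# Hypothetical numbers: a constant 235 ps input measured with 100 ps edges
intervals = [235] * 10
gated = gro_counts(intervals, 100)
restarted = restarted_counts(intervals, 100)
print(gated, sum(gated) / len(gated))
print(restarted, sum(restarted) / len(restarted))
```

Because each window's leftover phase is carried forward, the q[k] - q[k-1] terms telescope and the average of the gated counts approaches the true value of 2.35 steps, while the restarted quantizer stays at 2.0 no matter how many measurements are averaged.
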
Improve Resolution By Using All Oscillator Phases
[Figure: GRO with counters on all oscillator phases; timing diagram with e[k] = 11, 10 and quantization errors -q[0], q[1], -q[1], q[2]. Helal, Straayer, Wei, Perrott, VLSI 2007]
Raw resolution is set by inverter delay
Effective resolution is dramatically improved by averaging

GRO TDC Also Shapes Delay Mismatch
[Figure: enable windows for Measurements 1 through 4 stepping around the ring of delay elements]
Barrel shifting occurs through delay elements across different measurements
- Mismatch between delay elements is first order shaped!

Simple gated ring oscillator inverter-based core
[Figure: (a) enabled ring oscillator and (b) disabled ring oscillator, with stage outputs Vo_1 ... Vo_n; delay element built from transistors M1 to M4 with Enable-controlled connections]
Gate the oscillator by switching the inverter cores to the power supply

GRO Prototype
[Figure: 15-stage gated ring oscillator with enable(t) generated by an SR latch, followed by logic producing error[k]. Straayer, Perrott]
GRO implemented as a custom 0.13 µm CMOS IC

Measured GRO Results Confirm Noise Shaping
[Figure: test setup with a variable delay driving the 15-stage GRO; measured spectrum, amplitude (dB, -30 to 40) versus frequency (0.01 to 100 MHz), showing the input variable delay signal, harmonics due to nonlinearity of the variable delay, and noise shaped quantization noise]

Measured deadzone behavior of inverter-based GRO
Deadzones were caused by errors in gating the oscillator
GRO injection locked to an integer ratio of F_S
Behavior occurred for almost all integer boundaries, and some fractional values as well
Noise shaping benefit was limited by this gating error

Next Generation GRO: Multi-path oscillator concept
[Figure: delay element with a single input and single output compared with one having multiple inputs and a single output]
Use multiple inputs for each delay element instead of one
Allow each stage to optimally begin its transition based on information from the entire GRO phase state
Key design issue is to ensure primary mode of oscillation

Multi-path inverter core
[Figure: multi-path inverter core schematic. Lee, Kim, Lee, JSSC 1997; Mohan et al., CICC 2005]

Proposed multi-path gated ring oscillator
Hsu, Straayer, Perrott, ISSCC 2008
Oscillation frequency near 2 GHz with 47 stages
Reduces effective delay per stage by a factor of 5-6!
Represents a factor of 2-3 improvement compared to previous multi-path oscillators

A simple measurement approach
[Figure: N-stage gated ring oscillator enabled between Start and Stop, with counters, reset logic, and a register producing e[k]. Helal, Straayer, Perrott, VLSI 2007]
2 counters per stage × 47 stages = 94 counters, each at 2 GHz
Power consumption for these counters is unreasonable
Need a more efficient way to measure the multi-path GRO

Count Edges by Sampling Phase
Calculate phase from:
- A single counter for coarse phase information (keeps track of phase wrapping)
- GRO phase state for fine count information
1 counter and N registers are much more efficient

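A minimal sketch of the arithmetic behind this approach follows. It assumes the sampled phase state has already been decoded into an integer position within one oscillator cycle, and that one cycle of an N-stage ring contains 2N edges; the decoding logic, the function names, and the sample values are hypothetical.

```python
def total_edge_count(wrap_count, fine_position, n_stages):
    """Combine the coarse wrap counter with the fine position in the ring."""
    edges_per_cycle = 2 * n_stages      # each stage contributes a rising and a falling edge
    if not 0 <= fine_position < edges_per_cycle:
        raise ValueError("fine_position must lie within one oscillator cycle")
    return wrap_count * edges_per_cycle + fine_position


def tdc_output(prev_sample, curr_sample, n_stages):
    """Edges elapsed between two (wrap_count, fine_position) samples."""
    return (total_edge_count(*curr_sample, n_stages)
            - total_edge_count(*prev_sample, n_stages))


# Hypothetical samples for a 47-stage ring (94 edges per cycle)
print(tdc_output((0, 90), (3, 5), 47))
```

The output is the difference of two sampled phases, so only one counter has to run at the oscillator rate while the fine information comes from registers that capture the ring state.
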
Proposed Multi-Path Measurement Structure
Multi-path structure leads to ambiguity in edge position
Partition into 7 cells to avoid such ambiguity
Requires 7 counters rather than 1, but power still OK

Prototype 0.13 µm CMOS multi-path GRO-TDC
[Figure: Start and Stop drive timing generation, which enables the 47-stage gated ring oscillator; a state register and seven measurement cells sample phases Z 1-47 on CLK, and an adder forms Out. Straayer et al., VLSI 2008]
Two implemented versions:
- 8-bit, 500 Msps
- 11-bit, 100 Msps
2-21 mW power consumption depending on input duty cycle

Measured noise-shaping of multi-path GRO
[Figure (a): power spectral density (dB ps^2/Hz, -100 to -40) versus frequency (10^4 to 10^7 Hz), 65,536-point FFT (Hanning window + 20x averaging), for an input of 1.2 ps pp, with noise of 80 fs rms in 1 MHz BW, compared with the ideal variance of a 50-Msps quantizer with 1 ps steps]
[Figure (b): filtered TDC output (278.6 to 279.2) versus time (0 to 200 µs), showing the TDC output after a 1 MHz LPF and the 1.2 ps input]
Data collected at 50 Msps
More than 20 dB of noise-shaping benefit
80 fs rms integrated error from 2 kHz-1 MHz
Floor primarily limited by 1/f noise (up to 0.5-1 MHz)

Measured deadzone behavior for multi-path GRO
Only deadzones for outputs that are multiples of 2N
- 94, 188, 282, etc.
- No deadzones for other even or odd integers, or for fractional outputs
Size of deadzone is reduced by 10x

The Issue of Quantization Noise Due to Divider Dithering
The Nature of the Quantization Noise Problem
[Figure: synthesizer with PFD, loop filter, and N/N+1 divider whose 1-bit frequency selection comes from an M-bit ΔΣ modulator; quantization noise spectrum, PLL dynamics, and output spectrum around F_out showing ΔΣ noise]
Increasing PLL bandwidth increases the impact of fractional-N noise
- Cancellation offers a way out!

Previous Analog Quantization Noise Cancellation
Phase error due to ΣΔ is predicted by accumulating ΣΔ quantization error
Gain matching between PFD and D/A must be precise
Matching in the analog domain limits performance

Proposed All-digital Quantization Noise Cancellation
Hsu, Straayer, Perrott, ISSCC 2008
Scale factor determined by simple digital correlation
Analog non-idealities such as DC offset are completely eliminated

Details of Proposed Quantization Noise Cancellation
Correlator output is accumulated and filtered to achieve the scale factor
- Settling time chosen to be around 10 µs
See analog version of this technique in Swaminathan et al., ISSCC 2007

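The correlate-and-accumulate idea can be sketched as a generic LMS-style gain estimator. This is a behavioral illustration under stated assumptions and not the circuit from the talk: the TDC output is modeled as an unknown gain times the predicted quantization-induced error plus unrelated noise, and the function name, step size mu, and signal statistics are all hypothetical.

```python
import random


def estimate_scale_factor(tdc_out, predicted, mu):
    """Adapt a gain so that tdc_out - gain * predicted is uncorrelated with predicted."""
    gain = 0.0
    for e, p in zip(tdc_out, predicted):
        residue = e - gain * p          # TDC output after cancellation
        gain += mu * residue * p        # correlate with the prediction and accumulate
    return gain


# Hypothetical test signal: unknown gain of 3.7 plus a small unrelated noise term
rng = random.Random(0)
predicted = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
tdc_out = [3.7 * p + rng.gauss(0.0, 0.1) for p in predicted]
print(estimate_scale_factor(tdc_out, predicted, mu=0.01))
```

The accumulator settles when the residue no longer correlates with the predicted error, which is the condition under which the quantization noise has been removed from the TDC output; the step size trades settling time against ripple in the estimate.
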
Proposed Digital Wide BW Synthesizer
Gated-ring-oscillator (GRO) TDC achieves low in-band noise
All-digital quantization noise cancellation achieves low out-of-band noise
Design goals:
- 3.6-GHz carrier, 500-kHz bandwidth
- <-100 dBc/Hz in-band, <-150 dBc/Hz at 20 MHz offset

Overall Synthesizer Architecture
Note: Detailed behavioral simulation model available at http://www.cppsim.com

Dual-Port LC VCO
Frequency tuning:
- Use a small 1X varactor to minimize noise sensitivity
- Use another 16X varactor to provide moderate range
- Use a four-bit capacitor array to achieve 3.3-4.1 GHz range

Digitally-Controlled Oscillator with Passive DAC
1X varactor minimizes noise sensitivity
16X varactor provides moderate range
A four-bit capacitor array covers 3.3-4.1 GHz
Goals of 10-bit DAC
- Monotonic
- Minimal active circuitry and no transistor bias currents
- Full-supply output range

Operation of 10-bit Passive DAC (Step 1)
5-bit resistor ladder; 5-bit switch-capacitor array
Step 1: Capacitors Charged
- Resistor ladder forms V_L = (M/32) V_DD and V_H = ((M+1)/32) V_DD, where M ranges from 0 to 31
- N unit capacitors charged to V_H, and (32-N) unit capacitors charged to V_L

Operation of 10-bit Passive DAC (Step 2)
Step 2: Disconnect Capacitors from Resistors, Then Connect Together
- Achieves DAC output with first-order filtering
- Bandwidth = 32 C_u / (2π C_load) × 50 MHz
- Determined by capacitor ratio
- Easily changed by using different C_load

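The two steps can be checked with a short charge-sharing calculation. The sketch assumes ideal, equal unit capacitors and ignores the load capacitor's filtering; splitting the 10-bit code into an upper 5-bit ladder index M and a lower 5-bit capacitor count N is an assumption made for illustration, as are the function name and the 1.5 V default supply.

```python
from fractions import Fraction


def passive_dac_output(code, vdd=Fraction(3, 2)):
    """Settled output voltage of the two-step passive DAC for a 10-bit code."""
    if not 0 <= code < 1024:
        raise ValueError("code must be a 10-bit value")
    m, n = divmod(code, 32)                 # assumed split: M = upper 5 bits, N = lower 5 bits
    v_l = Fraction(m, 32) * vdd             # step 1: resistor ladder tap voltages
    v_h = Fraction(m + 1, 32) * vdd
    # Step 2: N caps at V_H and (32 - N) caps at V_L share charge
    return (n * v_h + (32 - n) * v_l) / 32


print(float(passive_dac_output(512)))       # mid-scale code
outputs = [passive_dac_output(c) for c in range(1024)]
print(all(b > a for a, b in zip(outputs, outputs[1:])))   # monotonic across all codes
```

With ideal elements the charge-shared voltage works out to (32M + N)/1024 of the supply, so the output steps uniformly and monotonically through segment boundaries without any active circuitry.
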
Dual-Path Loop Filter
Step 1: reset
Step 2: frequency acquisition
- V_c(t) varies
- V_f(t) is held at midpoint
Step 3: steady-state lock conditions
- V_c(t) is frozen to take quantization noise away
- ΣΔ quantization noise cancellation is enabled

Fine-Path Loop Filter
Equivalent to an analog lead-lag filter
- Set zero (62.5 kHz) and first pole (1.1 MHz) digitally
- Set second pole (3.1 MHz) by capacitor ratio
First-order ΣΔ reduces in-band quantization noise

A Closer Look at the Frequency Divider
[Figure: digital PLL with TDC, digital loop filter, DCO, and a divider whose value N[k] is set by a ΔΣ modulator; signals N_sd[k], N[k], div(t), out(t)]
Delta-Sigma modulator dithers the divider value
- Divider must support a range of divide values
- Want to maintain low jitter as divide value changes

The Issue of Divide Value Delay Variation
[Figure: divider timing from in(t) to div(t), where delays T_delay[k-1] and T_delay[k] differ as N[k] changes; plot of T_delay versus N]
Divider input to output delay is a function of divide value
- Adds significant jitter for dynamic divide value variation

Key Observation
[Figure: N[k] = N_1[k] + N_2[k]; timing of in(t), the higher frequency divider output div_x2(t), and div(t), with delays T_delay[k-1] and T_delay[k]]
We can realize a given divide value as the sum of lower divide values
- Only pass select edges from the higher frequency divider

Application of Divider Concept to Digital PLL
Example: desired frequency division range is 64 to 127
Dithered by a third order Delta-Sigma modulator

Proposed Divider Structure
Divide value = N_0 + N_1 + N_2 + N_3
Increase division frequency by a factor of 4
- Only pass one of four divider edges to GRO TDC

Removal of Divide Value Delay Variation
Place ΣΔ dithered edge (N_2) on an edge not passed to the GRO
- Divide value (N_3) is constant for the edge that passes to the GRO

Die Photo of Prototype
0.13-μm CMOS
Active area: 0.95 mm^2
Chip area: 1.96 mm^2
V_DD: 1.5 V
Current:
- 26 mA (Core)
- 7 mA (VCO output buffer at 1.1 V)
GRO-TDC:
- 2.3 mA
- 157 × 252 µm^2

Power Distribution of Prototype IC
[Figure: pie chart of power by block (Divider, DAC, VCO, Ref. Buffer, GRO-TDC, Digital, VCO Pad Buffer) with slices of 1.4 mW (3%), 2.8 mW (6%), 3.0 mW (7%), 21.0 mW (46%, VCO), 6.8 mW (15%), 7.7 mW (17%), and 3.4 mW (7%)]
Total Power: 46.1 mW
Notice GRO and digital quantization noise cancellation have only minor impact on power (and area)

Measured Phase Noise at 3.67 GHz
Suppresses quantization noise by more than 15 dB
Achieves 204 fs (0.27 degree) integrated noise (jitter)
Reference spur: -65 dBc

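The two jitter figures quoted above are the same quantity in different units, which a short conversion confirms. The function name is hypothetical; the conversion itself is just jitter as a fraction of the carrier period, expressed in degrees.

```python
def jitter_to_degrees(jitter_s, carrier_hz):
    """Convert rms jitter in seconds to rms phase in degrees at the carrier."""
    return 360.0 * jitter_s * carrier_hz


print(round(jitter_to_degrees(204e-15, 3.67e9), 2))
```

At 3.67 GHz one carrier period is about 272 ps, so 204 fs is a little under 0.1% of a cycle.
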
Calculation of Phase Noise Components
[Figure: calculated phase noise (dBc/Hz, -40 to -180) versus f offset (10^3 to 10^7 Hz), showing VCO noise, fine-path ΣΔ quantization noise, fine tune DAC thermal noise, coarse tune DAC thermal noise, divider noise (1% left), GRO noise, ref noise, and closed loop noise]
See wideband digital synthesizer tutorial available at http://www.cppsim.com

Conclusions
Digital Phase-Locked Loops look extremely promising for future applications
- Very amenable to future CMOS processes
- Excellent performance can be achieved
A low-noise, wide-bandwidth digital ΣΔ fractional-N frequency synthesizer is achieved with
- High performance noise-shaping GRO TDC
- Quantization noise cancellation in digital domain
Key result: < 250 fs integrated noise with 500 kHz bandwidth
Innovation of future digital PLLs will involve joint circuit/algorithm development