20Gb/s 0.13um CMOS Serial Link


20Gb/s 0.13um CMOS Serial Link
Patrick Chiang (pchiang@stanford.edu)
Bill Dally (billd@csl.stanford.edu)
Ming-Ju Edward Lee (ed@velio.com)
Computer Systems Laboratory, Stanford University

Outline
- Motivation
- Background
  - Static phase offset
  - Random/power supply induced jitter
- Proposed 20Gb/s transceiver
  - New Architecture
  - Circuit Blocks
  - Receiver Design
- Preliminary Results
- Conclusion

I/O Bandwidth is Limiting Factor
- Predicted off-chip bandwidth is growing slower than on-chip bandwidth
[Figure: Predicted Maximum On-Chip vs. Maximum Off-Chip Bandwidth. Vertical axis: terabits/sec (0 to 60). Horizontal axis: year (1999 to 2005). Series: Off Chip BW, On Chip BW.]
- Total I/O BW is calculated from total I/O pins * I/O bandwidth/pin
- Total on-chip BW is calculated from on-chip clock frequency * # wires/chip
- Higher bit rate I/Os are needed to close this gap

20Gb/s 0.13um CMOS Transceiver Goals
- Design an I/O architecture that minimizes timing uncertainty
  - Systematic/static phase offset
  - Random/power supply induced jitter
  - Not addressing channel equalization
- Reasonable power dissipation (200mW/link)
- Small area footprint (500um x 500um) for high integration on a single chip


Static Phase Offset: Ideal Transceiver
[Figure: transmitter (Outa, Outb) driving a receiver (Ina, Inb) with its sampling clock. Timing diagrams of the ideal transmitter output and the ideal receiver input against the sampling clocks: every interval is labeled 25ps and every sampling margin 12ps.]
- Timing margin = 12ps

Static Phase Offset: Reality
[Figure: transmitter (Outa, Outb) driving a receiver (Ina, Inb) with its sampling clock, with a 10ps static phase offset. Timing diagrams of the actual transmitter output and the actual receiver input against the sampling clocks: intervals are labeled 25ps, 35ps, 25ps, 15ps and sampling margins 17ps and 7ps.]
- Timing margin = 7ps (42% reduction)

Power Supply Induced Jitter
[Figure: transmitter (Outa, Outb) and receiver (Ina, Inb, sampling clock), each with supply noise on VDD. Timing diagrams of the actual transmitter output and the actual receiver output against the sampling clocks, each with 10ps pk-pk supply induced jitter: intervals are labeled 25ps and 15ps, jitter windows 10ps, and sampling margins 2ps.]
- Timing margin = 2ps
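The three timing-margin figures (12ps, 7ps, 2ps) can be reproduced with a small calculation. This is a minimal sketch under stated assumptions, not a model taken from the slides: the sampling clock is assumed to sit in the middle of the shortest interval, a static phase offset is assumed to shorten that interval by its full value, and pk-pk jitter is assumed to remove half its value from the margin. The function name `timing_margin_ps` is hypothetical.

```python
def timing_margin_ps(interval_ps, static_offset_ps=0.0, jitter_pkpk_ps=0.0):
    """Distance from a centered sampling clock to the nearest data edge.

    Assumptions (not stated on the slides): the clock samples mid-interval,
    the static offset shortens the worst-case interval by its full value,
    and peak-to-peak jitter moves an edge by at most half its value.
    """
    shortest_interval = interval_ps - static_offset_ps
    return shortest_interval / 2.0 - jitter_pkpk_ps / 2.0


ideal = timing_margin_ps(25)                 # ideal transceiver
with_offset = timing_margin_ps(25, 10)       # 10ps static phase offset
with_jitter = timing_margin_ps(25, 10, 10)   # plus 10ps pk-pk supply jitter

# Reduction quoted on the slide, using its whole-picosecond values.
reduction = 1 - 7 / 12
```

The results are 12.5ps, 7.5ps, and 2.5ps; the slides appear to quote these as whole picoseconds (12, 7, 2). With the slide's values, the reduction from 12ps to 7ps is about 42%.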

20Gb/s Transmitter Design Spaces
[Figure: transmitter design space, with the selected option labeled "Choose this Architecture".]


New Architecture
- Timing uncertainty is based solely on the last stages, clocked by the 10GHz clock
[Figure: block diagram. 8 data signals @ 2.5Gb/s enter two 4:1 muxes (D0 to D3 each) driven by dirty multi-phase clocks; each mux produces 10Gb/s. Each 10Gb/s stream passes through a 10GHz latch into the 2:1 output mux, both driven by a clean 2-phase 10GHz CLK, producing a clean 20Gb/s output.]

New Architecture Reduces Jitter/Phase Offset
[Figure: timing diagram. Two 10Gb/s data streams with 100ps bit times, offset by 50ps: Mid0a/Mid0b carries A, C, E and Mid1a/Mid1b carries B, D. The 2-phase 10GHz clock has 50ps phases. The 20Gb/s output (Outa/Outb) is A, B, C, D, E.]
- Jitter/static phase offset can be tolerated on the two 10Gb/s data streams

20Gb/s Transmitter
- Low static phase offset
- Low supply induced jitter
- No post-PLL buffers

20Gb/s Output Stage
[Figure: output stage schematic. Outa and Outb each connect to Vdd through 25 Ohms. Data inputs: Data0_10g, Data0b_10g, Data1_10g, Data1b_10g. Clock inputs: Clk_10g, Clkb_10g, coming directly from the LC tank.]
- 10GHz clock is sourced directly from the LC oscillator tank
  - No post-PLL buffer jitter
  - Low static phase offset
- Simulated data-dependent jitter is minimal
Calibration Scheme
- Send a DC-balanced 1010 pattern
- Sample the 20Gb/s output with an uncorrelated clock
- Adjust the variable capacitance based upon the output sampling histogram
[Figure: FSM clocked by an uncorrelated random clock.]
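One way to read the calibration scheme above is as a duty-cycle measurement: if the 1010 pattern is sampled at times unrelated to the data clock, the fraction of samples that read 1 estimates how much of the pattern period the "1" bit occupies, and a balanced output gives 50%. The sketch below illustrates that reading only. It is not the FSM from the slides: `ones_fraction`, `calibrate`, and `example_duty` are hypothetical names, the uncorrelated clock is replaced by a deterministic irrational phase step, and the mapping from capacitor code to duty cycle is invented for the example.

```python
def ones_fraction(duty, n_samples=10000, step=0.6180339887):
    """Fraction of samples that read 1 on a repeating 1010 pattern.

    `duty` is the share of the two-bit pattern period occupied by the "1" bit.
    The sampling phase advances by an irrational step each sample, a
    deterministic stand-in for an uncorrelated clock.
    """
    ones = 0
    phase = 0.0
    for _ in range(n_samples):
        if phase < duty:
            ones += 1
        phase = (phase + step) % 1.0
    return ones / n_samples


def calibrate(duty_for_code, codes):
    """Pick the capacitor code whose sampled histogram is closest to 50% ones."""
    return min(codes, key=lambda code: abs(ones_fraction(duty_for_code(code)) - 0.5))


def example_duty(code):
    # Hypothetical mapping: with no correction the "1" bit lasts 60ps of a
    # 100ps pattern period, and each code step removes 2ps from it.
    return 0.60 - 0.02 * code


best_code = calibrate(example_duty, range(11))
```

With this invented mapping, code 5 restores a 50ps/50ps split, so `best_code` is 5.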

10GHz Analog Latch
[Figure: 10GHz analog sampler followed by a 10GHz output buffer.]
- Full pass gates provide symmetric clock injection
- Gain loss of ½ from 10Gb/s input to output

4:1 10Gb/s Mux Design
[Figure: block diagram. 8 data streams @ 2.5Gb/s. d0 to d3 enter one 4:1 mux, with clock phase labels 0, 90, 180, 270; d4 to d7 enter the other 4:1 mux, with clock phase labels 45, 135, 225, 315. Each mux produces 10Gb/s (100ps bit time) into a 10GHz latch clocked by CLK/CLKB, giving Data0_10g and Data1_10g, offset by 50ps. Signal swing label: 600mV.]
[Figure: schematic of the 4:1 output-multiplexed preamp. 250 Ohm on-chip resistor to Vdd; data/clock gating devices D0-top to D3-top and D0-bot to D3-bot; clock labels include Clk270 and Clk0.]
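The two-stage serialization (eight 2.5Gb/s lanes, two 4:1 muxes, one 2:1 output mux) can be sketched behaviorally. This is an illustration under assumptions, not the circuit: it assumes each lane is launched on the clock phase listed next to it in the mux figure (d0 to d3 on 0, 90, 180, 270 degrees and d4 to d7 on 45, 135, 225, 315 degrees) and that the 2:1 stage takes the d0 to d3 stream first. The names `mux4`, `mux2`, `serialize`, and `LANE_PHASE_DEG` are hypothetical.

```python
# Assumed launch phase (degrees of the 2.5GHz clock) for lanes d0..d7.
LANE_PHASE_DEG = {0: 0, 1: 90, 2: 180, 3: 270, 4: 45, 5: 135, 6: 225, 7: 315}


def mux4(lanes):
    """Interleave four 2.5Gb/s lanes into one 10Gb/s stream."""
    return [bit for group in zip(*lanes) for bit in group]


def mux2(stream_a, stream_b):
    """Interleave two 10Gb/s streams into one 20Gb/s stream."""
    return [bit for pair in zip(stream_a, stream_b) for bit in pair]


def serialize(lanes):
    """Serialize eight lanes through two 4:1 muxes and a 2:1 output mux."""
    if len(lanes) != 8:
        raise ValueError("expected 8 lanes")
    mid0 = mux4(lanes[0:4])   # d0..d3
    mid1 = mux4(lanes[4:8])   # d4..d7
    return mux2(mid0, mid1)


# Label each symbol so the output order is visible.
lanes = [[f"d{lane}.{t}" for t in range(2)] for lane in range(8)]
serial = serialize(lanes)
```

Under these assumptions the first eight output symbols are d0, d4, d1, d5, d2, d6, d3, d7, which is the same order obtained by sorting the lanes by their assumed clock phase.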

10GHz Clock Alignment Problem
- How do you ensure the 10Gb/s data is in phase with the 10GHz clock?
[Figure: timing diagram. Two 10Gb/s data streams (Mid0a/Mid0b carrying A, C, E; Mid1a/Mid1b carrying B, D) with 100ps bit times, against the 2-phase 10GHz clock; interval labels 25ps and 50ps. 20Gb/s output (Outa/Outb): A, B, C, D, E.]
- Static phase offset/jitter is passed to the output

Phase Adjusting FSM
[Figure: block diagram. A PLL and an interpolator under digital FSM control, each labeled with 8 multi-phases @ 2.5GHz (Clk0, Clk45, Clk90, Clk135, Clk180, Clk225, Clk270, Clk315); 8 sampler banks; 10GHz and 10GHzb clocks; two 4:1 muxes (10Gb/s) feeding 10GHz Latch A and 10GHz Latch B (10Gb/s), then the 2:1 output stage (20Gb/s).]
- Align zero crossings of the 10GHz clock and the 8 multi-phases of the 2.5GHz clock

Transmitter Outline

Phase Interpolator
- Tri-state inverters provide coarse interpolation
- Digitally switched capacitors provide fine control
- Maximum phase step = 7.3ps

10GHz LC Oscillator
- Use passive L, C elements for frequency synthesis
  - 10x less jitter/power supply sensitivity than ring oscillator VCOs
  - Significantly less static phase offset
  - Higher frequency of oscillation
- Disadvantage: area is significantly larger than conventional techniques
  - Area disadvantage is mitigated by higher frequency: inductor size reduces by a factor of 4 for a 2x increase in frequency
  - A 130um x 130um 1nH inductor deemed reasonable area per I/O
- Tuning range given by inversion-mode PMOS capacitors
- Regulated supply provides additional power supply rejection
- < 3ps pk-pk jitter (2000 cycles), with 20mV wideband Vdd noise
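The factor-of-4 claim is consistent with the standard LC resonance relation f = 1 / (2*pi*sqrt(L*C)): at a fixed tank capacitance, doubling the frequency cuts the required inductance to one quarter. The sketch below works that through for the 1nH inductor at 10GHz. It is a rough illustration: the slides do not give the tank capacitance, so the value computed here is derived rather than quoted, and the slide's statement is about inductor size, which this calculation only approximates through inductance value. Function names are hypothetical.

```python
import math


def tank_capacitance(freq_hz, inductance_h):
    """Capacitance that resonates with `inductance_h` at `freq_hz`."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


def tank_inductance(freq_hz, capacitance_f):
    """Inductance that resonates with `capacitance_f` at `freq_hz`."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * capacitance_f)


# Total tank capacitance implied by a 1nH inductor at 10GHz (derived value).
c_10ghz = tank_capacitance(10e9, 1e-9)

# Inductance needed at half the frequency with the same capacitance.
l_5ghz = tank_inductance(5e9, c_10ghz)
ratio = l_5ghz / 1e-9
```

Here `c_10ghz` comes out near 253fF, and `ratio` is 4: the same capacitance at 5GHz would need about 4nH.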

Receiver Design
- Clock recovery is done at reset time
  - Sampling clock is swept across the entire bit period at reset time
  - Bit error is measured for sampling instances, and the optimum sampling time is chosen at startup
- Periodic retraining of the receiver compensates for slowly varying timing drift
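The sweep-and-select step can be sketched as follows. The slides do not say how the optimum sampling time is picked from the measured errors; this sketch assumes one common choice, the center of the longest error-free run of phase steps, treating the bit period as circular. The function name `best_sampling_phase` and the example error counts are hypothetical.

```python
def best_sampling_phase(errors):
    """Return the phase-step index at the center of the widest error-free window.

    `errors[i]` is the bit-error count measured with the sampling clock at
    phase step i. The steps span one bit period, so the list wraps around.
    """
    n = len(errors)
    best_start, best_len = 0, 0
    for start in range(n):
        length = 0
        while length < n and errors[(start + length) % n] == 0:
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    if best_len == 0:
        # No error-free step: fall back to the step with the fewest errors.
        return min(range(n), key=lambda i: errors[i])
    return (best_start + (best_len - 1) // 2) % n


# Eye opening in the middle of the sweep.
centered = best_sampling_phase([5, 3, 0, 0, 0, 0, 0, 2, 9, 7])

# Eye opening that wraps around the end of the sweep.
wrapped = best_sampling_phase([0, 0, 4, 9, 9, 4, 0, 0])
```

In the first sweep the error-free window covers steps 2 to 6, so step 4 is chosen. In the second it covers steps 6, 7, 0, 1, so step 7 is chosen. Periodic retraining would repeat the same sweep.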

Simulated Results
[Figure: transmitter layout, with dimension labels 230um and 270um.]
[Figure: simulated 20Gb/s output, with clean supply.]

Data Rate                    20Gb/s
Process                      1.2V, 0.13um generic CMOS
Power                        200mW (transmitter & receiver) (PLL = 20mW)
Estimated Area               500um x 500um
Pk-Pk Jitter                 < 10ps, with 20mV Vdd noise
Output Swing                 100mV
Input Receiver Sensitivity   40mV
Tuning Range                 10ps (10%)

Conclusion
- A 20Gb/s CMOS I/O link has been designed
- Low power and low area enable high integration of these 20Gb/s I/O pads on a single chip

Acknowledgements
- Velio Communications
- Ramesh Senthinathan, Mark Kellam, John Poulton
- Jaeha Kim, Mark Horowitz, Niranjan Talwalkar for discussion

BW Numbers

                            1999      2000      2001      2002      2003      2004      2005
# of pins                   1600      1792      2007      2248      2518      2820      3158
I/O bw/pin                  1.92E+09  2.77E+09  3.20E+09  3.50E+09  3.70E+09  4.00E+09  4.07E+09
total I/O bw                1.54E+12  2.77E+12  3.21E+12  3.94E+12  4.66E+12  5.64E+12  6.43E+12
on-chip bw/wire             1.20E+09  1.40E+09  1.60E+09  1.72E+09  1.86E+09  2.00E+09  2.12E+09
chip size                   1.76E-02  1.76E-02  1.76E-02  1.80E-02  1.84E-02  1.89E-02  1.93E-02
minimum wiring width(16l)   1.44E-06  1.44E-06  1.04E-06  1.04E-06  1.04E-06  7.20E-07  7.20E-07
# of wires                  1.22E+04  1.22E+04  1.69E+04  1.73E+04  1.77E+04  2.63E+04  2.68E+04
Total on-chip BW            1.46E+13  1.71E+13  2.72E+13  2.98E+13  3.30E+13  5.30E+13  5.68E+13
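The on-chip rows of the table can be recomputed from each other. In the sketch below, the number of wires is taken as chip size divided by minimum wiring width, and total on-chip bandwidth as wires times bandwidth per wire; the first relation is inferred from the numbers, not stated in the table, and the units are assumed to be consistent between the two length rows. The gap is computed from the tabulated totals as given. Variable and function names are hypothetical.

```python
YEARS = [1999, 2000, 2001, 2002, 2003, 2004, 2005]
BW_PER_WIRE = [1.20e9, 1.40e9, 1.60e9, 1.72e9, 1.86e9, 2.00e9, 2.12e9]
CHIP_SIZE = [1.76e-2, 1.76e-2, 1.76e-2, 1.80e-2, 1.84e-2, 1.89e-2, 1.93e-2]
WIRE_WIDTH = [1.44e-6, 1.44e-6, 1.04e-6, 1.04e-6, 1.04e-6, 7.20e-7, 7.20e-7]
TABLE_WIRES = [1.22e4, 1.22e4, 1.69e4, 1.73e4, 1.77e4, 2.63e4, 2.68e4]
TABLE_ON_CHIP_BW = [1.46e13, 1.71e13, 2.72e13, 2.98e13, 3.30e13, 5.30e13, 5.68e13]
TABLE_IO_BW = [1.54e12, 2.77e12, 3.21e12, 3.94e12, 4.66e12, 5.64e12, 6.43e12]


def on_chip_bw(chip_size, wire_width, bw_per_wire):
    """Return (wire count, total on-chip bandwidth).

    Assumes wires = chip size / minimum wiring width, which is inferred
    from the table values rather than stated in the slides.
    """
    wires = chip_size / wire_width
    return wires, wires * bw_per_wire


computed = [on_chip_bw(s, w, b) for s, w, b in zip(CHIP_SIZE, WIRE_WIDTH, BW_PER_WIRE)]

# Absolute gap between tabulated on-chip and off-chip totals, per year.
gap = [on - off for on, off in zip(TABLE_ON_CHIP_BW, TABLE_IO_BW)]

for year, (wires, total), g in zip(YEARS, computed, gap):
    print(f"{year}: wires={wires:.3g} on-chip BW={total:.3g} gap={g:.3g}")
```

The recomputed wire counts and on-chip totals agree with the table to within about 1%, and the absolute gap grows from roughly 1.3E+13 in 1999 to roughly 5.0E+13 in 2005.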