A 10Gb/s 10mm On-Chip Serial Link in 65nm CMOS Featuring a Half-Rate Time-Based Decision Feedback Equalizer Po-Wei Chiu, Somnath Kundu, Qianying Tang, and Chris H. Kim University of Minnesota, Minneapolis, MN chiux148@umn.edu Symposia on VLSI Technology and Circuits
Outline: Background; Proposed Time-Based Decision Feedback Equalizer (DFE); 65nm Test Chip Details; BER Measurements; Conclusion. Slide 1
On-Chip Interconnect Scaling: Interconnect length is not scaling down at the same rate as transistor dimensions. Interconnect power is a larger fraction of total chip power. Slide 2
Interconnect Design Challenges: A 2cm x 2cm processor requires interconnect lengths as long as 15 mm. Signal loss due to RC parasitics limits performance. Slide 3
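To illustrate why RC parasitics become limiting at these lengths, the sketch below uses the textbook 0.38·R·C estimate for the 50% delay of an unrepeated distributed RC wire. The per-mm resistance and capacitance values are hypothetical placeholders chosen only for illustration, not figures from the talk; the point is the quadratic growth of delay with length.

```python
def distributed_rc_delay_ps(length_mm, r_ohm_per_mm=100.0, c_ff_per_mm=200.0):
    """Approximate 50% delay of an unrepeated distributed RC wire, in ps.

    Uses the textbook 0.38*R*C estimate. The default per-mm values are
    hypothetical placeholders, not measurements from the test chip.
    """
    r_total = r_ohm_per_mm * length_mm           # total wire resistance (ohm)
    c_total = c_ff_per_mm * length_mm * 1e-15    # total wire capacitance (F)
    return 0.38 * r_total * c_total * 1e12       # seconds -> picoseconds


# Both R and C grow with length, so delay grows with length squared.
for length in (5, 10, 15):
    print(f"{length} mm: {distributed_rc_delay_ps(length):.0f} ps")
```

Doubling the wire length quadruples the estimated delay, which is the scaling that repeaters or equalization are meant to counter.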
Standard Solution: Repeater. [Figure: repeater chain with stages 1, 2, ..., n-1, n; plot of latency (ns) vs. length (mm) with curves labeled "Repeater (no equalization)" and "Equalizer" for n=1, n=2, n=3, n=5. Source: [2] L. Zhang, et al., TVLSI, 2009.] Pros: Improved RC delay, digital implementation, good CAD tool support. Cons: Floorplan disruption, increased power consumption. Slide 4
TX/RX Equalization Techniques: [Figure: link diagram with TX FFE, channel, and RX CTLE + DFE; response sketches with axes voltage vs. time, gain vs. frequency, and voltage vs. time.] Feed Forward Equalizer (FFE); Continuous Time Linear Equalizer (CTLE); Decision Feedback Equalizer (DFE). Slide 5
Equalization Techniques for On-Chip Interconnects: [Figure: two transmitter schematics from prior work; labels include an active inductor and drive currents I_DRV and I_PE.] [3] D. Walter, et al., ISSCC, 2012. [4] S. Lee, et al., ISSCC, 2013. Left: Utilizes capacitors to pre-distort the signal. Right: Utilizes current mode logic to pre-distort the signal; an inductor provides frequency peaking. Slide 6
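Conventional DFE Implementation: [Figure: block diagram with input V_RX(t), summer Σ, summer output V_DFE, slicer, output x[n], and a feedback path of Z^-1 delays with weights W_1, W_2, ..., W_N; the taps are drawn as current mode logic with bias current I_BIAS.] Slide 7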
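The feedback loop in the block diagram can be sketched behaviorally as follows. This is a minimal model, not the circuit: it assumes past decisions are mapped to +/-1 before weighting and that the slicer threshold is zero. The helper `isi_channel` and its cursor values are hypothetical and exist only to produce test input.

```python
def isi_channel(bits, main=0.6, post=0.7):
    """Hypothetical channel: main cursor plus one post-cursor on +/-1 symbols."""
    symbols = [1 if b else -1 for b in bits]
    previous = [0] + symbols[:-1]
    return [main * s + post * p for s, p in zip(symbols, previous)]


def conventional_dfe(samples, weights, threshold=0.0):
    """Subtract weighted past decisions from each sample, then slice."""
    history = [0] * len(weights)   # past decisions as +/-1, most recent first
    decisions = []
    for v in samples:
        feedback = sum(w * d for w, d in zip(weights, history))
        bit = 1 if (v - feedback) > threshold else 0
        decisions.append(bit)
        # Shift the new decision into the Z^-1 chain.
        history = ([1 if bit else -1] + history)[:len(weights)]
    return decisions


bits = [1, 0, 1, 1, 0, 0, 1, 0]
rx = isi_channel(bits)
print("with 1-tap DFE:", conventional_dfe(rx, [0.7]))
print("without DFE:   ", conventional_dfe(rx, [0.0]))
```

With the post-cursor larger than the main cursor, slicing without feedback produces errors, while a single tap matched to the post-cursor recovers the transmitted bits in this toy setup.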
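Proposed Time-Based DFE: [Figure: block diagram with input V_RX(t), time-domain summer Σ_t, phase detector (PD), output x[n], and a feedback path of Z^-1 delays with weights W_1, W_2, ..., W_N; the weights are applied to CLK along a delay line.] PD = Phase Detector. Slide 8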
DFE Comparison:
Type | Pros | Cons
Conventional DFE | Ultra high speed (>20Gb/s) | Analog intensive, headroom issues, limited number of taps, large power consumption
Time-based DFE | Inverter based, scalable to large number of taps, low power consumption | Moderate speed
Slide 9
Time-Based DFE Operation: [Figure: timing diagram showing input V_RX(t), reference node N_REF (without DFE), and node N_DFE (with DFE) shifted by tap delays W_1 and W_2; decision results: 0 1 0 0.] Slide 10
Time-Based DFE Operation: [Figure: input V_RX(t) with the resulting timing without DFE and with DFE, for the cases D=1 and D=0.] Slide 11
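The same decision loop can be modeled in the time domain: the received level sets a delay, past decisions add or remove delay, and a phase detector compares the resulting edge against a reference edge. The sketch below is behavioral only and rests on several assumptions that are not stated in the slides: a linear voltage-to-delay mapping, the polarity "larger input gives shorter delay", the polarity "earlier than reference decodes as 1", and all numeric values. The helper `isi_samples` is hypothetical test input.

```python
def isi_samples(bits, main=0.6, post=0.7):
    """Hypothetical received levels: main cursor plus one post-cursor."""
    symbols = [1 if b else -1 for b in bits]
    previous = [0] + symbols[:-1]
    return [main * s + post * p for s, p in zip(symbols, previous)]


def time_based_dfe(samples, tap_delays_ps, t_ref_ps=10.0,
                   t0_ps=10.0, gain_ps_per_unit=2.0):
    """Decide bits by comparing an equalized edge time with a reference."""
    history = [0] * len(tap_delays_ps)   # past decisions as +/-1
    decisions = []
    for v in samples:
        # Assumed mapping: a larger input level shortens the stage delay.
        t_rx = t0_ps - gain_ps_per_unit * v
        # Each tap shifts the edge by +/- its delay weight.
        t_fb = sum(w * d for w, d in zip(tap_delays_ps, history))
        # Phase detector: edge earlier than the reference decodes as 1 (assumed).
        bit = 1 if (t_rx + t_fb) < t_ref_ps else 0
        decisions.append(bit)
        history = ([1 if bit else -1] + history)[:len(tap_delays_ps)]
    return decisions


bits = [1, 0, 1, 1, 0, 0, 1, 0]
rx = isi_samples(bits)
print("with 1-tap time-based DFE:", time_based_dfe(rx, [1.4]))
print("without DFE:              ", time_based_dfe(rx, []))
```

The tap delay of 1.4 ps is simply the post-cursor (0.7) times the assumed gain (2 ps per unit), so the feedback cancels the ISI-induced time shift. A real delay line cannot produce negative delay, so a physical implementation would carry a common offset on both paths.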
Optimizing Time-Based DFE: [Figure: delay stages T_1, T_2, T_3 on one PD input and three T_REF stages on the other; the PD output is x[n].] Slide 12
Optimizing Time-Based DFE: [Figure: delay stages T_1 and T_2 on one PD input; T_REF and 2T_REF - T_3 on the other; the PD output is x[n].] Performs the same time-based comparison with fewer delay stages, giving low power consumption. All delay stages are identical. Slide 13
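One way to read the stage reduction is as a rearrangement of the comparison made by the PD. Assuming the unoptimized structure compares T_1 + T_2 + T_3 against three T_REF stages (an interpretation of the diagram labels, not a statement from the slides), moving T_3 to the reference side leaves the comparison unchanged:

```latex
T_1 + T_2 + T_3 \;\lessgtr\; 3\,T_{REF}
\quad\Longleftrightarrow\quad
T_1 + T_2 \;\lessgtr\; T_{REF} + \left(2\,T_{REF} - T_3\right)
```

Under this reading, both sides of the comparison are shorter than before while the PD sees the same timing difference.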
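Delay Stage Implementation: [Figure: 2-tap time-based DFE (V_RX(t), Σ_t, PD, x[n], Z^-1 delays with weights W_1 and W_2) mapped onto delay stages: an analog-controlled stage T_RX, digitally controlled stages T_W1X1 and T_-W2X2, and a reference T_REF feeding the PD.] Slide 14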
Delay Stage w/ Analog Control: [Figure: delay stage schematic with devices sized 1X, 2X, and 4X driven by V_RX; plot of delay (ps, 0 to 15) vs. V_RX (V, 0 to 1.2), 65 nm GP CMOS, 25 °C.] Delay controlled by analog signal. Slide 15
Delay Stage w/ Digital Control: [Figure: delay stage schematic with elements sized 1X, 2X, and 4X controlled by D and w<5:0>; plot of delay (ps, 0 to 8) vs. code w<5:0> (0 to 60) for D=0 and D=1, 65nm GP CMOS, 25 °C.] Inverters for coarse control, capacitors for fine control. Capacitor connected to VDD for wider delay range. Slide 16
65nm Test Chip Diagram: TX: Half-rate (i.e., F_clk = 5 GHz for 10 Gbps) FFE. RX: Transimpedance amplifier (TIA), time-based DFE. Testing features: 15-bit PRBS, in-situ eye-diagram monitor. Slide 17
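As an illustration of the test pattern and the half-rate split, the sketch below generates a 15-bit PRBS and divides it into two streams at half the bit rate. The slides do not give the generator polynomial; x^15 + x^14 + 1 (the common PRBS15 choice) is assumed here, as is the even/odd assignment of bits to the two half-rate paths.

```python
from itertools import islice


def prbs15(seed=0x7FFF):
    """Yield bits from a 15-bit Fibonacci LFSR (assumed taps: bits 15 and 14)."""
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | new_bit) & 0x7FFF
        yield new_bit


def half_rate_split(bits):
    """Split a full-rate stream into two half-rate streams (assumed even/odd)."""
    return bits[0::2], bits[1::2]


PERIOD = 2**15 - 1
pattern = list(islice(prbs15(), PERIOD))
even, odd = half_rate_split(pattern[:16])
print("sequence length:", len(pattern))
print("even-path bits:", even)
print("odd-path bits: ", odd)
```

At 10 Gbps each half-rate stream carries 5 Gb/s, which is why a 5 GHz clock suffices.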
Time-Based DFE Implementation: [Figure: half-rate schematic with two paths clocked by CLK (5GHz), producing D[n] and D[n+1]. Each path shows the TIA output V_RX(t), delays T_RX, T_W1X1, T_-W2X2, T_REF, and T_o, a zero-offset aperture PD with reset, and flip-flops (FF); the -W_2 elements are sized 1X, 2X, 4X, 8X.] Transmission gate resistor for impedance matching. Half rate operation. Zero-offset aperture phase detector (PD). Slide 18
Eye-Diagram Measurement: Typical BER eye diagram ([4] S. Lee, et al., ISSCC, 2013): Y-axis is voltage offset, X-axis is unit interval. Slide 19
Eye-Diagram Measurement: Typical BER eye diagram ([4] S. Lee, et al., ISSCC, 2013): Y-axis is voltage offset, X-axis is unit interval. TB-DFE BER eye diagram: Y-axis is time offset, X-axis is unit interval. Slide 20
In-situ BER Eye-Diagram Monitor: [Figure: CLK passes through a programmable phase delay; the T_REF and T_RX DFE paths feed the PD; a BER monitor compares D with D' from a 2^15-1 PRBS and drives an 11b error counter with parallel-to-serial error-count readout.] 6-bit programmable delay to sweep the X,Y-axes. BER monitor compares TX data with RX data. Slide 21
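The monitor's behavior can be sketched as an error counter plus a two-dimensional sweep of the delay codes. This is a minimal model under stated assumptions: the 11-bit counter is assumed to saturate rather than wrap, and `capture(x_code, y_code)` is a hypothetical callback standing in for whatever returns the TX and RX bit streams at a given pair of 6-bit delay settings.

```python
def count_errors(tx_bits, rx_bits, counter_bits=11):
    """Count mismatches between TX and RX data, saturating at 2^bits - 1."""
    limit = (1 << counter_bits) - 1      # assumed saturating counter
    errors = 0
    for t, r in zip(tx_bits, rx_bits):
        if t != r and errors < limit:
            errors += 1
    return errors


def sweep_eye(capture, n_codes=64):
    """Build an error-count map over all (x, y) delay codes.

    capture(x_code, y_code) is a hypothetical callback returning
    (tx_bits, rx_bits) for one setting; 64 codes matches a 6-bit control.
    """
    return [[count_errors(*capture(x, y)) for x in range(n_codes)]
            for y in range(n_codes)]


print(count_errors([0, 1, 1, 0], [0, 1, 0, 0]))
```

Dividing each error count by the number of bits compared gives the BER at that point; cells with zero errors outline the eye opening.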
Measured BER Eye Diagram: Y-axis is time offset, not voltage offset. Slide 22
Measured BER Bathtub: [Figure: bit error rate (10^-3 to 10^-12) vs. phase (UI, -1.2 to 1.2), without and with DFE; 65nm GP, 1.2V, 25 °C; 0.43 UI opening marked.] Eye width = 0.43 UI @ BER = 10^-12 with DFE. Slide 23
Die Photo and Feature Summary Slide 24
Performance Comparison:
Metric | ISSCC'09 [5] | ISSCC'12 [6] | ISSCC'13 [7] | VLSI'15 [8] | This work
Technology | 90nm | 65nm | 65nm | 65nm | 65nm
TX and RX | Current mode driver + TIA | Voltage mode driver + sense amp. | Current mode driver + sense amp. | CTLE-based repeater | Voltage mode driver + TIA
Features | No DFE | No DFE | No DFE | No DFE | 2-tap TB-DFE
Data Rate | 4Gb/s | 10Gb/s | 3Gb/s | 4Gb/s | 10Gb/s
Throughput (Gb/s/µm) | 2 | 2.56 | 0.75 | 4 | 2
Link Length | 10mm | 6mm | 10mm | 2.5mm+2.5mm | 10mm
BER Bathtub | < 10E-6 | < 10E-12 | < 10E-12 | < 10E-12 | < 10E-12
BER Eye | Yes (< 10E-6) | No | Yes (< 10E-12) | No | Yes (< 10E-11)
Energy Efficiency (fJ/b/mm) | 35.6 | 174 | 9.5 | 48.4 | TIA 14.4, DFE 30.9, FFE 31.9
[5] B. Kim, et al., ISSCC, 2009. [6] D. Walter, et al., ISSCC, 2012. [7] S. Lee, et al., ISSCC, 2013. [8] M. Chen et al., VLSI, 2015.
Slide 25
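For a rough sense of scale, the per-block energy figures for this work can be converted into link power. The sketch assumes the TIA, DFE, and FFE entries simply add to a link total and that the link runs at the full 10 Gb/s over 10 mm; neither the sum nor the resulting power is a number stated in the slides.

```python
# Per-block energy efficiency for this work, in fJ/b/mm (from the table).
ENERGY_FJ_PER_BIT_MM = {"TIA": 14.4, "DFE": 30.9, "FFE": 31.9}


def total_energy_fj_per_bit_mm(components):
    """Sum the per-block figures (assumes the blocks are additive)."""
    return sum(components.values())


def link_power_mw(energy_fj_per_bit_mm, length_mm, data_rate_gbps):
    """Convert fJ/b/mm into mW for a given length and data rate.

    fJ/b/mm * mm * Gb/s = 1e-15 J/b * 1e9 b/s = 1e-6 W, i.e. microwatts.
    """
    power_uw = energy_fj_per_bit_mm * length_mm * data_rate_gbps
    return power_uw / 1000.0


total = total_energy_fj_per_bit_mm(ENERGY_FJ_PER_BIT_MM)
print(f"total: {total:.1f} fJ/b/mm")
print(f"power: {link_power_mw(total, 10, 10):.2f} mW")
```

The same conversion can be applied to the other columns for a like-for-like power comparison, using each design's own length and data rate.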
Conclusion: A 10mm, 10 Gbps on-chip serial link implemented in 65nm GP. An all-digital, inverter-based, 2-tap half-rate time-based DFE. In-situ BER eye diagram monitor. Slide 26