A 24Gb/s Software Programmable Multi-Channel Transmitter
A. Amirkhany (1), A. Abbasfar (2), J. Savoj (2), M. Jeeradit (2), B. Garlepp (2), V. Stojanovic (2,3), M. Horowitz (1,2)
(1) Stanford University, (2) Rambus Inc, (3) Massachusetts Institute of Technology
24Gb/s Transmitter
  [Figure: transmitter with FPGA interface]
  A test instrument for verifying different transmission algorithms
  Multiple operation modes:
    2-channel or 4-channel Analog Multi-Tone (AMT)
    2PAM, 4PAM, 8PAM baseband
  Software programmable
High-Speed Electrical Links
  [Figure: link examples: Chip A to Chip B on a PCB; daughter cards on a backplane (network routers); memory controller to DRAM (memory cards); CPU/controller to DRAM; CPU to GPU]
State of the Art Links
  [Figure: Tx with equalizer taps W1 to W4 and a line driver; Rx with feedback taps Wb1 to Wbk]
  Baseband 2PAM or 4PAM
  4-5 tap discrete linear transmit equalizer
  5-20 tap decision feedback equalizer (DFE)
Channel Characteristics in Links
  [Figure: frequency response (dB) versus frequency (0 to 20 GHz) for chip-to-chip, multi-drop (memory), and backplane channels]
  Notches are caused by reflections from impedance discontinuities, e.g. vias, stubs, package, parasitic capacitance
  Multi-tone signaling can improve performance
A Practical AMT Architecture
  [Figure: transmitter inputs X_0 to X_(N-1), each up-sampled by N and passed through its own equalizer; channel; receiver with one integrator per sub-channel (outputs Z_0 to Z_(N-1)) followed by a MIMO DFE]
  Small number of sub-channels (N): 2, 3, or 4 in most cases
  N-times over-sampled equalizer per sub-channel at the transmitter
  Multi-Input Multi-Output (MIMO) DFE at the receiver
  AMT is a generalization of a baseband system
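The transmit side of this architecture can be sketched in a few lines. This is a minimal numerical model, not the chip's datapath: the function name `amt_transmit`, the array layout, and the use of floating-point taps are all assumptions made for illustration. Each sub-channel is up-sampled by N (zeros inserted between symbols), filtered by its own tap vector, and the branches are summed.

```python
import numpy as np

def amt_transmit(symbols, filters):
    """Sum of N sub-channels, each up-sampled by N and filtered by its own taps.

    symbols: shape (N, M), one row of M symbols per sub-channel (hypothetical layout).
    filters: shape (N, K), one row of K sample-rate taps per sub-channel.
    """
    symbols = np.asarray(symbols, dtype=float)
    filters = np.asarray(filters, dtype=float)
    n_sub, n_sym = symbols.shape
    out = np.zeros(n_sub * n_sym + filters.shape[1] - 1)
    for x, w in zip(symbols, filters):
        up = np.zeros(n_sub * n_sym)
        up[::n_sub] = x            # insert N-1 zeros between consecutive symbols
        out += np.convolve(up, w)  # per-sub-channel over-sampled equalizer
    return out
```

With N = 1 the function reduces to an ordinary baseband FIR filter, which is one way to read the statement that AMT generalizes a baseband system.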
Two-Channel Example
  [Figure: two-channel waveforms versus time, with the sampling points marked]
  Interference is zero at the sampling points
  Called a trans-multiplexer
Evolution of a Baseband Tx Equalizer
  4-tap BB transmitter: symbols x0, x1, x2, x3; taps w0 w1 w2 w3
  2-way parallelize: two branches, each with taps (w0 w1 w2 w3 0)
  Shift x to the left, shift W to the right: branch taps become (w0 w1 w2 w3 0) and (0 w0 w1 w2 w3)
  Represent as over-sampled equalizer: each branch is up-sampled by 2 and filtered with (w0 w1 w2 w3 0) or (0 w0 w1 w2 w3)
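The equivalence on this slide can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the symbol count is assumed even, and floating-point taps stand in for the hardware's fixed-point values. Even and odd symbols are split into two branches, each branch is up-sampled by 2, and the branches use the tap vectors (w0 w1 w2 w3 0) and (0 w0 w1 w2 w3).

```python
import numpy as np

def baseband_fir(x, w):
    """Reference: ordinary 4-tap baseband transmit equalizer."""
    return np.convolve(x, w)

def upsample(x, n):
    """Insert n-1 zeros after every sample."""
    up = np.zeros(len(x) * n)
    up[::n] = x
    return up

def two_way_oversampled(x, w):
    """Same filter written as two 2x over-sampled branches (even-length x assumed)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w_even = np.append(w, 0.0)     # (w0 w1 w2 w3 0)
    w_odd = np.insert(w, 0, 0.0)   # (0 w0 w1 w2 w3): one-sample delay for odd symbols
    y = (np.convolve(upsample(x[0::2], 2), w_even)
         + np.convolve(upsample(x[1::2], 2), w_odd))
    return y[: len(x) + len(w) - 1]  # drop the trailing zero sample
```

In this form the two branches carry identical taps up to a one-sample shift; letting each branch have its own independent taps is what turns the structure into a 2-channel AMT transmitter.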
AMT is a Generalization of Baseband
  [Figure: 4-tap baseband (2-way parallelized) next to 2-channel AMT with 4 taps per channel]
  AMT has more degrees of freedom
  Better capable of shaping the transmit spectrum
  MIMO DFE is also a generalization of a BB DFE
Software Programmable Transmitter
  Equivalent functionality: 16-tap FIR filter at 12GHz
  2-bit inputs (4PAM) and 10-bit taps
Measured Eye Diagrams
  Baseband mode:
    2PAM, un-equalized, 12Gb/s
    2PAM, equalized, 12Gb/s
    4PAM, equalized, 24Gb/s
  AMT mode:
    4-channel AMT (Ch1 to Ch4), equalized and post-processed, 18Gb/s
  On an oscilloscope
  Rx implemented in Matlab
12GS/s Digital to Analog Converter
  2-way output-multiplexed current-mode DAC
  Termination supply 1.8V
  Unused current dumped to 1.0V to save power
  1.8V pp output swing
  Savoj, et al, 12GS/s Phase Calibrated CMOS DAC, Companion paper, Session 7
Digital Equalizer Datapath (One Phase)
  [Figure: datapath built from 4x1 muxes, 4:2 compressors (1st, 2nd, and 3rd stages), an adder, a thermometer encoder, and flip-flops]
  Multiply 16 2-bit numbers by 16 10-bit numbers
  Multiplication using 4:1 multiplexers; W and 3W stored in flops
  Add results using 4:2 compressor units
  2-way parallelized to operate with a 1.5GHz clock
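The multiplier-free product described on this slide can be illustrated with a small behavioral model. Several details here are assumptions rather than facts from the slides: the mapping of 2-bit codes 0..3 to the 4PAM levels -3, -1, +1, +3, the idea that the four mux inputs are -3W, -W, +W, and +3W, and all function names. The final sum stands in for the 4:2 compressor tree, whose internal structure is not modeled.

```python
LEVELS = (-3, -1, +1, +3)  # assumed mapping of 2-bit codes 0..3 to 4PAM levels

def mux_select(code, w, w3):
    """Model of a 4:1 mux choosing -3W, -W, +W or +3W from the stored W and 3W."""
    return (-w3, -w, w, w3)[code]

def equalizer_sample(codes, taps):
    """One output sample: sum of (4PAM level * tap) with no multiplier in the loop."""
    triples = [3 * w for w in taps]  # 3W is computed once, when the taps are programmed
    return sum(mux_select(c, w, w3) for c, w, w3 in zip(codes, taps, triples))
```

For example, `equalizer_sample([0, 3, 1, 2], [100, -20, 7, 511])` selects -300, -60, -7, and +511, the same terms a conventional multiply would produce.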
Equalizer Floorplan
  [Figure: floorplan with axes in μm (about 0 to 900 horizontal, 0 to 450 vertical) showing the Phase 1 to Phase 4 blocks, input pins, and output pins]
Complete Equalizer with Routing
  Post-route layout in SOC Encounter
Transmitter Clocking
  Phase interpolator (PI) between DAC and equalizer, programmed offline
  Mesh 1.5GHz clock distribution in the equalizer
  Pattern generator clock branches off from the equalizer grid
  Part of the clock distribution latency is in the critical path
Performance Summary
  [Figure: chip micrograph]
  Measured transmitter performance:
    Process            90nm CMOS
    Maximum rate       29Gb/s
    Digital power      350mW
    Analog power       160mW
    Area               0.8mm^2
    Output swing       1.6V pp
    Power efficiency   21mW/Gbps
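The efficiency figure can be related to the power entries with a short calculation. The slide does not say which data rate the 21mW/Gbps value refers to; the sketch below assumes total power (digital plus analog) divided by the 24Gb/s rate from the talk title, and shows the result for the listed maximum rate as well. The variable and function names are illustrative.

```python
digital_mw = 350.0
analog_mw = 160.0
total_mw = digital_mw + analog_mw  # 510 mW

def mw_per_gbps(power_mw, rate_gbps):
    """Power efficiency in mW per Gb/s."""
    return power_mw / rate_gbps

eff_at_24 = mw_per_gbps(total_mw, 24.0)  # 21.25, close to the quoted 21mW/Gbps
eff_at_29 = mw_per_gbps(total_mw, 29.0)  # about 17.6, for the listed maximum rate
```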
Multi-Tone Operation
  [Figure: Tx and Rx in a multi-drop configuration, C_i = 1pF; frequency response (dB) versus frequency (0 to 10 GHz)]
  Measured 3-channel AMT, 9Gb/s
Multi-PAM Operation
  [Figure: Tx configuration in 8PAM/16PAM mode, with 2PAM/4PAM symbols as inputs]
  Y (4PAM) = X1 (2PAM) + 2 X2 (2PAM), giving levels +3, +1, -1, -3
  X1 is filtered with taps w0 w1 w2 w3
  X2 is filtered with taps 2w0 2w1 2w2 2w3
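The relation Y = X1 + 2 X2 can be verified with a small model: filtering two 2PAM streams with taps w and 2w and summing gives the same waveform as filtering the combined 4PAM stream with w. The function name `multi_pam_tx` and the floating-point taps are assumptions for illustration.

```python
import numpy as np

def multi_pam_tx(x1, x2, w):
    """Two equalizer branches: X1 with taps w, X2 with taps 2w, outputs summed."""
    w = np.asarray(w, dtype=float)
    return np.convolve(x1, w) + np.convolve(x2, 2.0 * w)
```

Because convolution is linear, the same trick presumably extends to higher-order PAM by feeding multi-level symbols into the branches and scaling the second branch's taps accordingly.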
Fractional Equalization
  Measured 8PAM baseband, 18Gb/s
Cyclically Time-Variant Equalization
  [Figure: four equalizer phases multiplexed by 3GHz I and Q clocks into two 6GS/s DACs, then combined at 6GHz into the 12GS/s DAC output]
  4 different paths to output, 4 different responses
  Time-invariant equalization: SIDR = 26dB at 28Gb/s
  Time-variant equalization: SIDR = 31dB at 28Gb/s
  A. Amirkhany, et al, Time-Variant Characterization and Compensation of Wideband Circuits, CICC 2007
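A cyclically time-variant equalizer can be modeled as an FIR filter whose tap set depends on the output sample index modulo the number of phases. The sketch below is a behavioral illustration only; the function name, the array layout, and the floating-point arithmetic are assumptions, and it does not model the compensation method of the cited CICC paper.

```python
import numpy as np

def cyclic_fir(x, taps_per_phase):
    """FIR filter that uses tap set (n mod P) for output sample n.

    taps_per_phase: shape (P, K), one row of K taps per phase (hypothetical layout).
    Returns the first len(x) output samples.
    """
    x = np.asarray(x, dtype=float)
    taps = np.asarray(taps_per_phase, dtype=float)
    n_phase, n_tap = taps.shape
    padded = np.concatenate([np.zeros(n_tap - 1), x])  # zeros stand in for x[n] with n < 0
    y = np.empty(len(x))
    for n in range(len(x)):
        window = padded[n:n + n_tap][::-1]  # x[n], x[n-1], ..., x[n-K+1]
        y[n] = taps[n % n_phase] @ window
    return y
```

When all rows of `taps_per_phase` are identical the filter is time-invariant; giving each of the 4 phases its own row lets each output path be corrected for its own response.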
Conclusions
  A 4-way parallelized equalizer with each parallel branch programmed independently supports:
    Analog Multi-Tone
    Multi-level baseband
    Fractional (over-sampled) equalization
    Cyclically time-variant equalization
  Power overhead due to digital implementation instead of pseudo-DAC
  Area overhead for storing more tap coefficients
Digital Implementation Overhead
  A 4-tap 2PAM 6Gbps Tx, compared to a pseudo-DAC implementation
  [Figure: four 8-bit 2:1 MUXes (selecting +w or -w) feed a 4x8 compressor that adds four 8-bit numbers, followed by an 8-bit adder, to a 7-bit DAC]
    Block                 Power     Area
    8-bit 2:1 MUX (x4)    0.5mW     960um^2
    Compressor (4x8)      10.3mW    16,000um^2
    8-bit adder           5.0mW     8,000um^2
  Power includes clock power inside flops
  Total power overhead = 16.0mW (2.6mW/Gbps)
  Total area overhead = 25,000um^2
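The totals on this slide follow from adding the three per-block entries. The short check below assumes that the three power and area values map, in order, to the MUX, compressor, and adder blocks; that pairing is inferred from the slide layout, and the dictionary keys are illustrative names.

```python
# (power in mW, area in um^2) per block; block-to-value pairing is an assumption
blocks = {
    "mux_x4": (0.5, 960),
    "compressor": (10.3, 16000),
    "adder": (5.0, 8000),
}

total_power_mw = sum(power for power, _ in blocks.values())  # 15.8, quoted as 16.0mW
total_area_um2 = sum(area for _, area in blocks.values())    # 24,960, quoted as 25,000um^2
power_per_gbps = total_power_mw / 6.0                        # about 2.6 mW/Gbps at 6Gbps
```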