IEEE Proof Web Version


JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 30, NO. 4, FEBRUARY 15, 2012

Transmitter Predistortion for Simultaneous Improvements in Bit Rate, Sensitivity, Jitter, and Power Efficiency in 20 Gb/s CMOS-Driven VCSEL Links

Alexander V. Rylyakov, Clint L. Schow, Senior Member, IEEE, Benjamin G. Lee, Member, IEEE, Fuad E. Doany, Christian Baks, and Jeffrey A. Kash, Fellow, IEEE

Abstract: The effect of applying feed-forward equalization (FFE) on the transmitter side is studied for three different full optical links. In contrast to all previous work, the FFE settings are optimized for the complete link rather than for the vertical-cavity surface-emitting laser (VCSEL) output alone. The approach results in dramatic improvements in total link performance: 6 dB in sensitivity, 3× in timing margin, and 2× in power efficiency at 15 Gb/s, and a record 5.7 pJ/bit at 20 Gb/s.

Index Terms: CMOS analog integrated circuits, driver circuits, equalization, feed-forward equalization (FFE), optical communication, optical receivers, optoelectronic devices, photodetectors, photodiodes (PDs), pre-emphasis, semiconductor lasers.

Manuscript received July 29, 2011; revised September 27, 2011; accepted September 28, 2011. This work was supported by the Defense Advanced Research Projects Agency under Contract MDA972-03-3-0004.
A. V. Rylyakov, C. L. Schow, F. E. Doany, and C. W. Baks are with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: sasha@us.ibm.com; cschow@us.ibm.com; doany@us.ibm.com; cbaks@us.ibm.com).
B. G. Lee is with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA, and also with Columbia University, New York, NY 10027 USA (e-mail: bglee@us.ibm.com).
J. A. Kash is with Columbia University, New York, NY 10027 USA (e-mail: jeffkash@us.ibm.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JLT.2011.2171917

I. INTRODUCTION

Multimode vertical-cavity surface-emitting laser (VCSEL) transceivers dominate the short-reach optical market, from serial Ethernet and Fibre Channel transceivers to parallel transceivers and active optical cables for high-performance computing and switch/routers. VCSEL-based interconnects offer many advantages, including 1) low device fabrication cost with wafer-scale burn-in and testing, 2) inexpensive assembly using injection-molded plastic optical systems with wide alignment tolerances, and 3) simple driver design due to low VCSEL thresholds and reasonable impedances. However, as data rates move to 20 Gb/s and higher, challenges emerge. Single VCSELs have been demonstrated at 30–40 Gb/s at several wavelengths [1], [2], but yielding adequate bandwidth in a reliable production device may prove difficult. One of the most arduous challenges to achieving high data rates in multimode links is the receiver. High data rates are more easily achieved with small photodiodes (PDs), but as devices shrink, assembly costs rise because alignment tolerances tighten. Achieving sufficient bandwidth through the receiver (RX) front end and limiting amplifier (LA) will be challenging, especially for CMOS designs optimized for power efficiency and compatibility with reasonably sized PDs (25–40 µm diameter).

Unlike a typical short-reach electrical link (backplane or chip-to-chip), the multimode fiber (MMF) does not introduce excessive loss or dispersion, even over distances on the order of a hundred meters. As a result, short-reach optical interconnects feature nearly ideal channel performance with little or no need for equalization. From a purely electrical point of view, however, the full end-to-end communication channel is not limited to the MMF alone. Electrical data need to be converted into optical form, received, and amplified. It is natural, therefore, to broaden the definition of the channel to include the VCSEL, the PD, and the RX front end, as illustrated in Fig. 1. When this broadly defined channel is considered at speeds of 20 Gb/s and above, its characteristics are far from ideal, and communication over it can definitely benefit from equalization.

Fig. 1. Optical link block diagram. The definition of the communication channel is generalized to include VCSEL, MMF, PD, and the RX front end.

Feed-forward equalization (FFE) of VCSEL-based optical transmitters is a well-known technique [3]–[14]. None of the previously reported results, however, has used the full power of equalization; they focus exclusively on boosting the VCSEL bandwidth. It is important to note that for high-speed operation, the VCSEL is typically biased well into the linear mode of operation (far above threshold). The FFE-shaped transmit signal applied through the VCSEL can, therefore, be used to speed up both the transmitter (TX) and the receiver, improving the overall link performance.

0733-8724/$26.00 © 2011 IEEE

In this paper, we study the effect of TX FFE for three different receivers and observe significant improvements in overall performance for all three optical links. The TX FFE is tunable over a wide range, and its settings are independently optimized for each full link, targeting the total end-to-end link performance metrics. The corresponding VCSEL optical outputs are recorded, but only to document the internal signals in the channel. At channel-optimized TX FFE settings, the VCSEL optical output may be overequalized and appear to deviate from its optimum point, but the final electrical data on the RX side show a significant benefit from the predistorted optical waveform generated by the TX. Applying TX FFE to the full link produces results that are hard or impossible to obtain by optimizing the VCSEL output alone.

This paper is organized as follows. Section II gives an overview of VCSEL equalization techniques. Section III describes the FFE-enabled VCSEL driver used in all experiments. Sections IV, V, and VI present the three different receivers, with results for all three corresponding full optical links. The results are summarized in the conclusion in Section VII.

II. OVERVIEW OF VCSEL EQUALIZATION TECHNIQUES

A number of methods for shaping the driver current waveform in order to speed up the VCSEL optical response have been reported in the literature [3]–[14]. Several relatively complicated equalization schemes have been proposed to exactly cancel the VCSEL dynamics [3], [4]. Although interesting from a theoretical point of view, the desired current waveforms are hard to implement, especially at high data rates. For many practical purposes, however, the VCSEL can be modeled as a low-pass filter, so a natural equalization technique is to create a driver with a high-pass characteristic [5]–[7].
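As a rough illustration of the low-pass/high-pass argument, the sketch below models the VCSEL as a single-pole low-pass filter and the driver as a first-order peaking stage with one zero and one pole. The single-pole model, the function names, and the corner frequencies (a 10 GHz VCSEL pole and a 25 GHz driver pole) are assumptions chosen for illustration; they are not measured values from this work.

```python
import math

def lowpass(f_ghz, fc_ghz):
    """Single-pole low-pass response, used here as a stand-in for the VCSEL."""
    return 1.0 / complex(1.0, f_ghz / fc_ghz)

def peaking_driver(f_ghz, fz_ghz, fp_ghz):
    """First-order high-pass (peaking) driver: one zero at fz, one pole at fp."""
    return complex(1.0, f_ghz / fz_ghz) / complex(1.0, f_ghz / fp_ghz)

def bandwidth_3db(response, f_max_ghz=200.0, tol=1e-6):
    """Find the -3 dB frequency by bisection (assumes a monotonic roll-off)."""
    target = abs(response(0.0)) / math.sqrt(2.0)
    lo, hi = 0.0, f_max_ghz
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if abs(response(mid)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

FC_VCSEL = 10.0   # hypothetical VCSEL pole, GHz
FP_DRIVER = 25.0  # hypothetical driver pole, GHz

# VCSEL alone, then driver zero placed on the VCSEL pole.
bw_plain = bandwidth_3db(lambda f: lowpass(f, FC_VCSEL))
bw_equalized = bandwidth_3db(
    lambda f: peaking_driver(f, FC_VCSEL, FP_DRIVER) * lowpass(f, FC_VCSEL)
)
print(f"VCSEL alone:    {bw_plain:.2f} GHz")
print(f"driver + VCSEL: {bw_equalized:.2f} GHz")
```

In this idealized model, placing the driver zero on the VCSEL pole cancels it, so the combined bandwidth moves out to the driver pole. A real VCSEL has more complicated dynamics, which is one reason a tunable equalizer is attractive.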
The combined driver-plus-VCSEL system would then become more broadband, enabling data transfer at rates well above the VCSEL bandwidth. A peaked current waveform can be created by simply adding a speed-up inductor to the driver [8], [9]. While this approach was shown to work, it has a number of disadvantages. It introduces significant area overhead and, more importantly, the parameters of the inductor-based high-pass filter are not tunable, making it harder to match the VCSEL characteristic exactly. In a digital system, the high-pass driver can be realized with a latch-based FFE [10]. One advantage of a digital FFE is that the filter tap spacing automatically tracks the bit interval. This feature, however, is only important for filters with a large number of taps. The power and area overheads associated with clock distribution and latching of high-speed data make a digital FFE system far less attractive than a continuous-time FFE [7], [11], where the delay between the filter taps is created by a tunable delay line.

Regardless of the particular equalization scheme, all previously reported results focused on using FFE to optimize the optical output of the VCSEL. Some equalization efforts [12], [13] specifically addressed the rise/fall time asymmetry of the VCSEL, and [14] additionally included equalization of modal dispersion in a long MMF link. None of these works studied the benefits of transmitter-side equalization on a full optical link.

III. FFE-ENABLED TRANSMITTER

Fig. 2. Block diagram of FFE-enabled VCSEL driver with waveforms.

Fig. 3. FFE TX CMOS chip image with measured electrical outputs at 10 and 20 Gb/s.

The block diagram of the VCSEL driver with one-tap continuous-time FFE is shown in Fig. 2. This is the transmitter that was used in all full-link experiments described in this paper. The driver consists of a five-stage Cherry-Hooper preamplifier (PA) followed by a current-mode logic (CML) output stage that implements the feed-forward equalizer.
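A one-tap FFE of this kind forms its output by subtracting a delayed, scaled copy of the signal from the signal itself. The discrete-time sketch below illustrates that operation on an NRZ sample stream. The oversampling factor of 8 samples per bit, the tap weight of 0.25, the 4-sample delay, and the function names are assumptions for illustration only; the actual circuit is continuous-time and is tuned through analog bias voltages.

```python
def nrz_waveform(bits, samples_per_bit):
    """Expand a bit sequence into a +1/-1 NRZ sample stream."""
    return [1.0 if b else -1.0 for b in bits for _ in range(samples_per_bit)]

def one_tap_ffe(signal, tap_weight, delay_samples):
    """Subtract a delayed, scaled copy of the signal from the signal itself.

    Samples before the start of the record are assumed to equal the first
    sample, so the output starts in steady state.
    """
    out = []
    for n, x in enumerate(signal):
        delayed = signal[n - delay_samples] if n >= delay_samples else signal[0]
        out.append(x - tap_weight * delayed)
    return out

bits = [0, 0, 1, 1, 1, 0, 1, 0, 0]
SAMPLES_PER_BIT = 8   # hypothetical oversampling factor
TAP_WEIGHT = 0.25     # hypothetical tap weight (sets the pulse height)
DELAY = 4             # hypothetical tap delay in samples (sets the pulse width)

tx = one_tap_ffe(nrz_waveform(bits, SAMPLES_PER_BIT), TAP_WEIGHT, DELAY)
steady = 1.0 - TAP_WEIGHT   # level once the delayed copy has caught up
peak = 1.0 + TAP_WEIGHT     # level just after a transition
print(f"steady-state level: {steady:.2f}, peak after an edge: {peak:.2f}")
print("samples around the first rising edge:", tx[14:24])
```

Each data transition produces an overshoot or undershoot that lasts for the tap delay and then settles to the reduced steady-state level, so the tap weight sets the height of the equalization pulse and the delay sets its width.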
The FFE is powered by a single supply, VDD_OS, and operates by tapping a portion of the PA output signal, delaying and buffering it, and then subtracting it from the main portion of the PA signal. Although not explicitly shown in Fig. 2, the transmitter is a fully differential design. Fig. 2 also includes a graphical illustration of the FFE circuit operation. The height of the equalization pulse, represented by an overshoot or undershoot at each data transition edge, is controlled by the tap weight (via vb_tab). The width of the equalization pulse is controlled by the delay (via vb_delay). The pre-emphasis can be turned off completely by grounding the vb_tab and vb_delay pins.

The transmitter and all three receiver circuits reported in this paper were fabricated in standard bulk digital 90 nm CMOS (IBM's CMOS9SF process). The micrograph of the TX chip is shown in Fig. 3 together with the measured electrical eye diagrams at 10 and 20 Gb/s with FFE enabled. The fully assembled FFE TX is shown in Fig. 4. The wirebond test site includes two 5 µm diameter VCSELs to balance the load on the output stage. Only one VCSEL was used in the full-link experiments described below. Also shown in Fig. 4 are the decoupling capacitors for each of the power supplies used in the experiment and an interposer for landing the high-speed probes. The interposer was used to improve the

mechanical robustness of the test site and the electrical reproducibility of the measurements.

Fig. 4. FFE TX wirebond test site.

IV. TRANSMITTER EQUALIZED FULL LINK WITH RECEIVER 1

TX and RX assemblies were made by attaching CMOS chips to test cards and wire bonding all power, ground, and bias connections. Short wire bonds were used to connect the VCSELs and PDs to their respective ICs. All high-speed electrical I/O was applied and collected using coplanar GSGSG microwave probes, while lensed 50 µm MMF probes were used for both the TX and RX optical coupling. A 30 GHz bandwidth sampling oscilloscope was used for all eye-diagram measurements, and a 17 GHz bandwidth Newport D-25xr PD was employed for the transmitter eye diagrams. The test pattern was a PRBS [...], and the signal applied at the FFE TX input was 500 mV.

The VCSEL and PD arrays for all of the links presented here were designed and fabricated by Emcore Corporation. Both devices are conventional 850 nm top-emitting/detecting mesa structures fabricated on un-thinned semi-insulating substrates using a volume production process. Further details of the VCSEL and PD growth and fabrication can be found in [15]. The VCSEL in the FFE transmitter had an aperture of 5 µm, a series resistance of approximately 50 Ω, and a slope efficiency of [...] mW/mA. All of the receivers used 25 µm diameter PD devices with a responsivity of 0.55 A/W and a capacitance of 76 fF at their nominal 3 V bias point.

A block diagram of the full link containing the CMOS receiver that will be referred to as RX1 is shown in Fig. 5. The RX1 receiver is the first design we implemented in 90 nm CMOS [9], and its bandwidth was lower than expected. Although this receiver did operate up to 20 Gb/s when tested with a high-speed reference VCSEL, it did not support full CMOS links at 15–20 Gb/s with transmitters constructed with first-generation 90 nm CMOS laser drivers. The RX1 design consists of a fully differential transimpedance amplifier (TIA) followed by a five-stage Cherry-Hooper LA, two CML buffer stages, and a final differential output driver. The TIA is ac coupled through on-chip capacitors nominally valued at 2.5 pF. The low-frequency cutoff of the receiver, including the front-end ac coupling and the active offset compensation loop surrounding the LA, is [...] MHz, sufficient to minimize the power penalty due to baseline wander assuming balanced 8b/10b data encoding. Device-level schematics for the RX1 chip can be found in [16]. Power supplies were separated for the TIA, LA, and output buffers to minimize both power consumption and switching-induced power supply noise.

Fig. 5. Block diagram of the full link incorporating the RX1 receiver.

Fig. 6. TX optical output measured with a high-speed reference PD and single-ended RX electrical output for the RX1 link (a) without and (b) with transmitter predistortion. Although only explicitly shown on the 15 Gb/s eye diagrams, the vertical scales are the same at the other data rates.

Fig. 6 presents the TX output and RX electrical output eye diagrams measured for the RX1 link at data rates from 15 to 20 Gb/s with and without TX predistortion. All data in this section were obtained with the FFE TX operating under nominal conditions, where it consumed 61 mW with the FFE enabled and 56 mW with the FFE disabled. The RX1 receiver consumed 104 mW. The total link power dissipation was therefore 165 mW with predistortion and 160 mW without. As Fig. 6 illustrates, the effect of transmitter predistortion is dramatic. While the TX eye diagrams are visibly distorted when the FFE circuit is operating, the quality of the RX electrical output is improved at all data rates, with reduced jitter and wide-open eye diagrams at 17.5 and 20 Gb/s.

The visible improvements in eye-diagram quality effected by TX predistortion translate into better sensitivity and timing

margin. Fig. 7 presents the receiver sensitivity characteristics for the RX1 link as a function of data rate with and without TX predistortion. The RX sensitivity is improved by 6 dB at 15 Gb/s. With the TX FFE enabled, the link is operational at 20 Gb/s, whereas without equalization the link is limited to data rates [...] Gb/s. With TX predistortion, the receiver sensitivity at 20 Gb/s at a bit error ratio (BER) of [...] is [...] dBm and improves to better than [...] dBm at 10 Gb/s. The timing margin characteristics for the RX1 link are shown in Fig. 8 at a data rate of 15 Gb/s. TX predistortion improves the horizontal eye opening at BER = [...] by 0.42 unit interval (UI), or 28 ps, at 15 Gb/s. With TX predistortion, the timing margin (at BER = [...]) is 0.24 UI (12 ps) at 20 Gb/s and 0.78 UI (78 ps) at 10 Gb/s.

Fig. 7. Receiver sensitivity characteristics for the RX1 link at multiple data rates (a) without and (b) with TX predistortion.

Fig. 8. Timing margin characteristics for the RX1 link at 15 Gb/s (a) without and (b) with TX predistortion.

It is instructive to plot the power efficiency of optical links as a function of data rate. Fig. 9 presents the power efficiency of the RX1 link when operated with and without TX predistortion for speeds up to 20 Gb/s. The curve in Fig. 9 was obtained as follows. At a given data rate, the best power efficiency was obtained by minimizing the link power consumption while ensuring that two conditions were satisfied: 1) the BER remained less than [...]; and 2) the RX output voltage swing was greater than 200 mV.

Fig. 9. Power efficiency as a function of data rate for the RX1 link with and without TX predistortion.

In general, an optical link will have an optimum data rate at which it operates most efficiently. At this optimum speed, the link is gain limited, i.e., the link is operating at a very low error rate and the power consumption is determined by meeting the second criterion of a minimum RX output voltage swing. Above the optimum rate, the extra power (mW) required to maintain performance typically outpaces the increased data rate (Gb/s), degrading the efficiency in mW/(Gb/s), or equivalently, pJ/bit. At low data rates, the power efficiency curve also rises: since the link is operating at its gain-limited minimum power, the efficiency degrades as the data rate is lowered because the link power dissipation is amortized over fewer transferred bits. The power efficiency curve without TX predistortion clearly exhibits an optimum data rate, but the curve with predistortion is flattened. Between 10 and 17.5 Gb/s, the equalized RX1 link operates at a power efficiency of slightly less than 6 pJ/bit. At 20 Gb/s, the power efficiency of the equalized RX1 link is 8.3 pJ/bit.

V. TRANSMITTER EQUALIZED FULL LINK WITH RECEIVER 2

The block diagram of the second optical link, using a CMOS receiver that will be referred to here as the RX2 design, is shown in Fig. 10. The RX2 chip is a second-generation chip that was targeted specifically at achieving higher bandwidth than the original RX1 circuit. Consequently, there are substantial differences between the RX1 and RX2 designs. The RX2 LA was implemented with only Cherry-Hooper stages, eliminating the CML stages in the output block of the RX1 chip. An additional stage was added to the LA, bringing the total to six, to compensate for the reduced gain incurred by removing the CML stages. The LA in the RX2 design directly drives an inductively peaked differential output stage. Finally, although the differential TIA architecture was unchanged between the RX1 and RX2 circuits, the values of the transimpedance resistors were reduced by a factor of 2 in the RX2 design in an effort to improve the speed and, therefore, the sensitivity at higher data rates, at the expense of reduced sensitivity at lower data rates.

Testing of the RX2 link proceeded in the same manner described previously for the RX1 link. Single-ended electrical

output eye diagrams produced by the RX2 link under multiple operating conditions as a function of data rate are presented in Fig. 11. Fig. 11(a) shows data for the link operating with no TX predistortion; the eye diagram at 17.5 Gb/s is starting to close, and it is completely closed at 20 Gb/s. Fig. 11(b) and (c) were obtained on the RX2 link with TX predistortion at two different receiver operating points. The data in Fig. 11(a) and (b) were obtained with the RX2 receiver consuming 85 mW, while in Fig. 11(c) the RX2 operated at a lower power dissipation of 53 mW. The FFE TX was set to the same conditions as in Fig. 6(a) and (b), so the TX eye diagrams are not repeated here. As with the RX1 link, TX predistortion yields visible improvements to the total link performance and extends the operating range of the link to 20 Gb/s.

Fig. 10. Block diagram of the full link incorporating the RX2 receiver.

Fig. 11. Single-ended receiver output eye diagrams for the RX2 link (a) without TX predistortion at an RX2 chip consumption of 85 mW, (b) with TX predistortion at the same receiver power dissipation as (a), and (c) with TX predistortion with the RX2 chip operating at a power consumption of 53 mW. The vertical scale is the same for all of the eye diagrams.

The receiver sensitivity characteristics of the RX2 link are shown in Fig. 12 for the same operating conditions used to obtain the eye diagrams of Fig. 11. The curves in Fig. 12(a) were taken with no TX predistortion at a receiver power dissipation of 85 mW, the data in (b) were obtained at the same RX setting but with TX predistortion enabled, and the characteristics in (c) were obtained at the lower RX power dissipation setting of 53 mW. Directly comparing Fig. 12(a) and (b), it is evident that under the same RX operating conditions, TX predistortion improves the sensitivity at BER = [...] by more than 4 dB at 15 Gb/s and extends the capability of the link to 20 Gb/s.

Fig. 12. Receiver sensitivity characteristics for the RX2 link as a function of data rate (a) without TX predistortion at an RX2 chip consumption of 85 mW, (b) with TX predistortion at the same receiver power dissipation as (a), and (c) with TX predistortion with the RX2 chip operating at a power consumption of 53 mW.

Fig. 13 plots the timing margin for the RX2 link with and without TX predistortion at the 85 mW RX power setting. At 15 Gb/s and BER = [...], predistortion yields an improvement of 0.09 UI (6 ps). Comparing the 17.5 Gb/s data illustrates a much larger improvement with TX predistortion: without it, the link does not operate below BER = [...], while with predistortion the link has a 0.46 UI (31 ps) horizontal eye opening at BER = [...]. At 20 Gb/s, with TX predistortion, the timing margin is 0.42 UI (28 ps) at BER = [...].

Fig. 13. Link timing margin measured at the RX2 output at 15 and 17.5 Gb/s (a) without and (b) with TX predistortion. The RX2 chip power consumption was 85 mW for both (a) and (b).

Fig. 14 plots the sensitivity and timing margin penalties for transmission through up to 150 m of OM4 fiber. At 17.5 Gb/s, the sensitivity penalty incurred for 150 m transmission compared to back-to-back is [...] dB. Sensitivity measurements at 20 Gb/s were more challenging through fiber, since the optical attenuator used for testing had to be eliminated to remove its insertion loss ([...] dB), but the penalty for 150 m transmission is no greater than 2 dB at 20 Gb/s. The timing margin curves in Fig. 14 indicate penalties of less than 0.05 UI (3 ps) and 0.1 UI (5 ps) for 150 m transmission at data rates of 17.5 Gb/s and 20 Gb/s, respectively.

The power efficiency of the RX2 link as a function of data rate was measured in the same manner as described previously for the RX1 link, and the results are plotted in Fig. 15.
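Power efficiency in pJ/bit is simply the link power in mW divided by the data rate in Gb/s. The short sketch below applies that conversion to the RX1 link figures quoted in Section IV (61 mW for the transmitter with FFE enabled, 104 mW for the receiver, 20 Gb/s). The function name is an assumption introduced for illustration.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Convert link power (mW) and data rate (Gb/s) to energy per bit (pJ/bit).

    1 mW / 1 Gb/s = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit.
    """
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / data_rate_gbps

# RX1 link power figures quoted in Section IV
TX_POWER_FFE_ON_MW = 61.0
RX1_POWER_MW = 104.0

link_power_mw = TX_POWER_FFE_ON_MW + RX1_POWER_MW
print(f"{energy_per_bit_pj(link_power_mw, 20.0):.2f} pJ/bit at 20 Gb/s")
```

The result, 8.25 pJ/bit, agrees with the 8.3 pJ/bit quoted for the equalized RX1 link at 20 Gb/s.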
As with the RX1 link, the power efficiency curve as a function of data rate is significantly flattened with TX predistortion. The RX2

link achieves its best power efficiency of 4.6 pJ/bit at 15 Gb/s, and at 20 Gb/s the power efficiency is 5.7 pJ/bit. These numbers are the best efficiencies reported to date for a complete optical link operating at a data rate [...] Gb/s.

Fig. 14. Sensitivity and timing margin data for the RX2 link operating with TX predistortion through up to 150 m of OM4 fiber. (a) Receiver sensitivity characteristics at 17.5 Gb/s. (b) Timing margin at 17.5 Gb/s. (c) Timing margin at 20 Gb/s. The RX2 chip power consumption was 85 mW for all the datasets.

Fig. 15. Power efficiency as a function of data rate for the RX2 link with and without TX predistortion.

VI. DOUBLE EQUALIZED FULL LINK WITH RECEIVER 3

The last link we report here is shown in Fig. 16 and incorporates a receiver chip that will be referred to as RX3. As with the RX2 circuit, RX3 is a second-generation design. Both RX2 and RX3 share the same differential TIA and use the same Cherry-Hooper stages in the LA. However, the RX3 chip uses only five Cherry-Hooper LA stages. Following the LA is an FFE circuit similar to the equalizer in the TX chip that was described in Section III. The FFE output driver in the RX3 chip was specifically designed to improve signal integrity when driving through lossy package and printed circuit board (PCB) interconnects. In order to illustrate the capabilities of the RX3 chip, one of its outputs was connected through a 10 [...] long trace on a Nelco 4000 PCB, as shown in Fig. 17. In this configuration, there are two links: the optical link between the TX and RX, and the electrical link through the PCB between the RX output and the test equipment. In the data that follow, the total link power dissipation was 206 mW.

Fig. 16. Block diagram of the double-equalized full link incorporating the RX3 receiver.

Fig. 17. Block diagram of the RX3 link including a 10 [...] PCB trace connected to one of the RX outputs.

Fig. 18. Eye diagrams at 20 Gb/s taken at the TX output, the directly probed RX output, and the RX output that traverses 10 [...] of PCB trace, for two link conditions: double equalized with both TX and RX FFE circuits enabled, and with no equalization.

Fig. 19. Receiver sensitivity characteristics for the RX3 link (a) at 15 Gb/s with TX and RX equalization enabled, with only TX equalization, and with only RX equalization, and (b) at 10, 12.5, 15, 17.5, and 20 Gb/s with both TX and RX FFE circuits enabled.

Fig. 18 presents eye diagrams at 20 Gb/s taken at the TX output, the directly probed RX output, and at the end of the PCB trace. The eye diagrams are shown for two link configurations: double equalized, meaning that both the TX predistortion and RX FFE output were enabled, and no equalization, where both TX and RX equalizers were turned off. The efficacy of the

RYLYAKOV et al.: TRANSMITTER PREDISTORTION FOR IMPROVEMENTS IN BIT RATE, SENSITIVITY, JITTER, AND POWER EFFICIENCY 7

equalizers is dramatically illustrated by Fig. 18. Without equalization the link is broken, even at the directly probed RX output. With both TX and RX equalization enabled, the link is operational even with the extra PCB trace included. Fig. 19 presents receiver sensitivity characteristics for the RX3 link. Fig. 19(a) plots the RX sensitivity at 15 Gb/s measured after the PCB trace under three different conditions: double equalization, transmitter equalization without receiver equalization, and receiver equalization without transmitter equalization. At 15 Gb/s, both TX and RX equalization are required to successfully close the optical and PCB electrical links. Fig. 19(b) plots the RX3 sensitivity characteristics measured after the PCB trace at multiple data rates up to 20 Gb/s.

VII. CONCLUSION

The effect of transmitter equalization is studied for three VCSEL-based multimode optical links. Unlike all previous works, the TX FFE settings are optimized for the entire link, including the receiver. The wide tuning range of the TX FFE allows strong predistortion of the VCSEL output, resulting in a dramatic improvement of all key metrics of the overall link performance. The example of the bandwidth-limited receiver RX1 described in Section IV is particularly striking: a link that struggled at 15 Gb/s was made to operate at 20 Gb/s. At 15 Gb/s, transmitter predistortion was shown to improve the sensitivity by 6 dB, the horizontal eye opening by 3X, and the overall link efficiency by 2X. The demonstrated performance gains can be used to reliably build high-speed links using slower components, for example, receivers incorporating larger PDs.
We believe that the technique of transmitter predistortion demonstrated here offers a path to better performance and operating margin at all data rates and will be a powerful tool in extending the dominance of VCSEL-based interconnects for short-reach applications to data rates beyond 25 Gb/s.

ACKNOWLEDGMENT

The PDs and VCSELs were designed and manufactured by Emcore Corp, and the authors would like to thank N. Li, K. Jackson, and the rest of the team at Emcore.

REFERENCES

[1] Y.-C. Chang et al., "High-efficiency, high-speed VCSELs with 35 Gbit/s error-free operation," Electron. Lett., vol. 43, pp. 1022–1023, Sep. 2007.
[2] R. H. Johnson and D. Kuchta, "30 Gb/s directly modulated 850 nm datacom VCSELs," presented at the Conf. Lasers Electro-Opt./Quantum Electron. Laser Sci., San Jose, CA, May 4–9, 2008.
[3] L. Illing and M. Kennel, "Shaping current waveforms for direct modulation of semiconductor lasers," IEEE J. Sel. Topics Quantum Electron., vol. 40, no. 5, pp. 445–452, May 2004.
[4] N. Dokhane and G. L. Lippi, "Faster modulation of single-mode semiconductor lasers through patterned current switching: Numerical investigation," Proc. Inst. Elect. Eng. Optoelectronics, vol. 151, pp. 61–68, Apr. 2004.
[5] A. Kern, A. Chandrakasan, and I. Young, "18 Gb/s optical IO: VCSEL driver and TIA in 90 nm CMOS," in Proc. Symp. VLSI Circuits Dig., Jun. 2007, pp. 276–277.
[6] S. Palermo, A. Emami-Neyestanak, and M. Horowitz, "A 90 nm CMOS 16 Gb/s transceiver for optical interconnects," IEEE J. Solid-State Circuits, vol. 43, no. 5, pp. 1235–1246, May 2008.
[7] I. A. Young et al., "Optical I/O technology for tera-scale computing," IEEE J. Solid-State Circuits, vol. 45, no. 1, pp. 235–248, Jan. 2010.
[8] C. Kromer et al., "A 100-mW 4 × 10 Gb/s transceiver in 80-nm CMOS for high-density optical interconnects," IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2667–2679, Dec. 2005.
[9] B. G. Lee et al., "Low-power CMOS-driven transmitters and receivers," in Proc. Conf. Lasers Electro-Optics, May 2010, pp. 1–2.
[10] M. Bruensteiner et al., "3.3-V CMOS pre-equalization VCSEL transmitter for gigabit multimode fiber links," IEEE Photon. Technol. Lett., vol. 11, no. 10, pp. 1301–1303, Oct. 1999.
[11] Y. Tsunoda et al., "25-Gb/s transmitter for optical interconnection with 10-Gb/s VCSEL using dual peak-tunable pre-emphasis," presented at the Opt. Fiber Commun. Conf., Los Angeles, CA, Mar. 2011, Paper OThZ2.
[12] D. Kucharski et al., "A 20 Gb/s VCSEL driver with pre-emphasis and regulated output impedance in 0.13 µm CMOS," in Proc. IEEE Int. Solid-State Circuits Conf., San Francisco, CA, Feb. 2005, pp. 222–594.
[13] S. P. Voinigescu et al., "Circuits and technologies for highly integrated optical networking ICs at 10 Gb/s to 40 Gb/s," in Proc. IEEE Custom Integr. Circuits Conf., May 2001, pp. 331–338.
[14] D. Watanabe et al., "CMOS optical 4-PAM VCSEL driver with modal-dispersion equalizer for 10 Gb/s 500 m MMF transmission," in Proc. IEEE Int. Solid-State Circuits Conf., San Francisco, CA, Feb. 2009, pp. 106–107.
[15] N. Y. Li et al., "High-performance 850 nm VCSEL and photodetector arrays for 25 Gb/s parallel optical interconnects," presented at the Opt. Fiber Commun. Conf., San Diego, CA, Mar. 2010, Paper OTuP2.
[16] C. L. Schow et al., "A 24-channel, 300 Gb/s, 8.2 pJ/bit, full-duplex fiber-coupled optical transceiver module based on a single Holey CMOS IC," J. Lightw. Technol., vol. 29, no. 4, pp. 542–553, Feb. 2011.

Alexander V. Rylyakov received the M.S. degree in physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 1989, and the Ph.D. degree in physics from State University of New York at Stony Brook (SUNY Stony Brook) in 1997. From 1994 to 1999, he was with the Department of Physics, SUNY Stony Brook, where he was focused on the design and testing of integrated circuits based on Josephson junctions. In 1999, he joined IBM Thomas J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member. His current research interests include the areas of digital phase-locked loops and integrated circuits for wireline and optical communication.

Clint L. Schow (SM'10) received the B.S. degree in 1994, and the M.S. and Ph.D. degrees from the University of Texas at Austin in 1997 and 1999, respectively, all in electrical engineering. In 1999, he joined IBM in Rochester, MN, assuming responsibility for the optical receivers used in IBM's optical transceiver business. From 2001 to 2004, he was with Agility Communications in Santa Barbara, CA, developing high-speed optoelectronic modulators and tunable laser sources for optical communications. In 2004, he joined the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member. He is the author or coauthor of more than 100 journal or conference articles. He has six issued and more than ten pending patents. His current research interests include parallel optical interconnect technologies and high-speed CMOS circuits for fiber-optic data links.

Benjamin G. Lee (M'04) received the B.S. degree from Oklahoma State University, Stillwater, in 2004, and the M.S. and Ph.D. degrees from Columbia University, New York, in 2006 and 2009, respectively, all in electrical engineering. In 2009, he became a Postdoctoral Researcher at IBM Thomas J. Watson Research Center, Yorktown Heights, NY, where he is currently a Research Staff Member. He is also an Assistant Adjunct Professor of electrical engineering at Columbia University, New York. His research interests include silicon photonic devices, integrated optical switches and networks for high-performance computing systems and datacenters, and highly parallel multimode transceivers. Dr. Lee is a member of the IEEE Photonics Society and the Optical Society. He has served on the technical program committee for the Fourth and Fifth ACM/IEEE International Symposium on Networks-on-Chip.

Fuad E. Doany received the Ph.D. degree in chemical physics from the University of Pennsylvania, Philadelphia, in 1984. He was a Postdoctoral Fellow at the California Institute of Technology from 1984 to 1985. In 1985, he joined the IBM Thomas J. Watson Research Center, Yorktown Heights, NY. As a Research Staff Member at IBM, he focuses on laser spectroscopy, applied optics, projection displays, and laser material processing for electronic packaging. Since 2000, he has focused on high-speed optical link and systems design, and optoelectronic packaging. He is the author or coauthor of many technical papers and holds more than 50 U.S. patents.

Christian W. Baks received the B.S. degree in applied physics from the Fontys College of Technology, Eindhoven, The Netherlands, in 2000, and the M.S. degree in physics from the State University of New York Albany, Albany, in 2001. He joined the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, as an Engineer in 2001, where he is involved in high-speed optoelectronic package and backplane interconnect design specializing in signal integrity issues.

Jeffrey A. Kash (SM'94–F'05) received the Ph.D. degree in physics from University of California, Berkeley. He joined IBM Research in 1981, initially studying femtosecond electron and exciton dynamics in semiconductors. In 1995, he coinvented picosecond imaging circuit analysis, an optical technique which is used today to debug advanced CMOS ICs. From 2000 to 2011, he was focused on the use of optical interconnects in next and future generations of supercomputers, directing DARPA-sponsored IBM programs for chip-to-chip optical interconnects and nanophotonic optical switches. In 2011, he moved to Columbia University, New York, where he continues to direct research programs in optics. He has published more than 150 papers in major technical journals and holds 23 patents. Dr. Kash is a Fellow of the American Physical Society. He also serves as a member of the IEEE Photonics Society Board of Governors and is the Vice President for Membership.

JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 30, NO. 4, FEBRUARY 15, 2012 1

Transmitter Predistortion for Simultaneous Improvements in Bit Rate, Sensitivity, Jitter, and Power Efficiency in 20 Gb/s CMOS-Driven VCSEL Links

Alexander V. Rylyakov, Clint L. Schow, Senior Member, IEEE, Benjamin G. Lee, Member, IEEE, Fuad E. Doany, Christian Baks, and Jeffrey A. Kash, Fellow, IEEE

Abstract: The effect of applying feed-forward equalization (FFE) on the transmitter side is studied for three different full optical links. In contrast to all previous works, the FFE settings are optimized for a complete link, rather than just the vertical-cavity surface-emitting laser output. The approach results in dramatic improvements in total link performance: 6 dB in sensitivity, 3X in timing margin, and 2X in power efficiency at 15 Gb/s, and a record 5.7 pJ/bit at 20 Gb/s.

Index Terms: CMOS analog integrated circuits, driver circuits, equalization, feed-forward equalization (FFE), optical communication, optical receivers, optoelectronic devices, photodetectors, photodiodes (PDs), pre-emphasis, semiconductor lasers.

Manuscript received July 29, 2011; revised September 27, 2011; accepted September 28, 2011. This work was supported by Defense Advanced Research Projects Agency under Contract MDA972-03-3-0004.
A. V. Rylyakov, C. L. Schow, F. E. Doany, and C. W. Baks are with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: sasha@us.ibm.com; cschow@us.ibm.com; doany@us.ibm.com; cbaks@us.ibm.com).
B. G. Lee is with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA, and also with Columbia University, New York, NY 10027 USA (e-mail: bglee@us.ibm.com).
J. A. Kash is with Columbia University, New York, NY 10027 USA (e-mail: jeffkash@us.ibm.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JLT.2011.2171917

I. INTRODUCTION

Multimode vertical-cavity surface-emitting laser (VCSEL) transceivers dominate the short-reach optical market, from serial Ethernet and Fibre Channel transceivers to parallel transceivers and active optical cables for high-performance computing and switch/routers. VCSEL-based interconnects offer many advantages including 1) low device fabrication cost with wafer-scale burn-in and testing, 2) inexpensive assembly utilizing injection molded plastic optical systems with wide alignment tolerances, and 3) simple driver design due to low VCSEL thresholds and reasonable impedances. However, as data rates move to 20 Gb/s and higher, challenges emerge. Single VCSELs have been demonstrated at 30–40 Gb/s at several wavelengths [1], [2], but yielding adequate bandwidth in a reliable production device may prove difficult. One of the most arduous challenges to achieving high data rates in multimode links is the receiver. High data rates are more easily achieved with small photodiodes (PDs), but as devices shrink, assembly costs rise as alignment tolerances tighten. Achieving sufficient bandwidth through the receiver (RX) front end and limiting amplifier (LA) will be challenging, especially for CMOS designs optimized for power efficiency and compatibility with reasonably sized PDs (25–40 µm diameter).

Fig. 1. Optical link block diagram. The definition of the communication channel is generalized to include VCSEL, MMF, PD, and the RX front end.

Unlike a typical short-reach electrical link (backplane or chip-to-chip), the multimode fiber (MMF) does not introduce excessive amounts of loss or dispersion, even over distance scales on the order of a hundred meters. As a result, short-reach optical interconnects feature nearly ideal channel performance with little or no need for equalization. From the purely electrical point of view, however, the full end-to-end communication channel is not limited to the MMF alone.
Electrical data need to be converted into an optical form, received, and amplified. It is natural, therefore, to broaden the definition of the channel to include the VCSEL, the PD, and the RX front end, as illustrated in Fig. 1. When one considers this broadly defined channel at speeds of 20 Gb/s and above, its characteristics are far from ideal and communication over this channel can definitely benefit from equalization.

Feed-forward equalization (FFE) of VCSEL-based optical transmitters is a well-known technique [3]–[14]. None of the previously reported results, however, utilized the full power of equalization; they focused exclusively on boosting the VCSEL bandwidth. It is important to note that for high-speed operation, the VCSEL is typically biased well into the linear mode of operation (far above threshold). The application of the FFE-shaped transmit signal through the VCSEL can, therefore, be used to speed up both the transmitter (TX) and the receiver, improving the overall link performance.

0733-8724/$26.00 © 2011 IEEE
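The idea of equalizing the broadened channel can be illustrated with a small numerical sketch. It is not a model of the circuits reported in this paper: it assumes that the whole channel (VCSEL, PD, and RX front end) behaves as a single first-order low-pass filter, and it models the FFE as a delayed, scaled copy of the data subtracted from the main signal, with the delay fixed at exactly one bit. The function names, the tap weight, and the filter coefficient are illustrative choices, not measured values.

```python
import random


def nrz(bits, samples_per_bit):
    """Map bits (0/1) to an oversampled -1/+1 NRZ waveform."""
    return [1.0 if b else -1.0 for b in bits for _ in range(samples_per_bit)]


def one_tap_ffe(x, weight, delay):
    """Subtract a delayed, scaled copy of the signal from the signal itself."""
    return [x[k] - weight * x[max(k - delay, 0)] for k in range(len(x))]


def low_pass(x, alpha):
    """First-order low-pass: y[k] = (1 - alpha) * y[k-1] + alpha * x[k]."""
    y, out = 0.0, []
    for v in x:
        y = (1.0 - alpha) * y + alpha * v
        out.append(y)
    return out


def eye_opening(y, bits, samples_per_bit, skip=8):
    """Worst-case vertical eye opening, sampled at the end of each bit."""
    ones, zeros = [], []
    for i, b in enumerate(bits):
        if i < skip:  # ignore the start-up transient
            continue
        sample = y[(i + 1) * samples_per_bit - 1]
        (ones if b else zeros).append(sample)
    return min(ones) - max(zeros)


def compare(alpha=0.1, weight=0.4, samples_per_bit=8, n_bits=400, seed=1):
    """Return (eye without FFE, eye with FFE) for a random bit pattern."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    x = nrz(bits, samples_per_bit)
    plain = eye_opening(low_pass(x, alpha), bits, samples_per_bit)
    shaped = one_tap_ffe(x, weight, samples_per_bit)  # assumed one-bit delay
    equalized = eye_opening(low_pass(shaped, alpha), bits, samples_per_bit)
    return plain, equalized


if __name__ == "__main__":
    plain, equalized = compare()
    print(f"eye opening without FFE: {plain:.3f}")
    print(f"eye opening with FFE:    {equalized:.3f}")
```

In this toy model the shaped transmit waveform overshoots at each data transition, while the eye measured after the low-pass element opens up. The tap weight plays the role of a tunable setting that is chosen for the far end of the channel rather than for the transmitter output.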

In this paper, we study the effect of TX FFE for three different receivers, observing significant improvements in the overall performance for all three optical links. The TX FFE is tunable over a wide range and the settings are independently optimized for each full link, targeting the total end-to-end link performance metrics. The corresponding VCSEL optical outputs are recorded, but only to document the internal signals in the channel. At channel-optimized TX FFE settings the VCSEL optical output might be overequalized and appear to deviate from the optimum point. But the final electrical data on the RX side show a significant benefit from the predistorted optical waveform generated by the TX. Application of TX FFE to the full link produces results that are hard or impossible to obtain by optimizing the VCSEL output alone.

This paper is organized as follows. After a brief introduction, we start with an overview of VCSEL equalization techniques in Section II. Section III describes the FFE-enabled VCSEL driver used in all experiments. Sections IV, V, and VI present the three different receivers, with results for all three corresponding full optical links. The results are summarized in the conclusion in Section VII.

II. OVERVIEW OF VCSEL EQUALIZATION TECHNIQUES

A number of methods for shaping the driver current waveform in order to speed up the VCSEL optical response have been reported in the literature [3]–[14]. Several relatively complicated equalization schemes have been proposed to exactly cancel VCSEL dynamics [3], [4]. Although interesting from a theoretical point of view, the desired current waveforms are hard to implement, especially at high data rates. For many practical purposes, however, the VCSEL can be modeled as a low-pass filter, so a natural equalization technique would be to create a driver with a high-pass characteristic [5]–[7].
The combined driver-plus-VCSEL system would then become more broadband, enabling data transfer at data rates well above the VCSEL bandwidth. A peaked current waveform can be created by simply adding a speedup inductor to the driver [8], [9]. While this approach was shown to work, it has a number of disadvantages. It introduces a significant area overhead, but, most importantly, the parameters of the inductor-based high-pass filter are not tunable, making it harder to exactly match the VCSEL characteristic. In a digital system, the high-pass driver can be realized with a latch-based FFE [10]. One advantage of a digital FFE is that the filter tap spacing automatically tracks the bit interval. This feature, however, is only important for filters with a large number of taps. The power and area overheads associated with clock distribution and latching of high-speed data make a digital FFE system far less attractive than a continuous-time FFE [7], [11], where the delay between the filter taps is created by a tunable delay line.

Regardless of the particular equalization scheme, all previously reported results focused on using FFE for optimizing the optical output of the VCSEL. Some equalization efforts [12], [13] specifically addressed the rise/fall time asymmetry of the VCSEL, and [14] additionally included equalization of modal dispersion in a long MMF link. None of these works studied the benefits of transmitter-side equalization on a full optical link.

Fig. 2. Block diagram of FFE-enabled VCSEL driver with waveforms.

Fig. 3. FFE TX CMOS chip image with measured electrical outputs at 10 and 20 Gb/s.

III. FFE-ENABLED TRANSMITTER

The block diagram of the VCSEL driver with one-tap continuous-time FFE is shown in Fig. 2. This is the transmitter that was used in all full-link experiments described in this paper. The driver consists of a five-stage Cherry-Hooper preamplifier (PA) followed by a current-mode logic (CML) output stage that implements the feed-forward equalizer.
The FFE is powered by a single supply, VDD OS, and operates by tapping a portion of the PA output signal, delaying and buffering it, then subtracting it from the main portion of the PA signal (graphically shown in Fig. 2). Note that although not explicitly shown in Fig. 2, the transmitter is a fully differential design. Fig. 2 also includes a graphical illustration of the FFE circuit operation. The height of the equalization pulse, represented by an overshoot or undershoot at each data transition edge, is controlled by the tap weight (via vb tab). The width of the equalization pulse is controlled by the delay (via vb delay). The pre-emphasis can be completely turned off by grounding the vb tab and vb delay pins.

The transmitter and all three receiver circuits reported in this paper were fabricated in standard bulk digital 90 nm CMOS (IBM's CMOS9SF process). The micrograph of the TX chip is shown in Fig. 3 together with the measured electrical eye diagrams at 10 and 20 Gb/s with FFE enabled. The fully assembled FFE TX is shown in Fig. 4. The wirebond test site includes two 5 µm diameter VCSELs to balance the load on the output stage. Only one VCSEL was used in the full-link experiments described below. Also shown in Fig. 4 are the decoupling capacitors for each of the power supplies used in the experiment and an interposer for landing the high-speed probes. The interposer was used to improve the mechanical robustness of the test site and the electrical reproducibility of the measurements.

Fig. 4. FFE TX wirebond test site.

Fig. 5. Block diagram of the full link incorporating the RX1 receiver.

IV. TRANSMITTER EQUALIZED FULL LINK WITH RECEIVER 1

TX and RX assemblies were made by attaching CMOS chips to test cards and wire bonding all power, ground, and bias connections. Short wire bonds were used to connect the VCSELs and PDs to their respective ICs. All high-speed electrical I/O was applied and collected using coplanar GSGSG microwave probes, while lensed 50 µm MMF probes were used for both the TX and RX optical coupling. A 30 GHz bandwidth sampling oscilloscope was used for all eye-diagram measurements, and a 17 GHz bandwidth Newport D-25xr PD was employed for transmitter eye diagrams. The pattern was PRBS and the signal applied at the FFE TX input was 500 mV.

The VCSEL and PD arrays for all of the links presented here were designed and fabricated by Emcore Corporation. Both of the devices are conventional 850 nm top-emitting/detecting mesa structures fabricated on un-thinned semi-insulating substrates using a volume production process. Further details of the VCSEL and PD growth and fabrication can be found in [15]. The VCSEL in the FFE transmitter had an aperture of 5 µm, a series resistance of approximately 50 Ω, and a slope efficiency of mW/mA. All of the receivers utilized 25 µm diameter devices with a responsivity of 0.55 A/W and a capacitance of 76 fF at their nominal 3 V bias point.

A block diagram of the full link containing the CMOS receiver that will be referred to as RX1 is shown in Fig. 5. The RX1 receiver is the first design we implemented in 90 nm CMOS [9] and its bandwidth was lower than expected.
Although this receiver did operate up to 20 Gb/s when tested with a high-speed reference VCSEL, it did not support full CMOS links at 15–20 Gb/s with transmitters constructed with first-generation 90 nm CMOS laser drivers. The RX1 design was a fully differential transimpedance amplifier (TIA) followed by a five-stage Cherry-Hooper LA, two CML buffer stages, and a final differential output driver. The TIA is ac coupled through on-chip capacitors nominally valued at 2.5 pF. The low-frequency cutoff of the receiver, including the front-end ac coupling and the active offset compensation loop surrounding the LA, is MHz, sufficient to minimize the power penalty due to baseline wander assuming balanced 8b/10b data encoding. Device-level schematics for the RX1 chip can be found in [16]. Power supplies were separated for the TIA, LA, and output buffers to minimize both power consumption and switching-induced power supply noise.

Fig. 6. TX optical output measured with a high-speed reference PD and single-ended RX electrical output for the RX1 link (a) without and (b) with transmitter predistortion. Although only explicitly shown on the 15 Gb/s eye diagrams, the vertical scales are the same at the other data rates.

Fig. 6 presents the TX output and RX electrical output eye diagrams measured for the RX1 link at data rates from 15 to 20 Gb/s with and without TX predistortion. All data in this section were obtained with the FFE TX operating under nominal conditions, where it consumed 61 mW with the FFE enabled and 56 mW with the FFE disabled. The RX1 receiver consumed 104 mW. The link total power dissipation was therefore 165 mW with, and 160 mW without, predistortion. As Fig. 6 illustrates, the effect of transmitter predistortion is dramatic. While the TX eye diagrams are visibly distorted when the FFE circuit is operating, the quality of the RX electrical output is improved at all data rates, with reduced jitter and wide-open eye diagrams at 17.5 and 20 Gb/s.

The visible improvements in eye-diagram quality brought about by TX predistortion translate into better sensitivity and timing margin. Fig. 7 presents the receiver sensitivity characteristics for the RX1 link as a function of data rate with and without TX predistortion. The RX sensitivity is improved by dB at 15 Gb/s. With the TX FFE enabled, the link is operational at 20 Gb/s, whereas without equalization the link is limited to data rates Gb/s. With TX predistortion, the receiver sensitivity at 20 Gb/s at a bit error ratio (BER) of is dBm and improves to better than dBm at 10 Gb/s. The timing margin characteristics for the RX1 link are shown in Fig. 8 at a data rate of 15 Gb/s. TX predistortion improves the horizontal eye opening at by 0.42 unit interval (UI), or 28 ps, at 15 Gb/s. With TX predistortion, the timing margin (at ) is 0.24 UI (12 ps) at 20 Gb/s and 0.78 UI (78 ps) at 10 Gb/s.

Fig. 7. Receiver sensitivity characteristics for the RX1 link at multiple data rates (a) without and (b) with TX predistortion.

Fig. 8. Timing margin characteristics for the RX1 link at 15 Gb/s (a) without and (b) with TX predistortion.

Fig. 9. Power efficiency as a function of data rate for the RX1 link with and without TX predistortion.

It is instructive to plot the power efficiency of optical links as a function of data rate. Fig. 9 presents the power efficiency of the RX1 link when operated with and without TX predistortion for speeds up to 20 Gb/s. The curve in Fig. 9 was obtained as follows. At a given data rate, the best power efficiency was obtained by minimizing the link power consumption while ensuring that two conditions were satisfied: 1) the BER remained less than ; and 2) the RX output voltage swing was greater than 200 mV. In general, an optical link will have an optimum data rate at which it operates most efficiently.
At this optimum speed, the link is gain limited, i.e., the link is operating at a very low error rate and the power consumption is determined by meeting the second criterion of a minimum RX output voltage swing. Above the optimum rate, the extra power (mW) required to maintain performance typically outpaces the increased data rate (Gb/s), degrading the efficiency in mW/Gb/s, or equivalently, pJ/bit. At low data rates, the power efficiency curve also rises: since the link is operating at its gain-limited minimum power, the efficiency degrades as the data rate is lowered because the link power dissipation is amortized over fewer transferred bits. The power efficiency curve without TX predistortion clearly exhibits an optimum data rate, but the curve with predistortion is flattened. Between 10 and 17.5 Gb/s, the equalized RX1 link operates at a power efficiency of slightly less than 6 pJ/bit. At 20 Gb/s, the power efficiency of the equalized RX1 link is 8.3 pJ/bit.

V. TRANSMITTER EQUALIZED FULL LINK WITH RECEIVER 2

The block diagram of the second optical link, using a CMOS receiver that will be referred to here as the RX2 design, is shown in Fig. 10. The RX2 chip is a second-generation chip that was targeted specifically at achieving higher bandwidth than the original RX1 circuit. Consequently, there are substantial differences between the RX1 and RX2 designs. The RX2 LA was implemented with only Cherry-Hooper stages, eliminating the CML stages in the output block of the RX1 chip. An additional stage was added to the LA, bringing the total to six, to compensate for the reduced gain incurred by removing the CML stages. The LA in the RX2 design directly drives an inductively peaked differential output stage.
Finally, although the differential TIA architecture was unchanged between the RX1 and RX2 circuits, the values of the transimpedance resistors were reduced by a factor of 2 in the RX2 design in an effort to improve the speed and, therefore, sensitivity at higher data rates, at the expense of reduced sensitivity at lower data rates.

Testing of the RX2 link proceeded in the same manner described previously for the RX1 link. Single-ended electrical output eye diagrams produced by the RX2 link under multiple operating conditions as a function of data rate are presented in Fig. 11. Fig. 11(a) shows data for the link operating with no TX predistortion; the eye diagram at 17.5 Gb/s is starting to close and is completely closed at 20 Gb/s. Fig. 11(b) and (c) were obtained on the RX2 link with TX predistortion at two different receiver operating points. The data in Fig. 11(a) and (b) were obtained with the RX2 receiver consuming 85 mW, while in Fig. 11(c), the RX2 operated at a lower power dissipation of 53 mW. The FFE TX was set to the same conditions as in Fig. 6(a) and (b), so the TX eye diagrams are not repeated here. As with the RX1 link, TX predistortion yields visible improvements to the total link performance and extends the operating range of the link to 20 Gb/s.

Fig. 10. Block diagram of the full link incorporating the RX2 receiver.

Fig. 11. Single-ended receiver output eye diagrams for the RX2 link (a) without TX predistortion at an RX2 chip consumption of 85 mW, (b) with TX predistortion at the same receiver power dissipation as (a), and (c) with TX predistortion with the RX2 chip operating at a power consumption of 53 mW. The vertical scale is the same for all of the eye diagrams.

The receiver sensitivity characteristics of the RX2 link are shown in Fig. 12 for the same operating conditions used to obtain the eye diagrams of Fig. 11. The curves in Fig. 12(a) were taken with no TX predistortion at a receiver power dissipation of 85 mW, the data in (b) were obtained at the same RX setting but with TX predistortion enabled, and the characteristics in (c) were obtained at the lower RX power dissipation setting of 53 mW. Directly comparing Fig. 12(a) and (b), it is evident that under the same RX operating conditions, TX predistortion improves the sensitivity at by more than 4 dB at 15 Gb/s and extends the capability of the link to 20 Gb/s.

Fig. 12. Receiver sensitivity characteristics for the RX2 link as a function of data rate (a) without TX predistortion at an RX2 chip consumption of 85 mW, (b) with TX predistortion at the same receiver power dissipation as (a), and (c) with TX predistortion with the RX2 chip operating at a power consumption of 53 mW.

Fig. 13. Link timing margin measured at the RX2 output at 15 and 17.5 Gb/s (a) without and (b) with TX predistortion. The RX2 chip power consumption was 85 mW for both (a) and (b).

Fig. 13 plots the timing margin for the RX2 link with and without TX predistortion at the 85 mW RX power setting. At 15 Gb/s and, predistortion yields an improvement of 0.09 UI (6 ps). Comparing the 17.5 Gb/s data illustrates a much larger improvement with TX predistortion: without it, the link does not operate below, while with predistortion, the link has a 0.46 UI (31 ps) horizontal eye opening at. At 20 Gb/s, with TX predistortion, the timing margin is 0.42 UI (28 ps) at.

Fig. 14 plots the sensitivity and timing margin penalties for transmission through up to 150 m of OM4 fiber. At 17.5 Gb/s, the sensitivity penalty incurred for 150 m transmission compared to back-to-back is dB. Sensitivity measurements at 20 Gb/s were more challenging through fiber since the optical attenuator used for testing had to be eliminated to remove its insertion loss ( dB), but the penalty for 150 m transmission is no greater than 2 dB at 20 Gb/s. The timing margin curves in Fig. 14 indicate penalties of less than 0.05 UI (3 ps) and 0.1 UI (5 ps) for 150 m transmission at data rates of 17.5 Gb/s and 20 Gb/s, respectively.

The power efficiency of the RX2 link as a function of data rate was measured in the same manner as described previously for the RX1 link and the results are plotted in Fig. 15.
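The unit conversions used in this section are simple enough to check directly. The sketch below converts a timing margin in UI to picoseconds and a link power in mW to an energy per bit. The helper names are illustrative, and pairing the 61 mW transmitter figure with the 53 mW RX2 setting at 20 Gb/s is an assumption made here for the arithmetic, not an operating point stated in the text.

```python
def ui_to_ps(margin_ui, rate_gbps):
    """Convert a timing margin in unit intervals (UI) to picoseconds."""
    bit_period_ps = 1000.0 / rate_gbps  # one UI at rate_gbps Gb/s
    return margin_ui * bit_period_ps


def energy_per_bit_pj(power_mw, rate_gbps):
    """Link power in mW divided by data rate in Gb/s gives pJ/bit."""
    return power_mw / rate_gbps


if __name__ == "__main__":
    # Timing-margin improvement quoted at 15 Gb/s, in ps.
    print(round(ui_to_ps(0.09, 15.0)))
    # Timing-margin penalty for 150 m of fiber at 20 Gb/s, in ps.
    print(round(ui_to_ps(0.10, 20.0)))
    # Assumed total link power: TX with FFE plus RX2 at its low-power setting.
    print(energy_per_bit_pj(61.0 + 53.0, 20.0))
```

With these inputs the helpers give 6 ps and 5 ps for the two timing figures, and the assumed 114 mW total corresponds to 5.7 pJ/bit at 20 Gb/s.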
As with the RX1 link, the power efficiency curve as a function of data rate is significantly flattened with TX predistortion. The RX2