Ultra-Low Power and Ultra-Low Voltage RF CMOS Circuits and System Design Techniques

by Mahdi Parvizi, M.A.Sc.
Department of Electrical and Computer Engineering
McGill University
Montreal, Quebec, Canada

A thesis submitted to McGill University in partial fulfilment of the requirements of the degree of Doctor of Philosophy

2015
All rights reserved.

ACKNOWLEDGEMENTS

The first person I would like to thank is my supervisor, Dr. Mourad El-Gamal. His advice and helpful insight went far beyond the technical aspects and helped me understand and analyse my research from various perspectives, including the ever important big picture. I am grateful for his continued support and for his confidence in my abilities.

A special thanks to my family. Words cannot express how grateful I am to them. I would like to thank my wife Fateme for her continued support throughout this endeavor. I would also like to thank my parents and my brother and sister for their encouragement and support.

There are many others who made this research successful and helped me maintain my sanity through many days (and nights) spent in the lab. The list includes Karim Allidina, Ali Gorji, Mohammad Mahani, Frederic Nabki, Paul-Vahe Cicek, Sareh Mahdavi, Mohannad Elsayed, and Ahmad Alfaifi.

Furthermore, I would like to extend my gratitude to Ciena for providing an internship opportunity, and to my colleagues and my manager at Ciena, Dr. Naim Ben Hamida, for their support and useful technical discussions. The CMOS chip fabrications were enabled by CMC Microsystems, which provided access to the design kit and fabrication runs.

ABSTRACT

The main focus of this thesis is to mitigate the circuit design challenges that arise under ultra-low power (ULP) and ultra-low voltage (ULV) conditions, in order to implement ULP and ULV circuits and systems suitable for low power wireless sensor networks operating from energy scavenged from the environment. The design of power efficient transceivers requires careful optimization at the circuit level. ULV circuit design challenges are investigated in this work because of their importance in low power transceivers operating from harvested energy. It is shown that a ULV supply has severe impacts on the performance of a MOS transistor. New solutions such as forward body biasing, a new biasing scheme optimized for ULP and ULV design, and current-reuse are introduced to mitigate these performance degradations.

Several ULP and ULV wideband RF low noise amplifiers (LNAs) are designed and fabricated to prove the concept. The first is a resistive shunt-feedback LNA fabricated in a 90-nm TSMC CMOS technology using multiple ULP and ULV techniques. This prototype achieves a high figure of merit (FoM) while consuming sub-mW power from a 0.5-V supply. The second is an inductorless, wideband LNA implemented in an IBM 0.13-µm CMOS technology, which uses a current-reused tunable active shunt-feedback for input impedance matching. Employing ULP design techniques, this LNA consumes only 400-µW while achieving a high FoM. The third fabricated LNA, also implemented in an IBM 0.13-µm CMOS technology, utilizes multiple ULP and ULV design techniques that lead to the best FoM in the literature, to the best of the author's knowledge, with only 0.25-mW of power from a 0.5-V supply. Finally, a UWB current-reuse noise cancelling LNA is implemented that achieves high performance while consuming 410-µW from a 0.4-V supply, which makes it suitable for systems operating from energy harvesting.

As a low power system-level solution, impulse radio ultra-wideband (IR-UWB) technology, whose short time-domain pulses allow the transceiver to be duty cycled, is considered for ULP wireless sensor networks. In this work, a non-coherent chirp-FSK IR-UWB receiver based on continuous time slicing and low power injection locked clock recovery is implemented. The RF front-end of the receiver is implemented in an IBM 0.13-µm CMOS technology. It achieves a tuned voltage gain of dB with a NF of 7-dB and a static power consumption of only 1.75-mW from a 0.5-V supply voltage. Overall, the proposed techniques, circuits and system improve the power efficiency at both the circuit and system levels and are suitable for ULP and ULV applications such as energy scavenged wireless sensor networks.

ABRÉGÉ

The main objective of this thesis is to reduce the challenges of designing circuits that operate under ultra-low power (ULP) and ultra-low voltage (ULV) conditions, in order to implement circuits and systems suitable for low power sensor networks operating from energy harvested from ambient sources. The design of transceivers requires particular attention to optimization at the circuit level. In this work, the challenges of ULV circuit design are investigated to address their importance in the design of ULP transceivers operating from harvested ambient energy. It is shown that a ULV supply has severe impacts on the performance of MOS transistors. New solutions such as forward body biasing, a new bias generation methodology for ULP and ULV design, and current reuse are introduced to mitigate these performance degradations.

Several low noise amplifiers (LNAs) operating in ULP/ULV mode are designed and fabricated to demonstrate the proposed concepts. The first is a resistive shunt-feedback LNA that uses multiple ULP/ULV design techniques and is fabricated in a 90-nm TSMC CMOS technology. This prototype achieves a high figure of merit while consuming less than 1-mW from a 0.5-V supply. The second is an inductorless wideband LNA that employs an active shunt-feedback technique based on current reuse for impedance matching. This LNA was implemented using several ULP/ULV design techniques in an IBM 0.13-µm CMOS technology and consumes only 400-µW while demonstrating a high figure of merit. The third fabricated LNA, also in the IBM 0.13-µm CMOS technology, uses multiple ULP and ULV techniques and achieves the best reported figure of merit, to the best of the author's knowledge, while operating from a 0.5-V supply. Finally, an ultra-wideband (UWB) LNA employing current reuse and noise cancellation is implemented; it achieves a high figure of merit while consuming 410-µW from a 0.4-V supply, which is attractive for systems operating from energy harvested from ambient sources.

As a low power solution at the system level, impulse radio ultra-wideband (IR-UWB) technology, which uses short pulses in the time domain and so allows the transceiver to be duty cycled, is considered for ULP wireless networks. In this work, a non-coherent IR-UWB receiver using chirp-FSK signalling, based on continuous time slicing and an injection-locking-based clock recovery technique, is implemented. The RF part of the receiver is fabricated in the IBM 0.13-µm CMOS technology. This receiver achieves a tuned voltage gain of dB with a noise figure (NF) of 7-dB and a static power consumption of only 1.75-mW from a supply of only 0.5-V. Overall, the proposed techniques, circuits and systems improve the power efficiency at both the circuit and system levels and are therefore suitable for ULP and ULV applications such as wireless networks operating from energy harvested from ambient sources.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
ABRÉGÉ
LIST OF TABLES
LIST OF FIGURES
LIST OF ACRONYMS

1 Introduction
    Motivation
    Overview of IR-UWB Technology
    Why IR-UWB Technology for Low Power Applications?
    Background of Ultra-Low Power and Ultra-Low Voltage Design
    Research Goal
    Contributions
    A Sub-mW Ultra-Wideband Low Noise Amplifier Design Technique
    A 0.4-mW Compact Inductorless Wideband Low Noise Amplifier
    Short Channel Effects Mitigation Employing Forward Body Biasing Technique to Realize a 0.5-V 250-µW GHz Current-reuse CMOS LNA
    A 0.4-V Ultra Low-Power UWB CMOS LNA Employing Noise Cancellation
    An Ultra Low Power, Low Voltage CMOS Squarer Circuit for Non-Coherent IR-UWB Receivers
    An Ultra-Low Power Injection Locked Clock Recovery Scheme for a Chirp-2FSK-UWB Receiver
    Thesis Outline

2 Review of State-of-The-Art Works
    Introduction
    IR-UWB Modulation Schemes
    ON-OFF Keying (OOK)
    Pulse Position Modulation (PPM)
    Binary Phase Shift Keying (BPSK)
    Transmitted Reference
    Synchronized-OOK (S-OOK)
    Chirp-Frequency Shift Keying (C-FSK)
    IR-UWB Receivers
    Coherent Receivers
    Non-Coherent Receivers
    Comparison of Non-Coherent UWB Receivers
    Synchronization in IR-UWB Receivers
    Power Consumption Breakdown of Low Data rate Receivers
    Low Noise Amplifier Design Techniques
    Wideband Low Noise Amplifier Design
    Sub-mW Wideband Low Noise Amplifiers
    Ultra-Low Voltage LNA Design
    Comparison of Wideband LNAs
    Squarer Designs
    Summary

3 Ultra-Low Voltage Design Challenges
    Introduction
    MOS Transistor Operating Points Review
    Transconductance Efficiency
    Output Conductance
    Intrinsic Voltage Gain
    Transit Frequency
    Noise Figure
    Linearity
    Summary

4 Ultra-Low Voltage and Ultra-Low Power Design Techniques
    Introduction
    Ultra-Low Power Design Techniques
    Ultra-Low Power and Ultra-Low Voltage Biasing Scheme
    Current-reuse
    Forward Body Biasing
    Threshold Voltage Reduction
    Short Channel Effect Mitigation
    Improvements in Ultra-Low Voltage Circuit Design
    Impact of Forward Body Biasing on the Linearity of the Transistor
    Wideband Matching and Bandwidth Enhancement Techniques
    Shunt-Feedback
    Bandwidth Extension With $g_m$-boosting
    Summary

5 A Sub-mW 0.5-V Resistive Shunt-Feedback Low Noise Amplifier
    Introduction
    Low Voltage Shunt-Feedback Low Noise Amplifier Design Limitations
    The Choice of the Feedback Resistor
    The Effect of the Output Resistance
    Low Power, Low Voltage Resistive Shunt Feedback Architecture
    Circuit Analysis of The Proposed ULP, ULV Shunt Feedback LNA
    Inductive Series Peaking in the Feedback Path
    Input Matching
    Voltage Gain
    Noise Figure
    Linearity
    Comparison between Different Bandwidth Extension Techniques
    Circuit Design
    Measurement Results
    Summary

6 An Inductor-less Ultra-Low Power Tunable Active Shunt-Feedback LNA
    Introduction
    Ultra-Low Power Design Techniques
    Active Shunt-Feedback
    Complementary Current-reuse Inductorless Common-Gate LNA
    Circuit Description
    Forward Body Biased Tunable Feedback
    Input Impedance
    Voltage Gain
    Noise Figure
    Nonlinearity
    Stability
    Measurement Results and Discussions
    Summary

7 A 0.5-V 250-µW Forward Body Bias Enhanced Complementary Current-reuse CMOS LNA
    Introduction
    LNA Circuit Description
    Input Impedance
    Tunable Feedback Coefficient
    Voltage Gain
    Inductive $g_m$-boosting Technique
    Noise Figure
    Ultra-Low Voltage Design Techniques
    Ultra-Low Power Design Techniques
    Measurement Results and Discussions
    Summary

8 Ultra-Low Power Injection Locked Clock Recovery Scheme for a Chirp FSK UWB Receiver
    Introduction
    Chirp-FSK IR-UWB Signalling
    Link Budget
    Chirp-FSK IR-UWB Receiver Design
    Low Noise Amplifier
    RF Amplifier
    RF Front-End Measurement Results
    Squarer
    Demodulation
    Injection Locking Based Synchronization Scheme
    Clock Recovery Architecture and Circuits
    Injection Locking
    Phase Shifter
    Summary

9 Summary and Future Work
    Research Summary
    Future Work

References

LIST OF TABLES

2-1 Comparison of state-of-the-art low data rate non-coherent IR-UWB receivers
Device Dimensions and Component Values
Ultra-low power and ultra-low voltage design techniques equations for a resistive shunt feedback LNA
Device Dimensions
Performance summary and comparison with state-of-the-art LNAs
Device Dimensions and Component Values
Performance summary and comparison with state-of-the-art LNAs
Comparison between shunt-feedback LNAs w/o explicit load
Device Dimensions and Component Values
Performance summary and comparison with state-of-the-art LNAs
Performance summary of the squarer and comparison with prior published works
Summary of the power consumption of the receiver and comparison with other works

LIST OF FIGURES

1-1 A typical wireless sensor network composed of sensing nodes and relay nodes
1-2 The block diagram of a generic wireless sensor node
1-3 The variation of the energy per bit, $E_b/N_0$, versus link spectral efficiency, $R/B$
1-4 The comparison of low data rate IR-UWB radios with narrow-band systems (a) duty-cycling in low data rate IR-UWB transceiver with OOK modulation (b) narrow-band radio with OOK modulation
1-5 The block diagram of a multi-port ULV energy scavenging wireless transceiver sensor node to minimize the conversion losses in the energy harvesting section
2-1 The modulation schemes used in impulse UWB technology (a) on-off keying modulation (OOK) (b) pulse position modulation (PPM) (c) binary phase shift keying (BPSK) (d) transmitted-reference (TR) (e) synchronized-OOK (S-OOK) (f) chirp frequency shift keying (C-FSK)
2-2 A typical coherent correlation based receiver
2-3 ADC based receiver (a) The simplified architecture presented in [64] with high speed ADCs (b) The simplified block diagram of the receiver presented in [66] with a bank of 1-bit ADCs
2-4 Analog correlation based receivers (a) The receiver presented in [61] (b) The analog correlation presented in [60]
2-5 Block diagram of non-coherent receivers (a) The receiver presented in [42] (b) The energy detection based receiver presented in [45] (c) The non-coherent receiver with asynchronous demodulator presented in [38]
2-6 The block diagram of a peak detection based receiver based on the work presented in [81]
2-7 The block diagram of a super regenerative receiver based on the work presented in [79]
2-8 The energy per bit versus maximum data rate of the receivers
2-9 The block diagram of a conventional synchronization scheme in non-coherent IR-UWB receivers with a crystal oscillator and DLL

2-10 The block diagram of an injection locking scheme for synchronization in high data rate IR-UWB receivers
2-11 The breakdown of power consumption in RF front-end of non-coherent receivers (a) a non-coherent receiver based on self-mixing and asynchronous demodulation [82] (b) a non-coherent receiver based on passive squaring and integration [42]
2-12 The circuit schematic of (a) distributed amplifier, (b) inductive degeneration LNA with LC network to increase the input matching bandwidth
2-13 The circuit schematic of a common-gate transistor as an input stage of an LNA to provide wideband input matching
2-14 Noise cancellation scheme highlighted in (a) block diagram (b) common-gate transistor and an auxiliary common-source stage (c) resistive shunt feedback LNA with an auxiliary common-source stage for noise cancellation
2-15 Common-gate LNAs in multiple feedback structures (a) capacitive cross coupling passive negative feedback (b) negative feedback (c) positive feedback (d) a combination of positive and negative feedback structures (e) a dual negative feedback architecture
2-16 The circuit schematic of a resistive shunt feedback LNA, a technique to provide wideband input matching
2-17 The circuit schematic of sub-mW LNAs in the literature (a) a common-gate LNA with T-match input matching and self-forward body biasing technique (b) active negative shunt-feedback LNA (c) hybrid common-gate/resistive shunt feedback LNA (only single-ended version of the LNA is shown)
2-18 The circuit schematic of ultra-low power LNAs in the literature (a) a complementary current reuse LNA (b) a differential common-gate LNA using capacitor cross-coupling and forward body biasing [124] (c) common-gate LNA with $g_m$-boosting amplifier stage (d) a common-gate LNA using forward body biasing [123]
2-19 The FOM of the state-of-the-art LNAs in the literature versus their power consumption
2-20 The FOM of the state-of-the-art LNAs in the literature versus their power consumption
3-1 The effective gate-source and drain-source saturation voltages with respect to inversion coefficient
3-2 The simulated $g_m/I_D$ characteristics for an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values

3-3 The variation of $g_{ds}$ of a MOS transistor in a 90-nm CMOS technology with respect to IC for 3 different $V_{DS}$ values
3-4 The simulated intrinsic voltage gain of an NMOS transistor in a 90-nm CMOS for three different $V_{DS}$ values
3-5 The simulated transit frequency of an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values
3-6 The simulated minimum NF of an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values
3-7 (a) The simulated third-order distortion of a MOSFET due to $g_m$ for three different $V_{DS}$ values (b) The third-order distortion due to $g_{ds}$ for three IC values
4-1 The simulation of the extended biasing metric for an NMOS transistor for two different $V_{DS}$ values
4-2 The application of complementary current-reuse in wideband LNAs (a) with second-order distortion cancellation [98] (b) $g_m$-boosting [139] and (c) with noise cancellation [27]
4-3 The variation of the required $V_{GS}$ versus inversion coefficient for two FBB values
4-4 An NMOS transistor cross section for two $V_{sub}$ values (a) with $V_{sub} = 0$ the depletion region around the drain controls the channel (DIBL effect) (b) with FBB the impact of the drain on the channel is reduced
4-5 The impact of FBB on the characteristics of an NMOS transistor. (a) Measured $g_m/g_{ds}$ improvement due to FBB for two drain-source voltages. (b) The $g_m/g_{ds}$ improvement at IC = 1 (middle of the moderate inversion region)
4-6 The $g_m$ of a (40-µm/120-nm) NMOS transistor for two $V_{DS}$ values versus IC. It is shown that by varying $V_{sub}$, the change in $g_m$ with respect to the IC is very small, since both parameters increase as $V_{TH}$ is decreased
4-7 Example of how FBB can recover the performance degradation caused by the supply voltage reduction for circuits with the same current consumption. (b) The $g_m/g_{ds}$ improvement of a transistor at IC = 1 (middle of the moderate inversion region). The transistor shows the same $g_m/g_{ds}$ for $V_{DD}$ = 0.3-V and $V_{DD}$ = 0.5-V
4-8 The simulation results for the extended biasing metric variation with FBB

4-9 The variation of (a) $g_{ds}$ and (b) its third-order distortion coefficient for a (40-µm/120-nm) NMOS transistor for two $V_{sub}$ values versus drain-source voltage at IC = 1. It is shown that by applying FBB, $g_{ds}$ improves while its third-order distortion coefficient is almost unchanged
4-10 Shunt-feedback in a (a) common-source amplifier and (b) common-gate stage
4-11 The schematic of the proposed inductive $g_m$-boosting technique for common-gate transistors and the elements included in the calculation of $G_{m,eff}$
4-12 The effective $G_m$ of a common-gate transistor with inductive $g_m$-boosting for multiple inductor values and without inductive $g_m$-boosting
5-1 Three possible topologies to implement a resistive shunt feedback LNA (a) with resistive load, (b) with active load, (c) with current reuse
5-2 The required $g_m$ versus $R_f$ to achieve $S_{11}$ of 18-dB, $S_{21}$ of 14-dB and NF of 6-dB
5-3 The required $g_m$ to achieve $S_{11}$ of 18-dB, $S_{21}$ of 14-dB and NF of 6-dB
5-4 The $g_m$ vs. $R_O$ plots for a resistive shunt feedback LNA with active load, for two different supply voltages along with the plot for the required $g_m$ vs. $R_O$
5-5 The $g_m$ vs. $R_O$ plots for a current reuse inverter type architecture for two different supply voltages along with the plot for the required $g_m$ vs. $R_O$
5-6 Resistive shunt feedback LNA (a) without bandwidth extension technique, (b) with inductive series peaking at the input, (c) with inductive series peaking in the feedback path
5-7 The effective transconductance of the circuits shown in Fig.
5-8 (a) Current reuse resistive shunt feedback LNA with inductive series peaking in the feedback loop, (b) its equivalent circuit model
5-9 The effect of inductive series peaking in the feedback loop on the dominant poles of the circuit
5-10 The voltage gain of the circuit in Fig. 5-8(a) for different values of L
5-11 Resistive shunt feedback LNAs with current reuse scheme (a) without any bandwidth extension, (b) with inductive series peaking at the input, (c) with inductive series peaking in the feedback loop
5-12 The $S_{21}$ simulation results for the circuits shown in Fig.
5-13 The $S_{11}$ simulation results for the circuits shown in Fig.
5-14 The NF simulation results for the circuits shown in Fig.

5-15 The complete schematic of the proposed ultra-low power, ultra-low voltage LNA with buffer for measurement purposes
5-16 The K-factor of the LNA
5-17 The die micrograph of the proposed LNA
5-18 Measurement and simulation results for the voltage gain and input matching
5-19 The reverse isolation and output matching of the LNA
5-20 The measurement and simulation results for the NF
5-21 The measured input-output characteristics of the LNA
5-22 Comparison of the state-of-the-art works in the literature with the designed LNA based on FOM I
6-1 The block diagram of the shunt-feedback architecture employed in this design
6-2 Inductorless common-gate complementary current-reuse architectures (a) NMOS-PMOS in conventional order (b) flipped NMOS-PMOS to enable current-reuse in the feedback transistor
6-3 The complete schematic of the ULP LNA including the output buffer for testing purposes
6-4 Variation of S11 and S21 of the LNA with substrate voltage of M3 ($V_{sub}$)
6-5 Input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{o1} + g_{o2}$ at 1-GHz
6-6 The variation of S21 and S11 of the LNA for two $C_1$ values
6-7 The noise mechanisms in the LNA. Only noise contribution of M1 is highlighted for simplicity
6-8 The NF of the LNA only due to the load resistances modelled in MATLAB
6-9 The impact of $C_1$ capacitor on (a) NF of the LNA (b) 3-dB bandwidth of the LNA and the NF value at 100-MHz
6-10 The k-factor of the LNA for multiple forward body biasing voltages of transistor M3 ($V_{sub}$)
6-11 The die micrograph of the LNA
6-12 The measurement setup of the chip-on-board LNA with GSG probes
6-13 The post-layout simulation and measured results for S21 and S11 of the LNA
6-14 The measured S12 and S22 of the LNA

6-15 The post-layout simulation and measurement results for NF of the proposed LNA
6-16 The measured and simulated input-output characteristics of the LNA
6-17 Comparison of the state-of-the-art works in the literature with the designed LNA based on (a) FOM I and (b) FOM II
7-1 Simplified block diagram of the LNA with the feedback mechanisms
7-2 The complete schematic of the proposed ULV and ULP LNA along with the buffer for measurement purposes
7-3 The small-signal schematic of the proposed LNA for input impedance, voltage gain and NF analysis
7-4 The real part of the input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{oi}$ at 1-GHz
7-5 Simplified schematic of LNAs with active shunt-feedback (a) without explicit load impedance (this work) (b) with explicit load impedance [149]
7-6 The S21 and S11 variations with the feedback coefficient tuned through the substrate voltage of M
7-7 The S21 and S11 variations with the inductive peaking caused by $L_3$ at the gate of M
7-8 The noise contribution of the transistors to the noise factor of the LNA at multiple operating frequencies
7-9 The schematic of self forward biasing scheme with the equivalent circuit
7-10 The die micrograph of the LNA
7-11 The post-layout simulation and measured results for S21 and S11 of the LNA for two supply voltages
7-12 The post-layout simulation and measurement results for NF of the proposed LNA for two supply voltages
7-13 The measurement results for input and output characteristics of the LNA
7-14 The FOM versus power consumption comparison of the proposed LNA with recent works from literature
8-1 The comparison of chirp-FSK modulation scheme shown in (a) with other common modulation formats used in IR-UWB transceivers (b) on-off keying modulation (OOK) (c) pulse position modulation (PPM) (d) synchronized-OOK (S-OOK)

8-2 The block diagram of the proposed chirp-FSK IR-UWB receiver with injection locked based clock recovery
8-3 The schematic of (a) conventional noise cancelling LNA (b) proposed ULV, ULP current-reuse noise cancelling LNA architecture
8-4 The circuit schematic of the proposed ultra low voltage, low power current-reuse LNA with $g_m$-boosting
8-5 The simulated voltage gain, S11, S12 and S22 of the proposed LNA
8-6 The simulated noise figure of the proposed LNA
8-7 The simulated noise figure of the proposed LNA
8-8 The schematic of the fabricated noise cancelling LNA
8-9 The die micrograph of the fabricated LNA
8-10 The measured S21 and post layout simulation of the proposed LNA
8-11 The measured S11 and post layout simulation of the proposed LNA
8-12 The schematic of the designed RF gain stage tuned for two different bands
8-13 The voltage gain simulation result of the designed RF gain stage tuned at 4.5-GHz frequency band with the frequency tunability of 400-MHz
8-14 The placement of measurement buffers after the RF front-end of the receiver
8-15 The die micrograph of the receiver
8-16 The S11 measurement and post-layout simulation results of the receiver
8-17 The S21 measurement and post-layout simulation results of the receiver
8-18 The S22 and S12 measurement results of the receiver
8-19 The NF measurement results of the RF front-end of the receiver
8-20 The structure utilized for cancelling odd order terms in the Taylor series of the drain current of two MOSFETs
8-21 The $g_m$ and its derivative with respect to the gate-source voltage. The NMOS width and length are W=40-µm and L=100-µm, respectively, Vds=250-mV
8-22 Basic gain stage, with $g_m$ boosting amplifier
8-23 The complete schematic of the proposed squarer (the bias circuit is not shown)
8-24 (a) The fourth-order derivative of a Gaussian-pulse used as an input pulse and (b) the output waveform of the squarer

8-25 The RMS gain of the squarer vs. the input signal amplitude
8-26 Sensitivity of the squarer gain to the biasing voltage of the input transistors for an input signal of 10-mV
8-27 Monte Carlo simulation results for the RMS gain
8-28 The block diagram of the continuous time slicing scheme
8-29 The block diagram of the proposed low power clock recovery scheme based on injection locking
8-30 The detailed block diagram of the pulse width adjustment block along with the input and output pulses
8-31 The time domain signals at multiple points in the receiver
8-32 The block diagram of the injection locking to the ring oscillator
8-33 The prediction of the phase noise of an injection locked VCO
8-34 The phase noise of a free running ring oscillator and injection locked ring oscillator at 1-MHz oscillation frequency
8-35 The AM noise to PM conversion gain for a free running ring oscillator and an injection locked one
8-36 (a) The detailed block diagram of the phase shifter block (b) The simulation results of the phase shifter

LIST OF ACRONYMS

ADC: Analog to Digital Converter
AM: Amplitude Modulation
BER: Bit Error Rate
bps: bits per second
BPSK: Binary Phase Shift Keying
CDR: Clock and Data Recovery
C-FSK: Chirp Frequency Shift Keying
CMOS: Complementary Metal-Oxide-Semiconductor
DAC: Digital to Analog Converter
DFF: D Flip-Flop
$E_b/N_0$: bit energy / noise power spectral density
DIBL: Drain Induced Barrier Lowering
DLL: Delay Locked Loop
EIRP: Equivalent Isotropically Radiated Power
FBB: Forward Body Biasing
FCC: Federal Communications Commission
FD-SOI: Fully Depleted Silicon on Insulator
FSK: Frequency Shift Keying
FoM: Figure of Merit
GPS: Global Positioning System
IC: Inversion Coefficient
IEEE: Institute of Electrical and Electronics Engineers
IIP3: Input-Referred Third Order Intercept Point
IL: Injection Locking
IoT: Internet of Things
IR-UWB: Impulse Radio Ultra Wideband
ISM: Industrial, Scientific and Medical
LNA: Low Noise Amplifier
LPF: Low Pass Filter
MOS: Metal-Oxide-Semiconductor
NF: Noise Figure
NRZ: Non-Return-to-Zero
OOK: On-Off Keying
PFD: Phase / Frequency Detector
PLL: Phase Locked Loop
PM: Phase Modulation
PPM: Pulse Position Modulation
RF: Radio Frequency
RMS: Root Mean Square
RZ: Return-to-Zero
SNR: Signal to Noise Ratio
S-OOK: Synchronized On-Off Keying
TR: Transmitted Reference
ULP: Ultra-Low Power
ULV: Ultra-Low Voltage
UWB: Ultra-Wideband
VCO: Voltage Controlled Oscillator
WLAN: Wireless Local Area Network
WSN: Wireless Sensor Network

CHAPTER 1
Introduction

1.1 Motivation

With the advent of the Internet of Things (IoT), the number of portable wireless devices has grown enormously in recent years. In fact, the number of devices connected to the Internet has already exceeded the world population and is expected to reach 50 billion by 2020 [1]. It is also predicted that, by the same year, short-range wireless radios will outnumber mobile devices [2]. Short-reach wireless technologies enable low power, low cost and low complexity connections among multiple devices and are helping to turn the IoT into a reality. Wireless sensor networks are a good example; they have become highly sought after in a myriad of applications, including health care, wearable electronics, environmental monitoring, industrial settings and agriculture [3-7].

A wireless sensor network is composed of multiple sensor nodes that are deployed remotely to gather sensing data from the environment and relay the information to a processing unit or to the end user. Fig. 1-1 shows a sensor network that includes sensing nodes along with relay nodes. Fig. 1-2 highlights the blocks included in a typical wireless sensor node. The sensor node consists of a sensing unit to gather data; signal conditioning circuitry to prepare the weak analog signal for transmission; an RF transceiver and an antenna to transmit and receive data; a processor to control the overall functionality; and a power unit, which is either a battery or energy scavenging circuitry.

Figure 1-1: A typical wireless sensor network composed of sensing nodes and relay nodes.

The data rate required by most of these applications is typically between 100-kb/s and 10-Mb/s, mostly due to the long time constants of the physical processes being sensed. Nevertheless, a wireless sensor node must be able to operate for a long time without battery replacement. At the same time, it must be low cost, small and lightweight. In many of the aforementioned applications, the space available for the node is so limited that it cannot be equipped with a large battery. One solution is to power the sensor node with energy scavenged from the environment; however, the energy harvested from sources such as solar power [8-10], wireless power [11-16] and thermal energy [17, 18] is usually very low. The available power is therefore limited in many applications, which imposes severe restrictions on the power consumption of a single wireless sensor node. As a result, highly energy efficient communication schemes along with ultra-low power RF front-end circuits are required to maximize battery life and to allow operation from energy harvested from the environment.

Impulse Radio Ultra Wideband (IR-UWB) is a promising wireless technology that is capable of achieving low data rate communications (<10-Mb/s) over medium distances with very low power consumption. IR-UWB communication is based on the transmission of data using wideband pulses that have a very short duration in time, which means most of the transceiver circuitry can stay in a low power state until a pulse is transmitted or received. At low data rates, the time between pulses increases, so the transceiver spends more time in the low power state and the average power consumption of the system decreases.

Ph.D. Thesis 2015 M. Parvizi
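The duty-cycling argument above can be illustrated with a small numerical sketch. It assumes a simple two-state model that is not taken from the thesis: the radio is fully on for a fixed window around each pulse (one pulse per bit) and otherwise sits in a sleep state. The function name `average_power` and all numerical values (10-ns active window, 2-mW active power, 1-µW sleep power) are hypothetical and chosen only for illustration.

```python
def average_power(data_rate_bps, active_window_s, active_power_w, sleep_power_w):
    """Average power of a radio that is on for one fixed window per bit.

    Assumed two-state model (illustrative, not from the thesis): one pulse
    per bit, full power during the window, sleep power the rest of the time.
    """
    # Fraction of each bit period spent awake, capped at 100 %.
    duty_cycle = min(1.0, data_rate_bps * active_window_s)
    return duty_cycle * active_power_w + (1.0 - duty_cycle) * sleep_power_w


if __name__ == "__main__":
    # Hypothetical values: 10-ns active window, 2-mW active, 1-uW sleep.
    for rate in (10e6, 1e6, 100e3):
        p = average_power(rate, 10e-9, 2e-3, 1e-6)
        print(f"{rate / 1e3:8.0f} kb/s -> {p * 1e6:8.2f} uW average")
```

Under these assumptions the average power falls almost in proportion to the data rate until the sleep power starts to dominate, which is why the power drawn in the low power state becomes important at low data rates.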

Figure 1-2: The block diagram of a generic wireless sensor node.

In addition, non-coherent energy detection schemes can be employed in the receiver, allowing for low power and low cost implementations. Combined, these properties make this communication scheme an ideal choice for applications that have low available power budgets and, at the same time, require only small amounts of information to be transferred, such as wireless sensor networks or radio frequency identification tags.

While power can be saved by using a low power communication scheme along with non-coherent detection and duty-cycling, the design of ultra-low power and ultra-low voltage circuitry is also an important factor in enabling a sensor node to operate from energy scavenged from the environment. In general, the maximum allowed supply voltage of CMOS circuitry is reduced as the feature size in standard CMOS technologies shrinks. However, this trend has slowed in nanometer gate-length devices, mainly due to the fundamental limit on the sub-threshold slope, which is 60-mV/decade [19]. In spite of this limitation, for low power devices, the supply voltage is expected to decrease to V [20, 21].
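The 60-mV/decade figure quoted above follows from the thermal voltage $kT/q$. As a quick check, the sketch below evaluates $S = n \ln(10)\, kT/q$, assuming an ideal slope factor $n = 1$ and a temperature of 300 K. The function name and these default values are illustrative assumptions rather than parameters taken from the thesis.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge (exact SI value)


def subthreshold_slope_mv_per_decade(temperature_k=300.0, slope_factor=1.0):
    """Gate voltage swing (in mV) needed to change the drain current tenfold."""
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return slope_factor * math.log(10.0) * thermal_voltage * 1e3


if __name__ == "__main__":
    ideal = subthreshold_slope_mv_per_decade()
    # n = 1.3 is an arbitrary example of a non-ideal slope factor.
    real = subthreshold_slope_mv_per_decade(slope_factor=1.3)
    print(f"ideal limit at 300 K: {ideal:.1f} mV/decade")
    print(f"with n = 1.3:         {real:.1f} mV/decade")
```

Since the slope is set by temperature rather than by geometry, shrinking the transistor does not make it switch off more steeply, which is consistent with the slowdown in supply voltage scaling described above.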

In general, this supply voltage reduction is desirable in low power transceivers because it reduces the overall power consumption ($P_{diss} = V_{dd} I_{diss}$); it is even more valuable in systems powered by energy harvesting, where it minimizes conversion losses. Nevertheless, reducing the supply voltage of CMOS circuitry has dramatic impacts on transistor performance metrics such as intrinsic gain ($g_m/g_{ds}$), transit frequency ($f_t$) and minimum noise figure [22], and it also restricts the usable circuit topologies. Given the aforementioned points, the design of ultra-low power and ultra-low voltage RF circuitry and low power ultra-wideband systems for wireless sensor network applications is the main focus of this thesis.

1.2 Overview of IR-UWB Technology

Although impulse radio ultra-wideband (IR-UWB) technology has gained a lot of interest recently, it was first used by Marconi in the late nineteenth century for data transmission. The UWB concept then became highly interesting in the 1960s for radar systems, mainly for military applications. Finally, in 2002, the Federal Communications Commission (FCC) made an ultra-wide band of spectrum available for commercial applications [23]. IR-UWB radios are allowed to transmit low power ultra-short data pulses in this band. Because IR-UWB radios coexist in this band with narrow-band systems, like the global positioning system (GPS) and wireless local area networks (WLANs), the FCC has regulated a maximum transmission power that allows UWB radios to overlay the existing systems. The equivalent isotropically radiated power (EIRP) limit in this band is -41.3 dBm/MHz. Based on the standard, UWB systems should have either a 10-dB bandwidth of at least 500-MHz or a fractional bandwidth of at least 20%. Fractional bandwidth is defined as $B/f_c$, where $B = f_H - f_L$ denotes the 10-dB bandwidth and $f_c = (f_H + f_L)/2$ is the center frequency; $f_H$ and $f_L$ are the upper and lower frequencies of the 10-dB bandwidth [24].
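The bandwidth criterion above is simple arithmetic, and a short sketch may make it easier to apply. The function names are hypothetical, and the example band edges are illustrative values chosen here rather than figures from this work.

```python
def fractional_bandwidth(f_low_hz, f_high_hz):
    """Fractional bandwidth B / f_c, with B = f_H - f_L and f_c = (f_H + f_L) / 2."""
    bandwidth = f_high_hz - f_low_hz
    f_center = (f_high_hz + f_low_hz) / 2.0
    return bandwidth / f_center


def is_uwb(f_low_hz, f_high_hz, min_bandwidth_hz=500e6, min_fractional=0.20):
    """True if the 10-dB band satisfies either UWB criterion stated in the text."""
    bandwidth = f_high_hz - f_low_hz
    return (bandwidth >= min_bandwidth_hz
            or fractional_bandwidth(f_low_hz, f_high_hz) >= min_fractional)


if __name__ == "__main__":
    # Illustrative 10-dB band edges in Hz (assumed values).
    for f_l, f_h in [(3.5e9, 4.0e9), (0.8e9, 1.0e9), (2.40e9, 2.48e9)]:
        print(f"{f_l / 1e9:.2f}-{f_h / 1e9:.2f} GHz: "
              f"B/fc = {fractional_bandwidth(f_l, f_h):.3f}, UWB = {is_uwb(f_l, f_h)}")
```

The first example qualifies through its absolute bandwidth, the second through its fractional bandwidth, and the third meets neither criterion.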

1.3 Why IR-UWB Technology for Low Power Applications?

IR-UWB technology is highly energy efficient for short reach, low data rate applications. In this section, we briefly discuss two main advantages of UWB systems over low power narrow-band radios. The first advantage is the high energy efficiency of the link, which can be best seen from Shannon's channel capacity equation, given by [25]

$$C = B \log_2(1 + \mathrm{SNR}), \tag{1.1}$$

where $C$ is the channel capacity in bits per second, $B$ is the bandwidth of the system, and SNR is the signal to noise ratio. The first takeaway from this equation is that the capacity increases linearly with bandwidth, but only logarithmically with SNR. Therefore, compared to a narrow-band link, a UWB system can reach a higher capacity with a lower SNR. The second takeaway, however, is more important for low power, low data rate communication links. If the data rate, represented by $R$, does not exceed the maximum capacity ($C$) of the channel, then in the limiting case equation (1.1) can be written as

$$\frac{R}{B} = \log_2\left(1 + \frac{E_b R}{N_0 B}\right), \tag{1.2}$$

where $E_b$ is the average energy per bit and $N_0$ is the noise per unit bandwidth; consequently, $N_0 B$ is the noise power and $E_b R$ is the signal power. Also, $E_b/N_0$ is the normalized SNR and $R/B$ is the link spectral efficiency. From Eq. (1.2), $E_b/N_0$ can be found as

$$\frac{E_b}{N_0} = \frac{2^{R/B} - 1}{R/B}. \tag{1.3}$$

Fig. 1-3 plots $E_b/N_0$ as a function of the link spectral efficiency, $R/B$. As can be seen in the figure, the required $E_b/N_0$ approaches its minimum (which is $\ln 2$, about -1.6 dB, as $R/B \to 0$) when $R/B < 0.1$. Therefore, it can be concluded that the underutilization of available bandwidth in a communication channel is highly energy efficient [26]. As a consequence, low data rate IR-UWB links which highly

Figure 1-3: The variation of the energy per bit, $E_b/N_0$, versus link spectral efficiency, $R/B$.

underutilize the channel ($R \ll B$, e.g. 1-Mb/s $\ll$ 2-GHz) require very low $E_b/N_0$ to detect the signal. The other important advantage of IR-UWB radios stems from the ability to duty-cycle the transceivers. IR-UWB systems transmit data using pulses that are narrow in the time domain, which translates into a wide bandwidth in the frequency domain. In a low data rate transceiver, the time between pulses is long (for example, at a 1-Mb/s data rate the time between pulses is greater than 980-ns). This important characteristic of the transceiver can be exploited to reduce the power consumption, as shown in Fig. 1-4. As can be seen in the figure, unlike low power, low data rate narrow-band links, which have to be ON all the time, the transceiver is inactive for a long period of time. In fact, theoretically, the duty cycle can be as low as 3%, which can reduce the power consumption by more than 30 times.

1.4 Background of Ultra-Low Power and Ultra-Low Voltage Design

Radios operating from energy scavenged from the environment have strict restrictions on the maximum power consumption and maximum supply voltage. For example, Fig. 1-5 highlights a wireless sensor node operating with the energy harvested from a wireless

Figure 1-4: The comparison of low data rate IR-UWB radios with narrow-band systems: (a) duty-cycling in a low data rate IR-UWB transceiver with OOK modulation; (b) a narrow-band (carrier-based) radio with OOK modulation.

transmission. It is composed of a two-port antenna [15] to facilitate simultaneous energy harvesting and data transmission and reception. Moreover, the goal in the energy harvesting section is to reduce the conversion losses, which has led to some changes in the system. The first change involves the elimination of the charge pump circuit that is often used to boost the rectified voltage to the levels needed by the circuitry. This is possible because the low supply voltage allows the output of the rectifier to be directly coupled to a voltage regulator. Furthermore, the number of diodes in the rectifier section is reduced, since a high voltage multiplication is not required in the RF-to-DC rectifier section. This leads to lower supply voltages for the CMOS circuitry. However, reducing the supply voltage of CMOS circuitry has dramatic impacts on transistor performance metrics such as intrinsic gain ($g_m/g_{ds}$), transit frequency ($f_t$) and minimum noise figure [22], which will be discussed in detail in Chapter 3. Consequently, circuits operating from very low supply voltages have become very important and are under active research [22, 27–33]. Additionally, ultra-low power design limits the achievable bandwidth, gain, noise figure and linearity. In addition, the circuit design options are much more limited due to the requirements of ultra-low power (ULP) designs.

Figure 1-5: The block diagram of a multi-port ULV energy scavenging wireless transceiver sensor node, designed to minimize the conversion losses in the energy harvesting section. Port #1 of the antenna is matched to the LNA and Port #2 is matched to the RF-DC rectifier.

These limitations motivate using a combination of circuit design techniques to improve the characteristics of the transistors under ULV conditions without additional power consumption in order to realize ULP, ULV, and wideband solutions.

1.5 Research Goal

The goal of this research work is to design and implement ultra-low power and ultra-low voltage techniques, circuits and systems for a low power, low data rate IR-UWB receiver tailored for wireless sensor networks. The emphasis is on the development of ultra-low power and ultra-low voltage LNAs and squarers for a non-coherent IR-UWB receiver. Also, a chirp-FSK IR-UWB receiver is implemented and a clock recovery technique is introduced to reduce the complexity and power consumption of synchronization in chirp-FSK IR-UWB receivers. The target in the design of wideband LNAs is to achieve ultra-low power consumption (<1-mW) with ultra-low supply voltages. While achieving this goal, ultra-low voltage design challenges in CMOS circuitry are investigated and discussed, and new low power techniques to overcome the shortcomings due to supply voltage reduction are introduced. Also, for a squarer circuit, the main target is to achieve a low power design with high conversion gain. Furthermore, the new circuits and techniques for ultra-low power and ultra-low voltage design are combined, and a non-coherent IR-UWB receiver with a low power and low complexity synchronization scheme is realized.

1.6 Contributions

The main contributions of this work are summarized in the following subsections.

A Sub-mW Ultra-Wideband Low Noise Amplifier Design Technique

The low noise amplifier (LNA) is the first active component in the front-end of the receiver, and is generally considered one of the most power hungry blocks. The high power consumption stems from the fact that an LNA must provide simultaneous wideband matching, high gain, low noise and high linearity, all of which typically require high power and high supply voltages. These combined specifications have made the design of low power and low voltage UWB LNAs a challenging research topic. This work presents a design methodology for an ultra-low power (ULP) and ultra-low voltage (ULV) ultra-wideband (UWB) resistive shunt-feedback LNA. Additionally, ultra-low voltage circuit design challenges are discussed and a new biasing metric for ULV and ULP designs in deep sub-micron CMOS technologies is introduced. Then, series inductive peaking in the feedback loop is analysed and employed to enhance the bandwidth and noise performance of the LNA. Furthermore, exploiting the new biasing metric, the design methodology, and series inductive peaking in the feedback loop, a 0.5-V, 0.75-mW broadband LNA with a current-reuse scheme is implemented in a 90-nm CMOS technology. The contributions associated with this work are highlighted below [22, 34]. The author's contribution was to generate the main idea of the work and also to design, analyse, implement, fabricate and measure the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation and layout techniques.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "A sub-mW, ultra-low-voltage, wideband low-noise amplifier design technique," IEEE Trans. Very Large Scale Integ. Syst., 23(6), June.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "Ultra-Low Power RF Systems and Building Blocks," in Wireless Transceiver Circuits: System Perspectives and Circuit Aspects, W. Rhee and K. Iniewski, Eds.: CRC Press.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "An ultra-low power, low voltage, wideband CMOS LNA with partial noise and distortion cancellation," in IEEE International Solid-State Circuits Conference (ISSCC) Student Research Preview (SRP). This paper won the best presentation award of the ISSCC SRP.

A 0.4-mW Compact Inductorless Wideband Low Noise Amplifier

For receivers used in low data rate applications such as wireless sensor networks, a higher NF (and reduced sensitivity) is tolerated in order to achieve low power consumption. Hence, the main objective in the design of a low noise amplifier, as the first active block in the RF front-end of a receiver, is to achieve low power consumption and low area occupation while delivering reasonable gain, NF and linearity. This work presents and analyses the design of a 1-V ultra-low power, compact, and wideband low noise amplifier (LNA). The proposed LNA uses common-gate NMOS and PMOS transistors as input devices in a complementary current-reuse structure. Low power input matching is achieved by employing an active shunt-feedback architecture, while the current of the feedback stage is also reused by the input transistor to improve the current efficiency of the LNA. Also, a forward body biasing scheme is exploited to tune the feedback coefficient. The complementary characteristics of the input stage lead to partial second-order distortion cancellation. The proposed inductorless LNA is implemented in an IBM 0.13-µm 1P8M CMOS technology and occupies only mm$^2$. The contributions associated with this work are highlighted below [35]. The author's contribution was to generate the main idea of the work and also to design, analyse, implement, fabricate and measure the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation and measurement.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "An ultra-low power wide-band inductorless CMOS LNA with tunable active shunt-feedback," revised and resubmitted for publication in IEEE Transactions on Microwave Theory and Techniques, pp. 1–10, June.

Short Channel Effects Mitigation Employing Forward Body Biasing Technique to Realize a 0.5-V 250-µW GHz Current-reuse CMOS LNA

The restrictions imposed by ultra-low power consumption and ultra-low voltage design must be mitigated through novel circuit design techniques, without additional power consumption, to enable the realization of high performance ULP and ULV RF circuitry. This work examines the use of a forward body biasing scheme to mitigate short channel effects in ultra-low voltage circuits with no additional power consumption. It is shown that forward body biasing boosts the output resistance of a transistor such that the intrinsic gain reduction due to low supply voltages can be compensated. This technique, combined with multiple low power and low voltage techniques, is then used to implement a low noise amplifier tailored for ultra-low power and ultra-low voltage applications. The LNA is realized in an IBM 0.13-µm 1P8M CMOS technology, and it occupies an area of 0.39-mm$^2$. The contributions associated with this work are shown below [36]. The author's contribution was to generate the main idea of the work and also to design, analyse, implement, fabricate and measure the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation and measurement.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "Short channel effects mitigation employing forward body biasing technique to realize a 0.5-V 250-µW GHz current-reuse CMOS LNA," submitted for publication as a journal paper in IEEE Journal of Solid-State Circuits, pp. 1–12, June.

A 0.4-V Ultra Low-Power UWB CMOS LNA Employing Noise Cancellation

An ultra-low power (410-µW) and ultra-low voltage (0.4-V) LNA is realized for ultra-wideband applications using a thermal noise cancellation scheme. A current-reuse technique is used to lower the power consumption, and an inductive $g_m$-boosting technique is exploited to increase the gain and improve input matching at high frequencies. The circuit is implemented in a 90-nm TSMC CMOS technology.

The contributions associated with this work are shown below [27, 34]. The author's contribution was to generate the main idea of the work and also to design, analyse, and implement the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation and layout techniques. Also, the contribution of Dr. F. Nabki was advice on the analysis and presentation of the work.

M. Parvizi, K. Allidina, F. Nabki, and M. N. El-Gamal, "A 0.4V ultra low-power UWB CMOS LNA employing noise cancellation," in IEEE Int. Symp. on Circuits and Syst., May.

An Ultra Low Power, Low Voltage CMOS Squarer Circuit for Non-Coherent IR-UWB Receivers

The design of a low power, low voltage and high gain CMOS squarer circuit for use in non-coherent impulse radio ultra-wideband (IR-UWB) receivers is the focus of this section. The proposed squaring function is implemented based on the intrinsic CMOS transistor characteristics in the sub-threshold region, where the second-order derivative of the drain current is maximized. Additionally, a capacitor cross-coupling $g_m$-boosting technique is exploited to increase the conversion gain of the squarer. Utilizing the aforementioned schemes, a low power and high gain squarer is realized in a TSMC 90-nm CMOS technology. The contributions associated with this work are highlighted below [28]. The author's contribution was to generate the main idea of the work and also to design, analyse and implement the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "An ultra low power, low voltage CMOS squarer circuit for non-coherent IR-UWB receivers," in IEEE Int. Symp. on Circuits and Syst., May.

An Ultra-Low Power Injection Locked Clock Recovery Scheme for a Chirp-2FSK-UWB Receiver

Low power IR-UWB receivers are mostly based on non-coherent detection to reduce the complexity and power consumption of the RF front-end; however, they usually suffer from complex clock and data recovery and time synchronization schemes. In this work, a

new synchronization scheme based on injection locking is presented for a chirp-2FSK-UWB receiver to reduce the complexity and the power consumption of the receiver. The contributions associated with this work are shown below [37]. The author's contribution was to generate the main idea of the work and also to design, analyse, and implement the circuits. The contribution of Dr. K. Allidina was advice on circuit simulation.

M. Parvizi, K. Allidina, and M. N. El-Gamal, "An ultra-low power injection locked clock recovery scheme for a chirp-2FSK-UWB receiver," to be submitted for publication as a journal paper in IEEE Transactions on Circuits and Systems II, pp. 1–5, Oct.

1.7 Thesis Outline

This thesis is organized as follows. Chapter 2 presents an overview of IR-UWB signals and modulation techniques, followed by a literature review of coherent and non-coherent IR-UWB receivers. Then, state-of-the-art receiver circuitry, including the low noise amplifier (LNA) and the squarer, is reviewed. Chapter 3 presents ultra-low voltage design challenges and the impact of a low drain-source voltage on a single MOS transistor. This review is necessary because one focus of this thesis is the development of ultra-low voltage design techniques for wideband circuits. Chapter 4 presents the design techniques used to overcome ultra-low power and ultra-low voltage challenges. The emphasis is on both transistor characteristic improvements and circuit-level techniques. Chapter 5 introduces a sub-mW resistive shunt-feedback LNA which, by employing multiple design techniques, achieves one of the best figures of merit (FoM) in the literature. Chapter 6 highlights a compact and inductorless LNA that uses complementary current-reuse techniques along with active shunt-feedback input impedance matching to achieve an ultra-low power design. Next, Chapter 7 presents an ultra-low power and ultra-low voltage LNA that uses forward body biasing along with multiple novel design techniques to realize a wideband LNA solution. Chapter 8 introduces the chirp-FSK-UWB receiver implemented in a 0.13-µm CMOS technology and the new injection locking based clock recovery scheme for low power and fast synchronization.

CHAPTER 2

Review of State-of-The-Art Works

2.1 Introduction

This chapter provides a review of state-of-the-art works in the literature on IR-UWB receivers, in addition to a review of the main RF circuitry used in these receivers, including low noise amplifiers and squarer circuits. First, different modulation schemes used in IR-UWB transceivers and their characteristics will be discussed, with a focus on low power, low complexity implementation. Then, IR-UWB receivers with both coherent and non-coherent detection schemes will be presented, followed by a review of synchronization techniques in these types of receivers. Next, a breakdown of power consumption in the published receivers will be discussed. After that, various wideband LNA design techniques in the literature will be presented. The LNA design will be studied in more detail by reviewing the designs in the literature with low power (<1-mW) and low supply voltages (<1-V). The design of energy detectors (squarer circuits), one of the main building blocks of non-coherent receivers, will be studied next. Finally, a summary of the designs will be provided.

2.2 IR-UWB Modulation Schemes

Modulation schemes play an important role in the energy efficiency and simplicity of the transceiver design in IR-UWB technology. There are multiple modulation techniques which are compatible with IR-UWB technology. The most widely used ones include on-off keying (OOK) [38–44], pulse position modulation (PPM) [42–48], binary phase shift keying (BPSK) [47, 49, 50], transmitted reference (TR) [51–53], synchronized-OOK (S-OOK) [54], and chirp UWB frequency shift keying (C-FSK) [55, 56]. We will briefly discuss the advantages and disadvantages of each modulation scheme.

On-Off Keying (OOK)

On-off keying (OOK) is one of the popular modulation schemes for non-coherent reception. In this scheme, a 1 is represented by the presence of a pulse and a 0 is represented by the absence of a pulse, as shown in Fig. 2-1(a). It is represented mathematically by

$$s_i(t) = \sigma_i\, p(t), \quad \sigma_i \in \{0, 1\}, \tag{2.1}$$

where $p(t)$ is the UWB pulse in the time domain. As can be seen in (2.1), bit 1 is created when $\sigma_i = 1$ and bit 0 is created when $\sigma_i = 0$. Hence, in OOK modulation a pulse is only sent when bit 1 is transmitted. This allows the energy per pulse to be doubled compared to other modulation schemes that use a pulse for both bit 1 and bit 0 (e.g. BPSK, PPM), while still meeting the FCC regulations. On the other hand, OOK modulation is spectrally inefficient and is susceptible to interferers [40] and multipath signals [57]. For example, the use of OOK modulation with non-coherent detection makes the detection of bit 0 difficult in the presence of interferers. Moreover, long runs of consecutive 0s might lead to synchronization loss at the receiver.

Pulse Position Modulation (PPM)

In pulse position modulation (PPM) the position of the pulse within a time frame determines the transmitted bit, as shown in Fig. 2-1(b). PPM modulation is represented mathematically by

$$s_i(t) = p(t - \tau_i), \quad \tau_i \in \{0, \tau_0\}, \tag{2.2}$$

where $\tau_i$ is the delay for each pulse and $\tau_0$ is the delay for bit 0. PPM modulation has the advantage of simple generation, since only the delay of the pulse needs to be controlled. The other advantage of the PPM modulation scheme is its differential signalling, which leads to reduced noise and distortion at reception. For example, the signals received in both frames are compared with each other to determine the received bit. However, this comes at the cost of reduced power efficiency, since the transceiver has to operate at least two times longer compared to OOK modulation.
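Equations (2.1) and (2.2), together with the non-coherent detection they are usually paired with, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the Gaussian pulse shape, its 0.5-ns width, the 5-ns PPM delay, the 10-ns frame and the function names are all hypothetical choices made here, and the channel is ideal and noiseless.

```python
import math


def pulse(t, sigma=0.5e-9):
    """Placeholder UWB pulse p(t): a Gaussian centred at t = 0 (assumed shape)."""
    return math.exp(-0.5 * (t / sigma) ** 2)


def ook_symbol(bit, t):
    """Eq. (2.1): s_i(t) = sigma_i * p(t), with sigma_i in {0, 1}."""
    return bit * pulse(t)


def ppm_symbol(bit, t, tau0=5e-9):
    """Eq. (2.2): s_i(t) = p(t - tau_i); following the text, tau0 is the delay for bit 0."""
    tau = 0.0 if bit else tau0
    return pulse(t - tau)


def sample_frame(symbol_fn, bit, n=200, t_start=-2.5e-9, dt=0.05e-9):
    """Sample one 10-ns frame of the chosen modulation."""
    return [symbol_fn(bit, t_start + k * dt) for k in range(n)]


def energy(samples):
    return sum(x * x for x in samples)


def detect_ook(samples, threshold):
    """Energy detection: decide 1 if the frame energy exceeds a threshold."""
    return 1 if energy(samples) > threshold else 0


def detect_ppm(samples):
    """Compare the energy in the early and late halves of the frame."""
    half = len(samples) // 2
    return 1 if energy(samples[:half]) > energy(samples[half:]) else 0


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0]
    threshold = 0.5 * energy(sample_frame(ook_symbol, 1))
    print([detect_ook(sample_frame(ook_symbol, b), threshold) for b in bits])
    print([detect_ppm(sample_frame(ppm_symbol, b)) for b in bits])
```

The sketch also shows the trade-off described above: the OOK detector needs an absolute threshold, whereas the PPM detector only makes a relative comparison but must observe both pulse positions in every frame.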

Binary Phase Shift Keying (BPSK)

Binary phase shift keying employs a polarity change in the pulse to create bits 0 and 1, as shown in Fig. 2-1(c). It is best represented mathematically by

$$s_i(t) = \sigma_i\, p(t), \quad \sigma_i \in \{-1, 1\}. \tag{2.3}$$

As can be seen in the figure and in (2.3), bit 0 is constructed by a 180-degree phase shift of the pulse which represents 1. There is an inherent 3-dB SNR advantage in BPSK modulation over systems operating with OOK and PPM, since BPSK is an antipodal modulation whereas PPM is an orthogonal modulation technique [58]. Moreover, like an OOK transceiver, a BPSK transceiver is active only for one bit period, which, as discussed previously, improves the efficiency of the transceiver. However, since the information is embedded in the phase of the pulses, non-coherent detection cannot be used, which leads to more complicated and power hungry detectors.

Transmitted Reference

Transmitted-reference transceivers transmit data by sending, in each bit, a reference pulse with a known polarity followed by a data pulse whose polarity depends on the transmitted bit (similar to BPSK). The reference pulse is then compared with the data pulse at the receive side. If the data pulse is similar to the reference pulse, the transmitted bit is 1. If the received pulse is antipodal to the reference pulse, it is interpreted as 0. Fig. 2-1(d) highlights this signalling scheme. Transmitted-reference modulation can be represented mathematically by

$$s_i(t) = p(t) + \sigma_i\, p(t), \quad \sigma_i \in \{-1, 1\}, \tag{2.4}$$

where the delay between the reference pulse and the data pulse is left implicit. This scheme is used in UWB systems to create a reference pulse which goes through the same channel as the data pulse and can be used to demodulate the pulse [53]. Moreover, it overcomes the uncertainty in the detection of the pulses introduced by the large number of multipath signals at the receiver. However, the main disadvantage of this signalling scheme

is that the transceiver has to be active for a longer period of time compared to reference-less signalling schemes like OOK and PPM. This in turn leads to lower overall efficiency in the transceiver.

Synchronized-OOK (S-OOK)

Synchronized-OOK modulation [54] is a specific type of transmitted-reference signalling. It is defined with two pulses for each bit: a synchronization pulse that always exists in each bit, and a data pulse which is OOK modulated. This modulation scheme is shown in Fig. 2-1(e). S-OOK modulation can be represented mathematically by

$$s_i(t) = p(t) + \sigma_i\, p(t), \quad \sigma_i \in \{0, 1\}, \tag{2.5}$$

where the delay between the synchronization pulse and the data pulse is left implicit. In practice, the number of pulses in each bit increases to 1.5 pulses/bit on average, which reduces the maximum allowed energy per pulse. The main advantage of S-OOK is that demodulation and synchronization only require a relative comparison between the synchronization pulse and the data pulse. The S-OOK receiver demodulates the transmitted pulses in a manner similar to transmitted-reference detection, but with simpler hardware. However, this leads to a lower efficiency at the transceiver, since the time that the transceiver has to be ON is increased.

Chirp-Frequency Shift Keying (C-FSK)

Chirp UWB frequency shift keying (C-FSK) uses ultra-wideband pulses with different center frequencies for data transmission. This modulation scheme is highlighted in Fig. 2-1(f). The chirp-FSK UWB signal can be represented in the time domain by [56]

$$s(t) = A_c \sin[(\omega_0 + \beta t)\, t], \quad 0 < t < t_p, \tag{2.6}$$

where $\omega_0$ is the initial chirp frequency, $\beta$ is the rate of the chirp, and $t_p$ is the duration of the chirp pulse. Chirp modulation can trade pulse amplitude for pulse width to achieve an evenly spread spectrum over a 500-MHz bandwidth. Hence, this modulation scheme is voltage scalable with CMOS technology. The main advantage of C-FSK UWB modulation

Figure 2-1: The modulation schemes used in impulse UWB technology: (a) on-off keying (OOK); (b) pulse position modulation (PPM); (c) binary phase shift keying (BPSK); (d) transmitted reference (TR); (e) synchronized-OOK (S-OOK); (f) chirp frequency shift keying (C-FSK).

scheme over other modulation schemes is that, unlike OOK modulation, it takes advantage of differential signalling to increase the noise, multipath and interference immunity. Moreover,

unlike PPM, transmitted reference and S-OOK modulation schemes, it does not require a long active time to transmit and receive pulses.

2.3 IR-UWB Receivers

In general, IR-UWB receivers are divided into two main categories: coherent and non-coherent receivers. The operation principle of coherent receivers is based on the comparison of a locally generated template pulse with the incoming signal. Coherent detection increases the immunity of the receiver to noise, multipath signals and interference. However, it requires precise timing in the receiver, which usually leads to complicated and high power synchronization schemes. Moreover, this correlation is mostly implemented in the digital domain, which requires wideband analog-to-digital converters (ADCs) that are very power hungry. On the other hand, in non-coherent receivers there is no template pulse at the receiver, and either the energy or the peak of the received pulse is used to determine the transmitted bit. This scheme is suitable for low data rate applications, and its immunity to noise and interference is lower compared to coherent receivers. In this section, we will discuss the characteristics of coherent and non-coherent receivers in detail and will review the state-of-the-art works in each type of reception.

Coherent Receivers

As mentioned above, coherent receivers correlate a locally generated template pulse with the received pulses to demodulate the transmitted bit [59–62]. If the correlation of the pulses is greater than a specific threshold, then a pulse is considered to have been received. Fig. 2-2 shows the principle of operation of a correlation based coherent receiver. As can be seen in the figure, a template pulse generator is required in the receiver. This is needed since the transmitted pulse is distorted by the wireless channel characteristics [63]. Hence, the template generation block has to determine the channel characteristics and apply them to the template pulse to achieve a high performance transceiver. In the following subsections, we will briefly review the coherent receiver schemes and the state-of-the-art designs in each category.
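The correlate-and-threshold decision described above can be sketched as follows. This is only an illustration under simplifying assumptions: the template is a sampled Gaussian monocycle chosen here for convenience, the channel is ideal (so the received pulse matches the template exactly and no channel estimation is needed), the receiver is perfectly synchronized, and the noise level and threshold are hypothetical.

```python
import math
import random


def template_pulse(n=64, sigma=6.0):
    """Assumed template: a sampled Gaussian monocycle (derivative of a Gaussian)."""
    centre = (n - 1) / 2.0
    return [-((k - centre) / sigma) * math.exp(-0.5 * ((k - centre) / sigma) ** 2)
            for k in range(n)]


def correlate(received, template):
    """Zero-lag correlation of the received window with the local template."""
    return sum(r * t for r, t in zip(received, template))


def pulse_detected(received, template, threshold):
    """Declare a pulse when the correlation exceeds the threshold."""
    return correlate(received, template) > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    tmpl = template_pulse()
    # Hypothetical threshold: half of the template energy.
    threshold = 0.5 * correlate(tmpl, tmpl)
    noise_only = [rng.gauss(0.0, 0.1) for _ in tmpl]
    pulse_plus_noise = [t + rng.gauss(0.0, 0.1) for t in tmpl]
    print("noise only  ->", pulse_detected(noise_only, tmpl, threshold))
    print("pulse+noise ->", pulse_detected(pulse_plus_noise, tmpl, threshold))
```

In a real receiver the template would have to be reshaped to match the channel response and aligned in time with the incoming pulse, which is where the synchronization and template-generation cost discussed above arises.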

Figure 2-2: A typical coherent correlation based receiver.

ADC Based Receivers

The first category of IR-UWB coherent receivers is ADC based. The main advantage of ADC based receivers is the reconfigurability in the generation of the template pulse, which leads to improvements in the receiver performance. A digital implementation of the templates provides the required reconfigurability; hence, the template pulse must be stored in digital memories and the correlation happens in the digital domain. This requires a high speed ADC covering the whole UWB frequency band [64, 65]. In the topology presented in [64], and shown in Fig. 2-3(a), a wideband single-ended LNA is followed by a second-order bandpass filter for interference rejection. A single-ended to differential converter with a tunable notch filter to suppress narrow-band interferers is placed after the bandpass filter. Then the signals are quadrature down-converted and directly digitized using two off-chip parallel 500-MS/s ADCs. The use of high speed ADCs leads to a flexible receiver capable of performing multiple processing tasks in parallel. However, the complexity of the integrated design and the high power consumption of high speed ADCs are the main drawbacks of this design. In some designs, a 1-bit ADC is suggested to reduce the power consumption of the receiver [66–69]. For example, the receiver presented in [66] uses a bank of 32 time-interleaved 1-bit ADCs with a sampling rate of 2-GS/s to digitize the low UWB frequency band. Fig. 2-3(b) shows the block diagram of this receiver. The main reason to reduce the resolution

Figure 2-3: ADC based receivers: (a) the simplified architecture presented in [64] with high speed (500-MS/s, 5-bit) ADCs; (b) the simplified block diagram of the receiver presented in [66] with a bank of time-interleaved 1-bit ADCs.

of the ADC is to save on power consumption compared to high speed, high resolution ADCs. Furthermore, the choice of a 1-bit ADC is made in this work by considering that, in interference dominant environments, the low resolution of the ADCs does not degrade the overall performance. The digital back-end is not implemented on-chip in this work. The receiver described in [67] uses a similar approach to [66], but with a fully digital back-end implementation. The analog front-end is composed of an LNA, a wideband amplifier and a 32-phase interleaved ADC. Moreover, timing circuitry generates the 32 phases required for the operation of the ADC. The receiver shows good performance, however, at the cost of a complicated digital back-end and high power consumption. The receiver presented in [68] uses a 1-bit ADC design concept to implement a 4–5-GHz IR-UWB receiver. To overcome the high sampling rate, and hence the high power consumption, of the ADCs, this work combines non-coherent energy detection and asynchronous analog-to-digital conversion schemes. The operation principle is as follows: first the received signal

is quantized using progressive saturation in two separate channels, each consisting of an LNA and gain stages. The outputs of the gain stages are XNORed to implement a squaring or autocorrelation function. The underlying idea is that if there is no pulse at the input, the thermal noise is digitized and the output fluctuates randomly between 0 and 1, leading to an average of 0.5 at the output of the integrator, whereas in the presence of a pulse the outputs of the two amplifier chains match each other and the XNOR output is 1. A single channel ADC with a 1-GHz sampling rate follows the integrator output. The power consumption of the analog front-end of the receiver is 7-mW, which is 20% of the overall power consumption (excluding baseband).

Correlation in Analog Domain Based Receivers

The use of high speed ADCs can be avoided by implementing the pulse correlation in the analog domain rather than after the ADC in the digital domain [59–61, 70–72]. This reduces the sampling rate from the Nyquist rate to the pulse rate. The lower sampling rate can be used to implement ADCs with higher resolution to increase robustness against interferers. Moreover, it generally leads to lower power consumption. Nevertheless, because channel compensation is limited in the analog domain, implementing the template pulse in the analog domain degrades the receiver performance. Here, we review a few receivers based on analog domain pulse correlation.

In the receiver presented in [61], quadrature analog correlation is implemented in the 3 G-Hz UWB band. Fig. 2-4(a) shows the simplified block diagram of the receiver. As can be seen in the figure, the quadrature analog correlation is performed by first mixing the incoming signal with in-phase (I) and quadrature (Q) local oscillator signals and then integrating the result. In fact, the ideal analog pulse template is replaced by a rectangular window to reduce the complexity and the power consumption of the receiver. However, this approximation leads to 1-dB of loss in non-line-of-sight channels. The correlation in the analog domain is followed by a 4-bit flash ADC in both the I and Q channels. PPM modulation is used in this work; hence, two serially cascaded delay lines are required to provide the

timing in the analog front-end: the first one provides the required delays for the PPM pulses and the second one provides the timing window for the integrator in the quadrature analog correlator. The total power consumption of the receiver is 28.8-mW. More than 75% of the power is consumed by the RF front-end of the receiver, which highlights the importance of low power RF design and effective duty cycling in the receiver.

Figure 2-4: Analog correlation based receivers. (a) The receiver presented in [61]. (b) The analog correlation presented in [60].

Quadrature analog correlation is also implemented in [71] for a UWB system operating in the MHz band. The architecture is similar to that of [61], discussed previously. However, the whole receiver, including the synchronization block and the digital back-end circuitry, is implemented on chip, although it still requires an off-chip crystal reference to generate accurate timing. The data rates supported by this receiver range from 305-kb/s to 39-Mb/s. The receiver achieves an energy efficiency of 108-pJ/bit in the data detection mode.

Digital pulse templates lead to better channel estimation and improved SNR. The receiver in [60] uses digital templates which are stored in high speed memories and a digital

to analog converter (DAC) to create the analog template pulse. The simplified block diagram of the receiver is shown in Fig. 2-4(b). As can be seen in the figure, the signal is first amplified by a distributed LNA which covers the whole 3–10-GHz operation band of the receiver. The quadrature mixers down-convert the signal. The third-order low pass filter after the mixer is followed by a 6-bit discrete-time $G_m$ multiplier, which combines the functionality of a DAC and an analog multiplier. The output current of the multiplier is integrated using switched capacitor circuitry. The receiver consumes 156-mW. About 14% of the power is consumed in the LNA and mixer and 54% in the multipliers, which again highlights the importance of low power RF design.

Non-Coherent Receivers

The design complexity and high power consumption of coherent receivers preclude their use in low power and low data rate applications. Non-coherent receivers usually provide a better, lower power solution for these applications [39, 40, 42, 73, 74]. Non-coherent receivers are generally based on energy detection, peak detection or super regenerative detection. Energy detector based receivers square the incoming pulse and integrate the output over a specific time window [41, 42, 48, 75]. The output of the integrator is compared with a threshold to determine the transmitted bit. As a result, the duration of the time window plays an important role in capturing the pulse and/or interferers and noise [74, 76]. On the other hand, peak detector based receivers [38, 77, 78] compare the incoming signal with a threshold to determine whether a pulse is received. Hence, peak detector based receivers are not sensitive to the duration and placement of an integration window. The other type of non-coherent receiver is based on super regenerative amplification and energy detection [39, 79, 80]. The principle of operation of this type of receiver is based on triggering an oscillation with the input pulses. The generated oscillation is then used for simple envelope detection, and the pulses are decoded based on the start-up time of

the oscillator. However, the main drawback of this type of receiver is that any noise or interference can trigger the oscillation and reduce the sensitivity of the receiver. In the following subsections we briefly review state-of-the-art designs for all types of non-coherent receivers.

Energy Detection Based Receiver

The receiver presented in [42] uses a non-coherent energy detection scheme to demodulate PPM pulses. The simplified block diagram of the analog front-end of the receiver is shown in Fig. 2-5(a). As can be seen in the figure, the incoming pulses are amplified by an LNA followed by 6 band-pass RF gain stages. The filtered output of the gain stages is then squared using passive squarer circuitry. The baseband signal is then amplified and integrated in a time window of 30-ns for each PPM pulse. The required timing and synchronization of the receiver are implemented off-chip using an FPGA. This work employs duty-cycling to improve the energy efficiency of the receiver; however, the fact that the receiver has to be active for 60-ns to demodulate the PPM pulses reduces the effectiveness of the technique.

The non-coherent receiver presented in [45] uses an energy detection scheme to demodulate the incoming signal. Fig. 2-5(b) shows the block diagram of this receiver. It uses PPM modulation in the 7.25 to 8.5-GHz band. The received signals are amplified by an LNA followed by a VGA. The amplified signal then enters a squarer block followed by an integrator. The integration window is 15-ns. This work implements only the RF front-end circuitry and achieves 0.84-nJ/b energy efficiency while consuming 4.2-mW.

The receiver presented in [38] employs OOK modulation in an energy detection based receiver. The block diagram of the receiver is shown in Fig. 2-5(c). As can be seen in the figure, the RF front-end of the receiver comprises an LNA followed by two RF amplifiers. The outputs of the first and second RF amplifiers feed a self mixer for energy detection. The baseband signal is amplified by a programmable gain amplifier (PGA) and then passed to a comparator to make the decision. The demodulation scheme used

in this work is continuous-time slicing, which decouples demodulation and synchronization from each other and hence reduces the acquisition time and power consumption compared to the conventional gated-integration technique. The receiver provides self duty cycling to improve its efficiency. The ON time of the receiver is 30-ns. The total power consumption of the receiver without duty cycling is 11.5-mW. Employing duty cycling the receiver achieves nJ/b energy efficiency.

Figure 2-5: Block diagram of non-coherent receivers. (a) The receiver presented in [42]. (b) The energy detection based receiver presented in [45]. (c) The non-coherent receiver with asynchronous demodulator presented in [38].

Peak Detection Based Receiver

The work presented in [77] uses a dual band receiver based on peak detection to demodulate OOK signals in the 3–5-GHz band. The receiver comprises 5 tunable bandpass gain stages followed by a peak detector which effectively serves as a 1-bit asynchronous ADC. The peak detector is a comparator with a regenerative latch for fast decision making. The purpose of dual band operation is to separate the data and timing pulses

and to simplify synchronization and duty cycling. The receiver synchronization and digital back-end are implemented off-chip. The static power of the receiver is 7.5-mW.

The receiver presented in [81] operates based on a double thresholding scheme in the MHz band with BPSK modulated pulses. Fig. 2-6 shows the block diagram of the receiver. As can be seen in the figure, the incoming signals are amplified by RF gain stages followed by two comparators, which compare the incoming pulses against two thresholds to detect peaks. A phase detector follows the comparators to determine the received bit stream. Having two thresholds allows the system to reject all noise lying between the thresholds, improving the resilience of the receiver against noise. The static power of the receiver is 1.6-mW at a data rate of 25-kb/s over a 35-cm range.

Figure 2-6: The block diagram of a peak detection based receiver based on the work presented in [81].

Super Regenerative Receiver

The receiver in [79] uses super regenerative detection in the 3–5-GHz band to demodulate OOK modulated pulses. The block diagram of the receiver is shown in Fig. 2-7. As shown in the figure, the receiver is composed of an LNA followed by a single-ended to differential converter. The output of the single-ended to differential converter is then injected into a resonant core. The operation of the resonant core is controlled by a quench signal, which determines when the oscillation can start. Moreover, a digital PLL fine tunes the center frequency of the LC resonant tank. An envelope detector follows the resonant core to detect the envelope variation at the output of the VCO. The output of the

envelope detector is latched to the proper 0 or 1 using a comparator. The received input pulse is detected based on the start-up time of the oscillator.

Figure 2-7: The block diagram of a super regenerative receiver based on the work presented in [79].

The receiver presented in [39] uses OOK modulation in the 3–5-GHz band. The receiver includes an LNA followed by the super regenerative amplifier, or resonant tank. The squelch pulse is configured so that it shorts the differential input of the super regenerative amplifier while keeping it in the inactive state. An envelope detector, a sample-and-hold and comparators follow the super regenerative amplifier. The overall power consumption of the receiver is 6.6-mW, achieving 0.32-nJ/bit efficiency.

Comparison of Non-Coherent UWB Receivers

To perform a fair quantitative comparison between IR-UWB receivers, some background information must be considered. The first point is the dependency of the receiver sensitivity on the data rate. As the data rate decreases, the sensitivity improves. This is mainly due to the total energy that the receiver can detect. For instance, if the sensitivity of a receiver is -50-dBm at 10-Mb/s, the sensitivity improves by 10-dB if the data rate is scaled down to 1-Mb/s. However, it should be noted that the minimum detectable signal for the receiver is always the same.
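As a rough illustration of this data-rate dependence, the normalization can be sketched as a 10·log10 scaling of the quoted sensitivity by the ratio of data rates. The function name, the 1-Mb/s reference and the example value of -50 dBm at 10 Mb/s are assumptions made for illustration only; the scaling presumes that a fixed energy per bit is needed for detection.

```python
import math


def normalized_sensitivity_dbm(sensitivity_dbm, data_rate_bps, ref_rate_bps=1e6):
    """Scale a sensitivity quoted at data_rate_bps to the reference data rate.

    Assumes the receiver needs a fixed energy per bit, so the required
    average power scales linearly with the data rate (10 dB per decade).
    """
    return sensitivity_dbm - 10.0 * math.log10(data_rate_bps / ref_rate_bps)


# Hypothetical receiver: -50 dBm at 10 Mb/s, normalized to 1 Mb/s.
print(normalized_sensitivity_dbm(-50.0, 10e6))
```

Under these assumptions the example receiver is credited with a 10-dB better sensitivity at 1 Mb/s, which is the kind of normalization needed before sensitivities quoted at different data rates are compared.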

Table 2-1: Comparison of state-of-the-art low data rate non-coherent IR-UWB receivers. Columns: Refs. [48], [42], [45], [54], [39], [82]. Rows: technology (nm), receiver type, maximum data rate (Mb/s), modulation, sensitivity (dBm), sensitivity normalized to 1-Mb/s (dBm), frequency (GHz), area (mm²), front-end power (mW), back-end power (mW), energy/bit at the highest data rate.

The second point is the energy per bit metric that is used to show the energy efficiency of a receiver. While this is a useful metric for comparison, it is not fair to compare energy efficiency at two different data rates. Furthermore, the sensitivity of the receivers should be the same for a fair comparison, because a receiver that consumes less power in its RF gain stages has poorer sensitivity but better energy efficiency. Finally, some receivers integrate only the RF front-end and lack the digital back-end and synchronization. This point should also be noted in the comparison. Therefore, in this comparison we first limit ourselves to non-coherent, low data rate receivers to make the comparison fairer; secondly, a row is added to show the normalized

sensitivity to the data rate of 1-Mb/s. Table 2-1 shows the comparison of the state-of-the-art non-coherent receivers in the literature. One of the important aspects of IR-UWB based receivers is the ability to duty cycle the receiver, and its benefit is evident in the receivers that employ duty-cycling [38, 42]. This highlights the need for a fast and low power synchronization scheme to reduce the preamble, especially in sensor network applications with a low data payload. Fig. 2-8 compares the energy per bit of the non-coherent receivers versus the data rate. As can be seen in the figure, the complete receivers [42, 44] have higher energy per bit.

Figure 2-8: The energy per bit (nJ) versus maximum data rate (Mb/s) of the receivers [39], [42], [44], [45], [48], [54] and [82].

2.4 Synchronization in IR-UWB Receivers

Pulse-level synchronization is the most basic level of synchronization in IR-UWB receivers. However, the required level of precision depends on the type of receiver. For example, in a coherent receiver based on analog correlation, the required accuracy is determined by the amount of error introduced by shifting the template pulse away from the received pulse. For pulses a few nanoseconds wide, a time mismatch on the order of ±100-ps introduces a significant

amount of error. Generally, the required timing accuracy is roughly one tenth of the pulse width [83]. This budget should include the jitter contributions of the VCO and the clock distribution network, which makes the requirement even tighter. Hence, providing accurate timing for a coherent receiver is complex and power consuming. On the other hand, in non-coherent receivers that operate based on an integration window, the required accuracy is determined by the length of the integration window. Considering that the integration window is roughly 10 times longer than the pulse width, the required accuracy is relaxed and can be delivered with lower complexity and hence lower power consumption. Even though the timing is relaxed, this type of receiver has to search over a large time span since the pulse rate is low. This means that there is a large empty space between pulses that the receiver has to search to find the pulses. For instance, for a data rate of 1-Mb/s and an integration window of 10-ns, there are 100 time windows that must be searched. This adds complexity and increases the power consumption of non-coherent receivers.

Fig. 2-9 illustrates a block diagram of a conventional energy detection based receiver to highlight the synchronization challenges. As can be seen in the figure, an accurate clock is generated in the receiver using a crystal oscillator. Then, multiple phases of this clock are generated using a delay locked loop (DLL). These delayed versions of the clock are used to detect the incoming pulses with the aid of signal processing schemes. As mentioned earlier, the whole time span between two adjacent pulses must be covered by the DLL and searched for the pulse. Typically, a preamble with a known pattern placed before the payload is used for synchronization.
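The window count quoted above follows from dividing the pulse repetition period by the integration window length. The sketch below is only a back-of-the-envelope check; the function name is hypothetical, and it assumes one pulse per bit and a window length that divides the bit period evenly.

```python
def search_windows(data_rate_bps, window_s):
    """Number of integration windows that tile one pulse repetition period.

    Assumes one pulse per bit, so the repetition period is 1 / data_rate_bps,
    and that the window length divides that period evenly.
    """
    bit_period_s = 1.0 / data_rate_bps
    return round(bit_period_s / window_s)


# 1-Mb/s data rate with 10-ns and 100-ns integration windows.
narrow = search_windows(1e6, 10e-9)
wide = search_windows(1e6, 100e-9)
print(narrow, wide)
```

For the 1-Mb/s case this reproduces the 100 candidate windows for a 10-ns window, and shows that a 100-ns window leaves only 10 positions to search, at the cost of integrating more noise and interference per window.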
This type of receiver imposes a trade-off between the time needed for synchronization and the robustness against narrowband interferers and noise. For example, the time required for synchronization can be reduced by increasing the integration time window. By increasing the width of the time window from 10-ns to 100-ns, the number of required delayed versions of the clock is reduced from 100 to 10 and the synchronization time is also improved by a factor

of 10. However, the wide integration window leads to the integration of interferers and noise along with the incoming pulses and increases the probability of false detection.

Figure 2-9: The block diagram of a conventional synchronization scheme in non-coherent IR-UWB receivers with a crystal oscillator and DLL.

Another approach for synchronization is based on the all-digital clock and data recovery (CDR) presented in [38, 54]. The continuous-time slicing technique used in the demodulator decouples the demodulation from the synchronization [38] and reduces the time required for synchronization. Hence, the timing information can be extracted from the return to zero (RZ) baseband signals. The all-digital CDR is composed of a tri-state PFD (which is used in the acquisition mode only) and a Hogge phase detector (used in tracking mode to maintain the phase lock), followed by a loop filter.

Fig. 2-10 shows a synchronization scheme based on pulse injection locking for a high data rate UWB receiver [84]. This technique eliminates the need for a clock and data recovery system, as the local oscillator in the receiver is injection locked to the received pulses. The other advantage is that the architecture is feed-forward and there are no stability issues as found in feedback systems. Moreover, the simple synchronization scheme reduces the power consumption in the receiver. However, since the local oscillator clock operates in the 3–5-GHz band, the injection is sub-harmonic and the effective injection ratio is given by

$$N = \frac{1}{\alpha \, \beta \, DR_{inj} \, W_{pulse}}, \tag{2.7}$$

where $\alpha$ is the probability that the transmitted data is 1, $\beta$ is the roll-off coefficient due to pulse shaping, $DR_{inj}$ is the data rate and $W_{pulse}$ is the width of the transmitted pulse. As can be seen in (2.7), the sub-harmonic injection ratio, $N$, might become large because OOK modulation is chosen in this work, which reduces $\beta$ and degrades the phase noise and the bandwidth of the injection locked signal. On the other hand, injection locking with the current approach is not applicable to low data rate receivers, since $DR_{inj}$ is very low in this type of receiver and hence limits the injection locking bandwidth and phase noise of the recovered clock even more. Consequently, a new approach is required to apply injection locking to low data rate receivers. The clock recovery scheme presented in this work is based on injection locking for low data rate receivers and achieves low power operation with a low sub-harmonic injection ratio. The details of the technique are provided when we present the proposed receiver.

Figure 2-10: The block diagram of an injection locking scheme for synchronization in high data rate IR-UWB receivers.

2.5 Power Consumption Breakdown of Low Data Rate Receivers

As discussed earlier in this chapter, the RF and analog front-end is considered the most power consuming block in most receivers [38, 42, 45, 61, 68, 82]. The main

reason for this high power consumption is that the RF front-end in UWB receivers has to provide wideband input matching, gain and moderate noise performance, which combined lead to high power consumption. To investigate the power consumption in the RF/analog front-end of non-coherent UWB receivers, Fig. 2-11 shows the breakdown of power consumption among the blocks of two non-coherent UWB receivers. The chart in Fig. 2-11(a) is based on the work in [82] and that in Fig. 2-11(b) is based on the receiver presented in [42].

Figure 2-11: The breakdown of power consumption in the RF front-end of non-coherent receivers. (a) A non-coherent receiver based on self-mixing and asynchronous demodulation [82]; total power 8.83-mW: LNA 1-mW, RF amplifier 1.9-mW, self-mixer 2-mW, baseband amplifier 1.43-mW, demodulator 1.8-mW, bias circuit 0.7-mW. (b) A non-coherent receiver based on passive squaring and integration [42]; total power 22.6-mW: LNA 5.9-mW, RF amplifier 14.3-mW, self-mixer 0-mW, baseband amplifier 1.5-mW, leakage 0.64-mW, crystal oscillator and clock tree 0.3-mW.

As can be seen in the charts, the low noise amplifier is one of the most power consuming blocks in the receiver chain (12% and 26% in Fig. 2-11(a) and (b), respectively). The LNA is the first

active component in the front-end and has to provide input matching, high gain and a low noise figure simultaneously. The RF gain stages also consume significant power, as they need to operate with wide bandwidth and high gain (21% and 63% in Fig. 2-11(a) and (b), respectively). The self-mixer or squarer can be power consuming depending on the implementation type. For example, the active self-mixer in Fig. 2-11(a) consumes 22% of the total power, whereas the passive squarer in Fig. 2-11(b) does not consume any DC power. Baseband amplifiers, demodulators and bias circuitry also contribute to the overall power consumption.

As can be seen in the literature, while power savings can be achieved through simplicity and novelty in the system-level design of the receiver, such as the choice of an optimum modulation scheme, a non-coherent detection scheme and a low complexity synchronization scheme, low power circuit level design has a large impact on the overall power consumption. Therefore, besides the focus on a low power receiver with a new synchronization scheme, this work also focuses on the low power and low voltage implementation of low noise amplifiers and squarers, which are among the most power consuming blocks in the receiver. The details of the designs will be discussed in the next chapters.

2.6 Low Noise Amplifier Design Techniques

In this section, we review the state-of-the-art techniques for implementing wideband, low power and low voltage LNAs. First, we discuss wideband LNA design techniques; then ultra-low power LNA implementations are reviewed, and finally ultra-low voltage low noise amplifiers are discussed.

Wideband Low Noise Amplifier Design

The LNA is the first active component in the front-end of the receiver, and is generally considered one of the most power hungry blocks. The high power consumption stems from the fact that an LNA must provide simultaneous wideband matching, high gain, low noise and high linearity, all of which typically require high power and high supply voltages. These combined specifications have made the design of low power and low voltage UWB LNAs

a challenging research topic. There are various well-known techniques for designing wideband LNAs in the literature. In this section we briefly review these techniques.

Figure 2-12: The circuit schematic of (a) a distributed amplifier and (b) an inductive degeneration LNA with an LC network to increase the input matching bandwidth.

Distributed Amplifier

A frequent approach to designing wideband LNAs is to employ distributed amplifiers [85–87], as shown in Fig. 2-12(a), which provide high bandwidths that can span into the multi-gigahertz range. However, distributed amplifiers suffer from high power consumption and occupy a large chip area due to the need for multiple stages and a large number of inductors. These drawbacks preclude the use of distributed amplifiers in area efficient and low power applications.

Inductive Degeneration

The second technique uses an inductive degeneration scheme along with an LC network to provide wideband matching, as shown in Fig. 2-12(b) [88–91]. The input impedance of an inductively degenerated MOS transistor is a series RLC network and is given by

$$Z_{in}(s) = \frac{1}{s C_{gs}} + s (L_g + L_s) + \omega_T L_s, \tag{2.8}$$

where $\omega_T$ is the transit frequency of the device, given by $g_m / C_{gs}$. As can be seen in (2.8), inductive degeneration leads to a real part in the input impedance, though only over a narrow frequency band. To provide a wideband solution an LC network is utilized, which leads to a high number of bulky on-chip inductors. On the other hand, the required $g_m$ is determined by the input matching and cannot be reduced to lower the power consumption.

Common-Gate Amplifier

The other approach is to use a common-gate transistor as the input stage of the LNA, as illustrated in Fig. 2-13 [92, 93]. Common-gate transistors exhibit better stability and reverse isolation compared to common-source transistors due to the absence of the Miller effect. Hence, common-gate transistors are capable of providing wideband input impedance matching. The input impedance of a common-gate LNA stage is given by

$$Z_{in}(s) = \frac{1}{g_m} \left( 1 + \frac{Z_{TL}(s)}{r_{o1}} \right), \tag{2.9}$$

where $g_m$ is the transconductance of the transistor M1, $r_{o1}$ is the drain-source resistance of M1 and $Z_{TL}$ is the total impedance seen at the load. As can be seen in (2.9), a common-gate transistor can provide an impedance of $1/g_m$ provided that the output conductance of the transistor is very small. However, in deep sub-micron technologies, and with lower supply voltages and consequently lower drain-source voltages across the transistors, the output conductance is not negligible and has to be taken into account. By setting $g_m$ to 20-mS, a wideband 50-Ω input match is achieved. In spite of the wideband matching, this strategy faces two major problems. The first one is a high minimum noise

figure (NF). The noise factor of a common-gate topology is given by

$$F = 1 + \left( \frac{\gamma}{\alpha} \right) \left( \frac{r_{o1}}{r_{o1} + Z_{TL}(s)} \right), \tag{2.10}$$

where $\gamma$ is the thermal noise coefficient and $\alpha = g_m / g_{d0}$ [94]. As can be seen in (2.10), the NF is bounded by $1 + \gamma/\alpha$ at the matching condition. The second limitation is that $g_m$ is fixed by the input matching criterion and cannot be increased to improve the NF and gain, or decreased to reduce the power consumption.

Figure 2-13: The circuit schematic of a common-gate transistor as an input stage of an LNA to provide wideband input matching ($R_S$ = 50-Ω, $g_m$ = 20-mS).

Noise Cancellation

Noise cancellation schemes have been used to improve the noise performance of the input matching devices while providing the same input resistance [95–104]. Fig. 2-14(a) illustrates the block diagram of noise cancellation. As highlighted in the block diagram, the principle behind noise cancellation is to identify two nodes at the input and the output of the main amplifier where the signal appears out of phase whereas the noise is in phase [105]. This technique has been employed in both common-gate and resistive shunt-feedback amplifiers, as shown in Fig. 2-14(b) and (c). As can be seen in the figures, the noise at the input and output of the main amplifier (common-gate transistor and resistive shunt feedback amplifier) appears in phase, and this can be exploited to cancel the thermal noise of the matching stage.
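The in-phase noise and out-of-phase signal can be checked numerically for the resistive shunt-feedback case. The sketch below is not taken from the cited works; it is a simplified small-signal model assuming an ideal transconductor of value gm, a feedback resistor r_f between the input node X and the output node Y, a source resistance r_s, no load resistance, and a single drain noise current i_n injected at Y. All function names are hypothetical.

```python
def noise_at_nodes(i_n, gm, r_s, r_f):
    """Noise voltages at X (input) and Y (output) due to drain noise current i_n.

    KCL at X: v_x / r_s + (v_x - v_y) / r_f = 0
    KCL at Y: (v_y - v_x) / r_f + gm * v_x + i_n = 0  (load neglected)
    """
    v_x = -i_n / (gm + 1.0 / r_s)
    v_y = v_x * (1.0 + r_f / r_s)
    return v_x, v_y


def signal_at_nodes(v_s, gm, r_s, r_f):
    """Signal voltages at X and Y for a source v_s with resistance r_s."""
    v_x = v_s / (1.0 + gm * r_s)
    v_y = v_x * (1.0 - gm * r_f)
    return v_x, v_y


def cancelling_gain(r_s, r_f):
    """Auxiliary-path gain A for which v_y - A * v_x removes the noise term."""
    return 1.0 + r_f / r_s


# Assumed example values: gm = 20 mS (matched to 50 ohm), r_f = 300 ohm.
gm, r_s, r_f = 20e-3, 50.0, 300.0
a_aux = cancelling_gain(r_s, r_f)
n_x, n_y = noise_at_nodes(1e-6, gm, r_s, r_f)
s_x, s_y = signal_at_nodes(1.0, gm, r_s, r_f)
print("noise at X and Y has the same sign:", n_x * n_y > 0)
print("signal at X and Y has opposite sign:", s_x * s_y < 0)
print("residual noise after cancellation:", n_y - a_aux * n_x)
print("signal gain after cancellation:", s_y - a_aux * s_x)
```

In this simplified model the noise terms at X and Y share a sign while the signal terms do not, so subtracting a scaled copy of the input node from the output node removes the matching device noise and adds to the signal gain. The extra gain path is what costs additional power in practice.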

Figure 2-14: Noise cancellation scheme highlighted in (a) a block diagram, (b) a common-gate transistor with an auxiliary common-source stage and (c) a resistive shunt feedback LNA with an auxiliary common-source stage for noise cancellation.

Noise cancellation techniques have been shown to improve the noise figure, but this comes at the cost of higher power consumption due to the extra stages and high supply voltages.

LNAs With Positive and Negative Feedback Structures

A combination of positive and negative feedback techniques has been used to break the trade-off between the NF, gain and input matching of a common-gate LNA [99, ]. The capacitor cross-coupling used in [111], and shown conceptually in Fig. 2-15(a), reduces the power consumption by reducing the $g_m$ required for input matching by a factor of $(1 + A)$, where $A$ is the feedback coefficient. However, this technique degrades the unilateral behaviour of a common-gate transistor and hence its stability. Moreover, there is an intrinsic limitation in the inverting amplification gain, which is always less than unity because of the passive implementation of the $g_m$-boosting technique.
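The saving offered by $g_m$-boosting can be put in numbers with a short sketch. It assumes the matching condition is $1/(g_m (1 + A)) = R_S$ with negligible output conductance; the function name and the example gain of 0.9 are illustrative assumptions, not values from the cited works.

```python
def required_gm(r_s, boost_gain=0.0):
    """Transconductance needed for an input match to r_s with gm-boosting.

    Assumes Z_in = 1 / (gm * (1 + boost_gain)) and negligible output
    conductance; boost_gain = 0 is the plain common-gate stage.
    """
    return 1.0 / (r_s * (1.0 + boost_gain))


plain = required_gm(50.0)          # plain common-gate stage
boosted = required_gm(50.0, 0.9)   # hypothetical passive boosting gain below unity
print(f"plain: {plain * 1e3:.1f} mS, boosted: {boosted * 1e3:.2f} mS")
```

Because a passive network keeps the boosting gain below unity, the required transconductance can at best approach half of the 20-mS needed by the plain common-gate stage under these assumptions.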

A passive negative feedback around a common-base amplifier is utilized in [110] to improve the lower bound of the noise performance. A CMOS implementation and simplified idea of this work is shown as a block diagram in Fig. 2-15(b). The technique suffers from low bandwidth, mainly due to a large parasitic capacitance at the output, which precludes its use in wideband applications.

A shunt positive feedback is employed in [109], which adds a degree of freedom to determine the value of $g_m$, as highlighted in Fig. 2-15(c). However, the stability of the LNA with positive feedback might be compromised and should be studied carefully to avoid any oscillation. Moreover, positive feedback increases the $g_m$ required for input matching. The input impedance of the LNA in [109] is given by

$$Z_{in} = \frac{1}{g_{m1} (1 - A_{pos})}. \tag{2.11}$$

Considering that $A_{pos} < 1$ for stability reasons (for the LNA in [109] it is 0.5), the required $g_m$, and hence the power consumption, increases when this technique is applied.

A combination of positive and passive negative feedback networks is employed in [106] and conceptually highlighted in Fig. 2-15(d). This work achieves very low noise and reduces the power consumption compared to the structure in [109], where only positive feedback was used. However, the power consumption is still well above sub-mW operation.

A dual negative feedback architecture is introduced in [107] and shown in Fig. 2-15(e). This LNA overcomes the aforementioned common-gate trade-offs without compromising its stability. Even though this LNA achieves reasonable performance, it comes at the cost of high power consumption (12.6-mW) due to the extra stages and the parasitic capacitances they introduce.

Resistive Shunt Feedback

The resistive shunt feedback architecture is another viable solution for wideband LNA design. This involves placing a feedback resistor around a common source amplifier to realize a wideband 50-Ω input match, as shown in Fig. 2-16 [ ]. The input impedance of this

LNA is given by
\[ Z_{in} = \frac{(r_o \parallel R_L) + R_f}{1 + g_m\,(r_o \parallel R_L)}, \tag{2.12} \]
and its voltage gain can be found by
\[ A_v = \frac{(r_o \parallel R_L)\,(1 - g_m R_f)}{(r_o \parallel R_L) + R_f}. \tag{2.13} \]

Figure 2-15: Common-gate LNAs in multiple feedback structures: (a) capacitive cross-coupling passive negative feedback, (b) negative feedback, (c) positive feedback, (d) a combination of positive and negative feedback structures, and (e) a dual negative feedback architecture.

As can be seen in (2.12), the input impedance of the resistive shunt feedback LNA can be controlled through the feedback resistor, $R_f$, and the $g_m$ of the transistor. In order to achieve low power input matching, low values of $R_f$ are desired; however, this will lead to low voltage

gain, as can be seen in (2.13), and to a high noise figure. Hence, there is a trade-off between input matching and the NF/gain of the LNA, and satisfying both criteria simultaneously generally leads to increased power consumption [ ].

Figure 2-16: The circuit schematic of a resistive shunt feedback LNA, a technique to provide wideband input matching.

Sub-mW Wideband Low Noise Amplifiers

In this section we review the wideband LNA designs that consume less than 1-mW of power. The LNA, as the first active block in the RF front-end of a receiver, has to provide simultaneous wideband matching, low noise, high gain, and modest linearity, all of which require high power consumption. Hence, considering these low power LNA design challenges, the design of ultra-low power (<1-mW), wideband LNAs has been an active research topic, e.g. [27, ].

In the circuit presented in [118] and shown in Fig. 2-17(a), an ultra-wideband common-gate LNA with a T-match input network and self-body bias is presented. The T-match network, which is composed of a series resistance and an inductance, is employed to improve the input matching bandwidth at low frequencies. This LNA achieves ultra-low power operation (0.99-mW), but with a low gain of 7.9-dB and a large chip area of 0.73-mm², which precludes its use in area-sensitive applications.

A shunt-feedback technique around a common-gate transistor is adopted in [119] and shown in Fig. 2-17(b) to achieve good matching with low power consumption and compact size. The input impedance of this LNA, as discussed in the previous section, can be found by
\[ R_{in} = \frac{1}{g_{m1}\,(1 + A)}, \tag{2.14} \]
where $A$ is the feedback coefficient. As can be seen in the equation, the required $g_m$ for input matching is reduced by a factor of $(1+A)$. Hence the power consumption can be reduced by the same factor. In spite of these advantages, the feedback coefficient in this LNA is tied to the main transistor current and cannot be set independently. This reduces the design options for setting the input matching, gain and noise figure. Moreover, the bandwidth of the LNA is not high enough, even in schematic simulations.

A wideband and ultra-low power LNA is presented in [120] and is shown in Fig. 2-17(c). In this work a hybrid amplifier comprised of resistive shunt-feedback and common-gate amplifiers is introduced. It utilizes the current-reuse technique to further reduce the power consumption. The input impedance of this hybrid LNA can be found by
\[ R_{in} = \frac{r_o + R_f}{2\left((g_{m1} + g_{m2})\,r_o + 1\right)} \approx \frac{1}{4 g_m}. \tag{2.15} \]
As can be seen in (2.15), the required $g_m$ for input matching is reduced by a factor of 4. This would lead to 4-times lower power consumption. However, since the circuit is differential, the improvement in power consumption is only 2 times. In spite of its low power operation, the bandwidth of this LNA is limited to sub-GHz frequencies and it occupies a large chip area (0.26-mm²) due to the use of large capacitors.

Ultra-Low Voltage LNA Design

The design of ultra-low voltage wideband LNAs has been an active research topic, e.g. [22,27, ]. Fig. 2-18 presents the state-of-the-art LNAs that achieve good performance from ultra-low voltage supplies. Fig. 2-18(a) shows a complementary current-reuse LNA.
Even though the LNA is designed for narrowband applications, the concept can be employed

for ultra-low power LNAs. The complementary current-reuse technique will be discussed in Section 4.2.

Figure 2-17: The circuit schematic of sub-mW LNAs in the literature: (a) a common-gate LNA with T-match input matching and self-forward body biasing technique, (b) an active negative shunt-feedback LNA, and (c) a hybrid common-gate/resistive shunt feedback LNA (only the single-ended version of the LNA is shown).

Fig. 2-18(b) shows an LNA that uses capacitive cross-coupling and forward body biasing (FBB) to achieve low power (1-mW) and ultra-low voltage (0.5-V) operation, but the bandwidth of the LNA is much less than the target for ultra-wideband systems. This

architecture also uses an off-chip balun, which increases the cost and introduces problems with full integration.

Figure 2-18: The circuit schematic of ultra-low power LNAs in the literature: (a) a complementary current-reuse LNA, (b) a differential common-gate LNA using capacitor cross-coupling and forward body biasing [124], (c) a common-gate LNA with a $g_m$-boosting amplifier stage, and (d) a common-gate LNA using forward body biasing [123]. Panel annotations: (a) 0.6 V, 0.9 mW; (b) 0.5 V, 1 mW; (c) 0.8 V, 0.4 mW; (d) 0.6 V, 4.2 mW.

The LNA shown in Fig. 2-18(c) achieves low operating power (0.5-mW) with a low supply voltage (0.8-V) using an active $g_m$-boosting technique. The $g_m$-boosting technique reduces the required $g_m$ for input matching by a factor of $(1+G_{boost})$; as a result, the power consumption is also reduced by the same factor. However, the use of multiple amplifiers for $g_m$-boosting

increases the parasitic capacitances and limits the bandwidth of the LNA. Moreover, the use of an off-chip balun at the input increases the cost and introduces integration problems. Fig. 2-18(d) shows an ultra-low voltage LNA that uses forward body biasing to enable operation from a 0.6-V supply voltage [123]. Forward body biasing reduces the threshold voltage of the transistors and facilitates low voltage operation of the LNA. However, this is achieved at the cost of high power consumption ($P_{diss}$ = 3.72-mW), which precludes its use in power efficient designs.

Comparison of Wideband LNAs

Low noise amplifiers in the literature have different gain, bandwidth, noise figure and power consumption. In some designs, the voltage gain, noise figure or bandwidth is traded for power consumption or supply voltage to implement a low power, low voltage solution. Therefore, a performance figure of merit is needed to perform a fair comparison. In this section, we use a performance figure of merit (FOM) to compare the state-of-the-art LNAs in the literature versus power consumption and supply voltage. The performance figure of merit used here is given by [125]
\[ FOM_I = 20 \log_{10}\left( \frac{S_{21,av}[\mathrm{lin}] \cdot BW[\mathrm{GHz}]}{P_{dc}[\mathrm{mW}] \cdot \left(F_{av}[\mathrm{lin}] - 1\right)} \right). \tag{2.16} \]
The FOM is based on the average voltage gain in linear scale ($S_{21,av}$), the 3-dB bandwidth ($BW$), the power consumption ($P_{dc}$) and the average noise factor ($F_{av}$) in the bandwidth of the LNA. Fig. 2-19 shows the FOM of the state-of-the-art LNAs versus their power consumption. As highlighted in the figure, the desired performance is at the top left of the figure, which corresponds to an LNA with high performance and low power consumption. Firstly, it can be seen that only a few designs consume less than 1-mW of power. Secondly, LNAs with a high FOM mostly burn high power, which makes them unsuitable for applications with a low power budget.
Furthermore, it can be clearly seen that low power, high performance designs with wide bandwidth are needed to fill this gap.
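The figure of merit in (2.16) is straightforward to evaluate numerically. The following is a minimal sketch, assuming the gain and noise figure are supplied in dB and converted to linear scale before use; the function name `lna_fom` and the example numbers are hypothetical and are not taken from any of the cited designs.

```python
import math


def lna_fom(gain_db, bw_ghz, p_dc_mw, nf_db):
    """Evaluate FOM_I of (2.16).

    gain_db : average voltage gain (S21) over the band, in dB
    bw_ghz  : 3-dB bandwidth, in GHz
    p_dc_mw : DC power consumption, in mW
    nf_db   : average noise figure over the band, in dB
    """
    s21_lin = 10 ** (gain_db / 20)   # voltage gain, linear scale
    f_lin = 10 ** (nf_db / 10)       # noise factor, linear scale
    if bw_ghz <= 0 or p_dc_mw <= 0 or f_lin <= 1:
        raise ValueError("need BW > 0, P_dc > 0 and a noise factor above 1")
    return 20 * math.log10(s21_lin * bw_ghz / (p_dc_mw * (f_lin - 1)))


if __name__ == "__main__":
    # Hypothetical design point, for illustration only.
    print(lna_fom(gain_db=12.0, bw_ghz=4.0, p_dc_mw=0.4, nf_db=4.0))
```

Because power appears in the denominator inside the logarithm, halving $P_{dc}$ at constant gain, bandwidth and noise factor raises this FOM by $20\log_{10}2 \approx 6$ dB.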

Figure 2-19: The FOM of the state-of-the-art LNAs in the literature versus their power consumption (x-axis: power consumption in mW; the desired-performance, sub-mW region is highlighted).

Figure 2-20: The FOM of the state-of-the-art LNAs in the literature versus their supply voltage (x-axis: supply voltage in V; the desired-performance, ultra-low voltage region is highlighted).

Moreover, Fig. 2-20 illustrates the FOM of the LNAs versus supply voltage. Again, the desired performance area is highlighted in the figure. It can be seen that the number of LNAs with ultra-low supply voltages is very low. Furthermore, in most cases low supply voltage LNAs ($V_{dd}$ < 1-V) are not capable of providing high performance. As mentioned earlier,

a lower supply voltage reduces the overall power consumption in the transceiver and allows the transceiver to be powered from energy scavenged from the environment with low conversion losses. Therefore, the design of an ultra-low supply voltage LNA with high performance is needed to meet the requirements of a low power, low voltage transceiver for sensor network applications.

2.7 Squarer Designs

As mentioned earlier, an efficient self-mixer or squarer circuit is required to perform energy detection. Moreover, as mentioned in section 2.5, active self-mixers are one of the main power consuming blocks in the non-coherent receiver chain. In general, squarers or self-mixers are divided into two categories: the first category is active squarers, which have conversion gain and burn DC power [ ], and the second category is passive squarers [42,129], which do not burn DC power but have a high conversion loss. In this section, we briefly review these types of squarers.

Active squarers can be realized using Gilbert cell mixers [127] or MOS transistors operating in the saturation region [126, 128]. While these types of squarers provide conversion gain, it comes at the cost of high power consumption. On the other hand, passive squarers, which use transistors in the triode region, consume very little power; however, they suffer from very high conversion loss. MOS transistor characteristics in the sub-threshold region [130] can also be employed to square the received pulse with very low power consumption. As can be seen, the realization of a low power, high gain squarer is challenging and under active research. In this work, an ultra-low power squarer circuit with high gain is realized which is suitable for non-coherent IR-UWB receivers. The details of this circuit will be described when we discuss the proposed receiver.

2.8 Summary

In this chapter, the modulation schemes used in IR-UWB transceivers were introduced and the advantages and disadvantages of each modulation technique were investigated. Then,

state-of-the-art IR-UWB receivers in the literature, including coherent and non-coherent receivers, were discussed. It was shown that, due to their high power consumption, coherent receivers are not suitable for low power, low data rate applications. After that, a comparison of synchronization schemes was provided and it was shown that most of the receivers suffer from high power consumption and complex clock recovery schemes. Then, a breakdown of power consumption in the RF front-end of IR-UWB receivers was presented. Next, a complete review of wideband low noise amplifier design techniques in the literature was given. It was highlighted that most of the designs in the literature do not provide a good figure of merit under ULP and ULV conditions, and that the design of ultra-low power and ultra-low voltage LNAs is challenging. Finally, a brief review of squarer design techniques for non-coherent IR-UWB receivers was provided.

CHAPTER 3
Ultra-Low Voltage Design Challenges

3.1 Introduction

The design of power efficient transceivers requires careful optimization at the circuit level. As the feature size in standard CMOS technologies shrinks, the maximum allowed supply voltage is reduced as well. Supply voltage reduction is desirable in some applications, like wireless sensor networks and systems operating on energy scavenged from the environment, to lower the power consumption and the conversion losses in the DC-DC converters. However, ultra-low supply voltages ($V_{DD}$ < 0.6 V), which force the $V_{DS}$ of the transistors to get close to the threshold voltage, have severe impacts on the characteristics of a MOS transistor, e.g. intrinsic gain, transit frequency ($f_T$), noise figure (NF) and linearity. In this chapter, we first review the MOS transistor operating regions from the point of view of the EKV model, and then we address the major impacts of ultra-low supply voltages on the performance of a MOS transistor, e.g. on the transconductance efficiency, output conductance, transit frequency, noise figure and linearity.

3.2 MOS Transistor Operating Points Review

A MOS device can be biased in three inversion regions with different current and transconductance characteristics, termed weak inversion (WI), moderate inversion (MI), and strong inversion (SI). Assuming that the transistor is biased at the edge of the saturation region, the channel length modulation will be negligible and the drain-source current from weak through strong inversion can be expressed using the EKV model [131] by
\[ I_D = I_{D0}\left[\ln\left(1 + e^{\frac{V_{GS}-V_{TH}}{2nU_T}}\right)\right]^2, \tag{3.1} \]
where $I_{D0}$ is the technology current, defined by $I_{D0} = 2n\mu_0 C_{OX} U_T^2 (W/L)$. $W$ is the effective channel width, $L$ is the effective channel length, $n$ is the substrate factor, whose value

depends on the process and varies from 1 to 2, $\mu_0$ is the carrier mobility, $U_T = kT/q$ is the thermal voltage, $C_{OX}$ is the gate-oxide capacitance per unit area, $V_{GS}$ is the gate-source voltage and $V_{TH}$ is the threshold voltage. Weak inversion occurs for a MOSFET with $V_{GS}$ sufficiently lower than $V_{TH}$. In this region the channel is weakly inverted, the drain diffusion current dominates, and the drain current converges to
\[ I_D = I_{D0}\left(e^{\frac{V_{GS}-V_{TH}}{2nU_T}}\right)^2. \tag{3.2} \]
Strong inversion happens when $V_{GS} - V_{TH}$ is sufficiently high; the channel is strongly inverted and the drain drift current takes over. In this region, the drain current can be expressed by [132]
\[ I_D = \frac{1}{2}\,\frac{\mu C_{OX}}{n}\,\frac{W}{L}\,(V_{GS} - V_{TH})^2, \tag{3.3} \]
where $\mu$ includes velocity saturation and vertical field mobility degradation effects and is defined by
\[ \mu = \frac{\mu_0}{1 + \left(\theta + \frac{1}{L E_C}\right)(V_{GS} - V_{TH})}. \tag{3.4} \]
In this equation, $E_C$ is the electric field at which the mobility saturates and $\theta$ is a process dependent coefficient.

The weak and strong inversion regions are two distinct operational regions of a MOSFET. The transition region between these two is called the moderate inversion region, where both the drift and diffusion currents are important. As will be discussed later, moderate inversion is very important for ULP and ULV RF circuit design. The global inversion region of a transistor can be determined using its inversion coefficient (IC), defined by [131, 132]
\[ IC = \frac{I_D}{I_{D0}} = \left[\ln\left(1 + e^{\frac{V_{GS}-V_{TH}}{2nU_T}}\right)\right]^2. \tag{3.5} \]
Weak inversion corresponds to IC < 0.1. If 0.1 < IC < 10 then the transistor is in moderate inversion, and if IC > 10 the transistor is in strong inversion. It is necessary to find the required effective gate-source and drain-source saturation voltages for different inversion

regions. The effective voltage, $V_{eff} = V_{GS} - V_{TH}$, can be derived from (3.5) and is given by
\[ V_{eff} = 2nU_T \ln\left(e^{\sqrt{IC}} - 1\right). \tag{3.6} \]
Moreover, the drain-source saturation voltage, which is important to ensure that the device is biased in saturation, is defined by [131, 132]
\[ V_{DS,sat} = 2U_T\sqrt{IC + 0.25} + 3U_T. \tag{3.7} \]

Figure 3-1: The effective gate-source and drain-source saturation voltages with respect to inversion coefficient.

Fig. 3-1 shows the effective voltage, $V_{eff}$, and the drain-source saturation voltage, $V_{DS,sat}$, with respect to IC. As can be seen in the figure, the required $V_{eff}$ increases rapidly in the strong inversion region, in proportion to the square root of IC. The drain-source saturation voltage $V_{DS,sat}$ is almost constant in the weak inversion region and is equal to $4U_T \approx$ 104-mV. However, $V_{DS,sat}$ increases in the moderate and strong inversion regions with the square root of IC. Consequently, weak inversion provides the minimum effective and saturation voltages, which is why it is suitable for ultra-low voltage designs. On the other hand, the required $V_{DS,sat}$ for the moderate inversion

region varies between 108-mV and 244-mV (independent of the technology used), which is still suitable for ultra-low voltage designs.

3.3 Transconductance Efficiency

Using the ratio of transconductance to DC drain current ($g_m/I_D$) is a conventional approach for designing low power analog CMOS circuits [133]. The $g_m/I_D$ as a function of IC is given by [131]
\[ \frac{g_m}{I_D} = \frac{1}{nU_T}\,\frac{2}{\sqrt{1 + 4\,IC} + 1}. \tag{3.8} \]
However, in this part the impact of $V_{DS}$ variation on the $g_m/I_D$ will be studied. As $V_{DS}$ is reduced, the achievable $I_D$, and correspondingly the $g_m$ of the device, decreases due to channel length modulation. Interestingly, $I_D$ and $g_m$ are reduced by the same factor; hence the $g_m/I_D$ stays almost constant for different $V_{DS}$ values. Fig. 3-2 illustrates the simulation results for the $g_m/I_D$ curve for two $V_{DS}$ values with respect to the inversion coefficient (IC) for a transistor in the 90-nm CMOS technology with W=40-µm and L=100-nm, using the BSIM4 model. As can be seen in Fig. 3-2 and deduced from (3.8), the transconductance efficiency has a maximum of $1/(nU_T)$ in the deep weak inversion region. According to (3.8), the efficiency falls to about 0.62 of the maximum at the center of the moderate inversion region (IC = 1) and decreases with $1/\sqrt{IC}$ in the strong inversion region.

3.4 Output Conductance

Output conductance, $g_{ds}$, is the derivative of $I_D$ with respect to $V_{DS}$. Drain-source voltage reduction causes the $g_{ds}$ to increase drastically. The $g_{ds}$ in the saturation region is proportional to [132]
\[ g_{ds} \propto \frac{I_D}{V_A + V_{DS}} = \frac{I_{D0}\,IC}{V_A + V_{DS}}, \tag{3.9} \]
where $V_A$ is the Early voltage, which includes both channel length modulation and drain induced barrier lowering (DIBL) effects. Eq. (3.9) suggests that $g_{ds}$ increases as IC (and hence $V_{GS}$) increases. The value of $V_A$ due to channel length modulation is proportional to [132]
\[ V_A(\mathrm{CLM}) \propto L\,(V_{DS} - V_{DS,sat}). \tag{3.10} \]

According to (3.10), the $g_{ds}$ decreases directly with increasing gate length and with excess drain-source voltage above $V_{DS,sat}$. The other important contributor that affects $V_A$ in short channel devices operating at low levels of inversion is DIBL. DIBL commonly refers to the decrease in threshold voltage, $V_{TH}$, when $V_{DS}$ increases, especially in short channel devices. $V_A$ variations due to DIBL can be expressed by [132]
\[ V_A(\mathrm{DIBL}) = \frac{I_D}{g_m\left(-\frac{\partial V_{TH}}{\partial V_{DS}}\right)} = \frac{1}{\left(\frac{g_m}{I_D}\right)\left(-\frac{\partial V_{TH}}{\partial V_{DS}}\right)}. \tag{3.11} \]

Figure 3-2: The simulated $g_m/I_D$ characteristics (in V⁻¹) for an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values (0.2 V and 0.6 V), plotted against the inversion coefficient across the weak, moderate and strong inversion regions.

Fig. 3-3 shows $g_{ds}$ with respect to IC for an NMOS transistor in a 90-nm TSMC CMOS technology with W=70-µm and L=100-nm for three values of $V_{DS}$. As expected, $g_{ds}$ in weak inversion and moderate inversion increases almost linearly with IC. In the strong inversion region the rate of increase is reduced by CLM until the device enters the triode region, at which point $g_{ds}$ increases rapidly. This shows that high levels of inversion lead to a high $g_{ds}$, which limits the maximum achievable voltage gain. Moreover, $g_{ds}$ increases as $V_{DS}$ decreases, which gives rise to lower output resistance and lower gain. As highlighted on

the figure, in the middle of the moderate inversion region (IC = 1) the output conductance deteriorates by a factor of 1.5 as the drain-source voltage shrinks from 0.6-V to 0.2-V.

Figure 3-3: The variation of $g_{ds}$ of a MOS transistor in a 90-nm CMOS technology with respect to IC for three different $V_{DS}$ values.

3.5 Intrinsic Voltage Gain

The intrinsic voltage gain of a MOS transistor is the small signal low frequency gain of a common-source MOSFET with an ideal current source as load. This is simply the ratio $g_m/g_{ds}$. Employing (3.9) and (3.8), the intrinsic gain of a CMOS transistor can be found by
\[ \frac{g_m}{g_{ds}} = \left(\frac{g_m}{I_D}\right)\left(\frac{V_A + V_{DS}}{I_D}\right) I_D = \frac{1}{nU_T}\,\frac{2\,(V_A + V_{DS})}{\sqrt{1 + 4\,IC} + 1}. \tag{3.12} \]
It is imperative to study the impact of $V_{DS}$ reduction on the intrinsic gain of a MOS transistor. As discussed in the previous section, the $g_{ds}$ value increases when $V_{DS}$ is reduced. Therefore, considering that reducing $V_{DS}$ also decreases the $g_m$, the intrinsic voltage gain of the transistor, $g_m/g_{ds}$, is lowered noticeably. $g_m/g_{ds}$ is a good tool to characterize the achievable gain of the device. The simulation results for the intrinsic voltage gain of an NMOS transistor in a 90-nm CMOS technology with W=40-µm and

L=100-nm for different values of $V_{DS}$ are shown in Fig. 3-4. As can be seen in the figure, there is about a 45% reduction in the intrinsic voltage gain in the moderate inversion region.

Figure 3-4: The simulated intrinsic voltage gain of an NMOS transistor in a 90-nm CMOS for three different $V_{DS}$ values.

3.6 Transit Frequency

The other important characteristic of a MOSFET that should be studied is the transit frequency, $f_T$. The $f_T$ of a device is the frequency where the gate-to-drain current gain, $h_{21}$, is unity for a grounded-source device. The $f_T$ can be expressed as a function of IC by [132]
\[ f_T = \left(\frac{2\,IC}{\sqrt{1 + 4\,IC} + 1}\right)\left(\frac{\mu_0 U_T}{\pi L^2\,(\hat{C}_{gsi} + \hat{C}_{gbi})}\right), \tag{3.13} \]
where $\hat{C}_{gsi}$ and $\hat{C}_{gbi}$ are the intrinsic gate-source and gate-bulk capacitances, respectively, normalized to the gate-oxide capacitance. These capacitances vary depending on the region of operation and can be found by
\[ \hat{C}_{gsi} = \left(\frac{0}{3},\ \frac{1}{3},\ \frac{2}{3}\right), \qquad \hat{C}_{gbi} = \left(\frac{3}{3},\ \frac{2}{3},\ \frac{1}{3}\right)\frac{n-1}{n}, \tag{3.14} \]

for weak, moderate and strong inversion, respectively. In strong inversion and under full velocity saturation the $f_T$ can be expressed by
\[ f_T = \frac{E_C\,\mu_0}{4\pi L\left(n - \frac{1}{3}\right)}. \tag{3.15} \]
The $V_{DS}$ reduction also lowers the $f_T$. This reduction can be explained by the fact that $f_T$ is proportional to the $g_m$, which decreases with $V_{DS}$ reduction. Fig. 3-5 illustrates the simulation results for the transit frequency of an NMOS transistor in a 90-nm CMOS technology with W=40-µm and L=100-nm for two different values of $V_{DS}$. As demonstrated in the figure, the achieved $f_T$ for a $V_{DS}$ of 0.2-V is 27% lower than the $f_T$ for a $V_{DS}$ of 0.6-V. It should be noted that, when $V_{DS}$ is at 0.2-V, the peak of the $f_T$ happens at lower ICs, since the transistor enters the triode region at higher ICs. This $f_T$ reduction deteriorates the performance, namely the NF and gain, of the RF front-end circuits.

Figure 3-5: The simulated transit frequency (in GHz) of an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values (0.6 V and 0.2 V, with a 27% reduction marked).

3.7 Noise Figure

The noise characteristics of a MOSFET are highly important for LNA design. The minimum noise figure, $NF_{min}$, of a MOS transistor is the noise figure at the optimum source resistance. The $NF_{min}$ also varies with respect to $V_{DS}$. The $NF_{min}$ is inversely proportional

to the square root of $g_m$ and hence increases as $V_{DS}$ decreases. The simulated $NF_{min}$ of an NMOS transistor in a 90-nm CMOS technology with W=40-µm and L=100-nm for two values of $V_{DS}$ is shown in Fig. 3-6. It is interesting to note that the IC at which the minimum occurs is almost unchanged.

Figure 3-6: The simulated minimum NF of an NMOS transistor in a 90-nm CMOS technology for two different $V_{DS}$ values.

3.8 Linearity

Low supply voltages also limit the achievable linearity in the LNAs. To further investigate the effect of supply voltage on the non-linear behaviour of a MOSFET as a weakly non-linear system, the non-linear drain current ($i_{ds}$) of a transistor can be expressed in terms of $v_{gs}$ and $v_{ds}$ by a two-dimensional Taylor series
\[ i_{ds}(v_{gs}, v_{ds}) = g_m v_{gs} + g_{ds} v_{ds} + g_m' v_{gs}^2 + g_{ds}' v_{ds}^2 + g_m'' v_{gs}^3 + g_{ds}'' v_{ds}^3, \tag{3.16} \]
where the Taylor coefficients can be derived from
\[ g_m^{(k)} = \frac{1}{k!}\,\frac{\partial^k I_{DS}}{\partial V_{GS}^k}; \qquad g_{ds}^{(k)} = \frac{1}{k!}\,\frac{\partial^k I_{DS}}{\partial V_{DS}^k}. \tag{3.17} \]
The cross terms have been ignored for simplicity. It is known that $g_m''$ is the strongest contributor to the third-order distortion in the circuits. However, it will be shown that in

deep sub-micron technologies, and specifically at low $V_{DS}$ values, $g_{ds}''$ deteriorates the linearity of the circuit as well. Fig. 3-7(a) shows the $g_m''$ variation with respect to the IC for multiple values of $V_{DS}$. As can be seen, as long as the transistor is in saturation, $g_m''$ has the same characteristics for different $V_{DS}$ values. It should be noted that the zero crossing of $g_m''$ does not depend on $V_{DS}$ and happens at an IC of 1.2. Despite its variation with process and temperature, this sweet spot has been used for circuit linearisation in the past [134]. Moreover, Fig. 3-7(b) illustrates $g_{ds}''$ versus $V_{DS}$ for multiple values of IC. It can be seen that $g_{ds}''$ increases with increasing IC, which is due to higher current and lower output impedance at higher ICs. Also, it is interesting to note the variation of $g_{ds}''$ with $V_{DS}$: it decreases significantly as $V_{DS}$ increases. For example, at an IC of 1, $g_{ds}''$ is 20 times higher at $V_{DS}$=0.2-V than at $V_{DS}$=0.6-V. Hence, the non-linearity due to $g_{ds}''$ should be considered at low supply voltages.

Figure 3-7: (a) The simulated third-order distortion of a MOSFET due to $g_m''$ for three different $V_{DS}$ values (0.2 V, 0.4 V and 0.6 V), plotted against the inversion coefficient. (b) The third-order distortion due to $g_{ds}''$ for three IC values (0.1, 1 and 10), plotted against $V_{DS}$.

3.9 Summary

In this chapter the impact of an ultra-low supply voltage on the performance of CMOS transistors was discussed. Table 3-1 summarizes all of the impacts of supply

voltage reduction that were discussed in this chapter for an NMOS transistor in a 90-nm CMOS technology when $V_{DS}$ is reduced from 0.6-V to 0.2-V.

Table 3-1: Device Dimensions and Component Values

Design parameter | Ultra-low voltage supply impact
Transconductance efficiency ($g_m/I_D$) | Unchanged
Intrinsic gain ($g_m/g_{ds}$) | 45% reduction
Transit frequency ($f_T$) | 27% reduction
Noise factor | Increases slightly
Nonlinearity due to $g_m$ | Unchanged (up to the triode region)
Nonlinearity due to $g_{ds}$ | 20 times larger (at IC=1)
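The inversion-coefficient relations reviewed in this chapter, (3.5) to (3.8), can be collected into a short numerical sketch. It assumes the standard EKV forms of these expressions (including the 0.25 and $3U_T$ terms in $V_{DS,sat}$), a thermal voltage of 26 mV, and a substrate factor n = 1.3; the function names and the value of n are illustrative assumptions, not values taken from the simulated 90-nm device.

```python
import math

U_T = 0.026  # thermal voltage kT/q in volts (assumed, about 300 K)
N = 1.3      # substrate factor (assumed; the text only bounds it to 1..2)


def inversion_coefficient(v_eff, n=N, u_t=U_T):
    """IC from V_eff = V_GS - V_TH, following (3.5)."""
    return math.log1p(math.exp(v_eff / (2 * n * u_t))) ** 2


def effective_voltage(ic, n=N, u_t=U_T):
    """V_eff required for a given IC > 0, following (3.6)."""
    if ic <= 0:
        raise ValueError("IC must be positive")
    return 2 * n * u_t * math.log(math.expm1(math.sqrt(ic)))


def vds_sat(ic, u_t=U_T):
    """Drain-source saturation voltage, following (3.7)."""
    return 2 * u_t * math.sqrt(ic + 0.25) + 3 * u_t


def gm_over_id(ic, n=N, u_t=U_T):
    """Transconductance efficiency in 1/V, following (3.8)."""
    return 2.0 / (n * u_t * (math.sqrt(1 + 4 * ic) + 1))


def inversion_region(ic):
    """Classify IC using the 0.1 and 10 boundaries given in the text."""
    if ic < 0.1:
        return "weak"
    if ic <= 10:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    for ic in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(ic, inversion_region(ic), effective_voltage(ic),
              vds_sat(ic), gm_over_id(ic))
```

With these assumptions, `vds_sat` evaluates to $4U_T$ at very small IC and to roughly 244 mV at IC = 10, which agrees with the values quoted for the weak and moderate inversion regions.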

CHAPTER 4
Ultra-Low Voltage and Ultra-Low Power Design Techniques

4.1 Introduction

In the previous chapter some of the circuit design challenges with ultra-low voltage supplies were presented. In this chapter, methods of overcoming these challenges will be discussed while keeping in mind the goal of low power consumption with ultra-low supply voltages. At first, general techniques to realize ultra-low power circuits that are suitable for ultra-low voltage environments, such as a new bias point for ULP and ULV operation, current-reuse and forward body biasing, are discussed. Following this, bandwidth enhancement schemes and wideband matching techniques that are suitable for ultra-low power and ultra-low voltage circuits will be presented.

4.2 Ultra-Low Power Design Techniques

Ultra-Low Power and Ultra-Low Voltage Biasing Scheme

In this section, we present a new biasing point for ultra-low power and ultra-low voltage operation of a single transistor. The goal is to find a biasing region where the transistor has acceptable transit frequency (bandwidth), gain and noise performance while consuming very little power from a low supply voltage. As discussed earlier, for a common-source MOSFET with a current source at the drain, the low frequency gain is $g_m/g_{ds}$ and the noise factor can be found by $F = 1 + (\gamma/\alpha)(f/f_T)\,g_m R_S$. As stated before, the transconductance efficiency is defined by $g_m/I_D$. This has been incorporated in the figures of merit (FOMs) of various biasing schemes for low power circuit design, such as $FOM = (g_m/I_D)\,f_T$ [135]. However, as discussed earlier, $V_{DS}$ variation has severe impacts on the intrinsic gain (specifically on the $g_{ds}$ of a transistor). Hence, the effect of $V_{DS}$ on the intrinsic gain must be included to reflect the impact of low supply voltages. As such, the modified FOM used here is defined as the product of the transconductance efficiency, intrinsic gain and transit

frequency as given below [22, 34]
\[ \text{Biasing Metric}_{ULP,ULV} = \left(\frac{g_m}{I_D}\right)\left(\frac{g_m}{g_{ds}}\right) f_T. \tag{4.1} \]
This FOM represents the bias point that achieves the lowest possible noise figure and the highest possible gain and bandwidth while burning the lowest possible current. However, the bandwidth and NF can be traded for higher voltage gain and lower power consumption as we move to lower inversion coefficients, and vice versa. Fig. 4-1 highlights the proposed FOM for two values of $V_{DS}$. As can be seen, the peak for both curves occurs in the moderate inversion region; nevertheless, the overall performance of the transistor is degraded by 60% at ultra-low voltage supplies. Also, it is interesting to note the dependency of the FOM on $V_{DS}$. As can be seen in the figure, the peak of the FOM is shifted towards the weak inversion region for $V_{DS}$ = 0.2-V.

Figure 4-1: The simulation of the extended biasing metric for an NMOS transistor for two different $V_{DS}$ values (0.2 V and 0.6 V), plotted against the inversion coefficient.

In general, the advantages associated with biasing the transistors in weak and moderate inversion regions are:

84 4.2. ULTRA-LOW POWER DESIGN TECHNIQUES 1. The required bias voltages (V eff and V DS,sat ) in weak and moderate inversion regions are lower than strong inversion regions which facilitates low voltage design. 2. The transconductance efficiency, g m /I D, is larger in weak and moderate inversion region allowing low power design. 3. The intrinsic gain, g m /g ds, is higher in weak and moderate inversion regions and then reduces sharply in the strong inversion region which facilitates achieving high gain low power solutions. The lower electrical fields within the device due to lower bias voltages avoid velocity saturation and hot electron effects in the device. Without velocity saturation, f T scales with 1/L 2 rather than 1/L when velocity saturation exists. Therefore, scaling is more effective for devices biased in weak inversion and moderate inversion regions rather than strong inversion regions [131, 132] Current-reuse Current-reuse is an important technique in the implementation of ultra-low power circuits. In this scheme, the DC current is shared between two or more transistors while each transistor contributes to the total gain. It reduces the overall power consumption and improves the current efficiency. This technique has been widely used in the literature [119, 136, 137] to reduce the overall power consumption of the LNAs. Moreover, complementary current-reuse [22, 27, 98, 122, 138, 139] is a technique that employs both NMOS and PMOS transistors to take advantage of their complementary characteristics has led to ULP designs and some additional advantages like distortion and noise cancellation. Fig. 4 2 highlights three complementary current-reuse architectures that provide additional performance benefits to an LNA. Fig. 4 2(a) shows a complementary current-reuse scheme that provides second-order distortion cancellation through the complementary characteristics of the NMOS and PMOS transistors [98]. 
A $g_m$-boosted current-reuse LNA for ultra-wideband applications is shown in Fig. 4-2(b), where the PMOS stage boosts the $g_m$ of the input common-gate stage to reduce the power required for wideband input matching [139]. Fig. 4-2(c) shows a noise cancelling complementary current-reuse scheme for UWB receivers [27]. In this architecture, the PMOS transistor is the input common-gate stage, while the NMOS is the common-source amplifier that amplifies the input signal and the noise of M1. When the signals from these two paths are combined at the output node, the input signal adds constructively while the noise from M1 is cancelled out.

Figure 4-2: The application of complementary current-reuse in wideband LNAs (a) with second-order distortion cancellation [98], (b) with $g_m$-boosting [139], and (c) with noise cancellation [27].

4.3 Forward Body Biasing

Forward body biasing has been widely used in the literature to improve multiple circuit characteristics [123, 124, ]. In this section, we describe the impact of forward body biasing (FBB) on the characteristics of a MOS transistor, including its impact on the threshold voltage and the required gate-source voltage. Most importantly, a thorough analysis of the ability of FBB to mitigate short channel effects in the transistors is provided. Moreover, it will be shown how these characteristics can be exploited to realize ultra-low voltage circuits.

4.3.1 Threshold Voltage Reduction

Low supply voltages greatly restrict the choice of overdrive and drain-source voltages that can be used for biasing a MOS transistor, which limits the achievable gain, linearity,

and maximum operating frequency of the circuits. Considering that the threshold voltage of the devices has not been reduced noticeably by technology scaling, because of leakage currents, FBB is an attractive technique for implementing ULP and ULV circuits. The best-known impact of FBB is the modulation of the threshold voltage in FET devices. The threshold voltage ($V_{TH}$) of a MOS transistor is given by

$$V_{TH} = V_{TH0} + \alpha\left(\sqrt{2\phi_F - V_{BS}} - \sqrt{2\phi_F}\right), \tag{4.2}$$

where $\alpha$ is a process-dependent body-effect parameter (in units of V$^{1/2}$) and $\phi_F$ is the substrate Fermi potential. As can be seen in (4.2), increasing the body-source voltage decreases $V_{TH}$. A direct consequence of the $V_{TH}$ reduction is the lower gate-source voltage required to bias the transistor in the desired region of operation. This is best seen by examining the overdrive voltage, which for a MOS transistor is given by [131]

$$V_{eff} = V_{GS} - V_{TH} = 2nU_T \ln\left(e^{\sqrt{IC}} - 1\right). \tag{4.3}$$

To highlight the impact of FBB on the required gate-source voltage, Fig. 4-3 presents simulation results for the required $V_{GS}$ versus the inversion coefficient for two values of FBB voltage. As can be seen in the figure, FBB reduces the required $V_{GS}$ for a given IC, which is very desirable for ULV design. For example, to obtain an IC of 1, FBB reduces the required $V_{GS}$ by 100 mV. On the other hand, the drain-source saturation voltage (which is important to ensure that the device is biased in saturation) is defined by [131]

$$V_{DS,sat} = 2U_T\sqrt{IC + 0.25} + 3U_T. \tag{4.4}$$

As can be seen in the equation, the required $V_{DS,sat}$ is constant for a given IC and hence does not scale with FBB.
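The two relations (4.2) and (4.3) can be combined to estimate how much gate-source voltage a given inversion coefficient needs with and without FBB. The following is a minimal sketch rather than the simulation setup used for Fig. 4-3: the function names and every numerical value ($V_{TH0}$, the body-effect parameter, $\phi_F$, the slope factor $n$ and $U_T$) are assumptions chosen only for illustration.

```python
import math

# All default values below are illustrative assumptions, not thesis data.
U_T = 0.0259  # thermal voltage near room temperature, in volts


def threshold_voltage(v_bs, v_th0=0.35, body_factor=0.3, phi_f=0.4):
    """Eq. (4.2): V_TH = V_TH0 + alpha*(sqrt(2*phi_F - V_BS) - sqrt(2*phi_F))."""
    return v_th0 + body_factor * (
        math.sqrt(2.0 * phi_f - v_bs) - math.sqrt(2.0 * phi_f)
    )


def overdrive_voltage(ic, n=1.3, u_t=U_T):
    """Eq. (4.3): V_eff = 2*n*U_T*ln(exp(sqrt(IC)) - 1)."""
    return 2.0 * n * u_t * math.log(math.exp(math.sqrt(ic)) - 1.0)


def required_vgs(ic, v_bs):
    """Gate-source voltage needed to reach inversion coefficient `ic`."""
    return threshold_voltage(v_bs) + overdrive_voltage(ic)


if __name__ == "__main__":
    for ic in (0.1, 1.0, 10.0):
        grounded = required_vgs(ic, v_bs=0.0)
        forward = required_vgs(ic, v_bs=0.5)
        print(f"IC={ic:>4}: V_GS={grounded:.3f} V (V_BS=0), "
              f"{forward:.3f} V (V_BS=0.5 V)")
```

Because $V_{eff}$ in (4.3) depends only on IC, the difference between the two printed columns is just the threshold shift from (4.2). In this simple model FBB therefore lowers the required $V_{GS}$ by the same amount at every inversion level.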

Figure 4-3: The variation of the required $V_{GS}$ versus inversion coefficient for two FBB values ($V_{sub}$ = 0 V and $V_{sub}$ = 0.5 V).

4.3.2 Short Channel Effect Mitigation

Traditionally, FBB is employed only to reduce the $V_{TH}$ of transistors, and hence to improve the speed of the circuits and facilitate operation from low supply voltages. However, it is shown in this section that, besides these benefits, FBB improves the intrinsic device characteristics by mitigating short channel effects [143, 144]. In short channel devices, the gate is not the only terminal that has control over the channel; the drain also controls the channel through the size of the depletion region it creates. The impact of the drain on the channel potential rises as the drain voltage increases, and the result is a reduction in the $V_{TH}$ of the device. This phenomenon is called drain induced barrier lowering (DIBL), and it is also responsible for drain conductance degradation, especially in short channel devices operating at low inversion levels. The impact of the drain on the channel can be mitigated through FBB. As the body potential increases, the drain depletion region shrinks and hence the DIBL impact on the device diminishes. Fig. 4-4 shows the DIBL effect and the impact of FBB on a triple-well NMOS transistor in a bulk CMOS technology. Fig. 4-4(a) shows an NMOS transistor with FBB = 0, which has a large depletion region around the drain contact. By applying a positive voltage to the body terminal, the depletion region around

the drain diminishes and the impact of the drain on the channel is reduced, as highlighted in Fig. 4-4(b).

Figure 4-4: An NMOS transistor cross section for two $V_{sub}$ values: (a) with $V_{sub}$ = 0, the depletion region around the drain controls the channel (DIBL effect); (b) with FBB, the impact of the drain on the channel is reduced.

The main benefit of the reduced drain impact on the channel is that the output conductance, and thus the $g_m/g_{ds}$ of a MOSFET, is improved. This enhancement is measured for two drain-source voltages for an NMOS transistor with W/L = 40 µm/120 nm fabricated in a 0.13-µm IBM CMOS technology, and is plotted in Fig. 4-5(a). Note that $g_m/g_{ds}$ is reduced by almost 30% when $V_{DS}$ is lowered from 0.6 V to 0.2 V. For both of these $V_{DS}$ levels, FBB improves $g_m/g_{ds}$ by 15% to 20% in the middle of the moderate inversion (MI) region (IC = 1) without any increase in current consumption. Fig. 4-5(b) plots $g_m/g_{ds}$ versus $V_{sub}$ for an IC of 1. This plot illustrates the $g_m/g_{ds}$ variation with respect to the substrate voltage for two drain-source voltages, with the drain current being equal in both cases. The improvement in $g_m/g_{ds}$ due to FBB is clear, and it comes with no additional current consumption.

Figure 4-5: The impact of FBB on the characteristics of an NMOS transistor. (a) Measured $g_m/g_{ds}$ improvement due to FBB for two drain-source voltages. (b) The $g_m/g_{ds}$ improvement at IC = 1 (middle of the moderate inversion region).

The improvement in $g_m/g_{ds}$ stems mainly from the reduction in the output conductance of the transistor. This can be inferred from Fig. 4-6, which shows that FBB causes a negligible change in $g_m$ with respect to the IC.

4.3.3 Improvements in Ultra-Low Voltage Circuit Design

The intrinsic gain enhancement discussed in the previous section can be used in ULV circuits to improve the design parameters without additional power consumption. Fig. 4-7

(a) shows how FBB can be advantageous in ULV circuits. By using FBB, the $g_m/g_{ds}$ of a ULV circuit can be boosted to the levels that would otherwise be obtained by using a higher supply voltage. This example shows that using FBB in a circuit operating from a 0.3-V supply can enhance $g_m/g_{ds}$ to match the performance of the same circuit operated from a 0.5-V supply. Both circuits have the same current consumption, so the circuit operating from the higher supply voltage consumes more power. Fig. 4-7(b) plots $g_m/g_{ds}$ at an IC of 1 as a function of $V_{DD}$ for two $V_{sub}$ values to highlight the level of improvement in $g_m/g_{ds}$ that can be obtained in ULV circuits. These results show that FBB is a very useful scheme to realize ULV and ULP circuits, and this will be demonstrated through the design of a high performance LNA in the next section.

Figure 4-6: The $g_m$ of a (40-µm/120-nm) NMOS transistor for two $V_{DS}$ values versus IC. It is shown that by varying $V_{sub}$, the change in $g_m$ with respect to the IC is very small, since both parameters increase as $V_{TH}$ is decreased.

The impact of FBB on the overall performance of an NMOS transistor can be evaluated by using the ULP and ULV MOS transistor biasing metric introduced in (4.1) [22]. Fig. 4-8 illustrates the impact of FBB on the biasing metric for two drain-source voltages ($V_{DS}$ = 0.6 V and $V_{DS}$ = 0.2 V). As can be seen in this figure, FBB improves the overall performance of a MOS transistor for both drain-source voltages. In the example

where $V_{DS}$ = 0.6 V, a 15% improvement is achieved through FBB, due to the improvement in output conductance. This provides further validation that the FBB technique boosts the overall performance in ULV circuits without additional power consumption.

Figure 4-7: (a) Example of how FBB can recover the performance degradation caused by the supply voltage reduction for circuits with the same current consumption: a 40-µm/0.12-µm device at $V_{dd1}$ = 0.5 V with FBB = 0 V and at $V_{dd2}$ = 0.3 V with FBB = 0.5 V shows the same $g_m/g_{ds}$, while $P_1 = 1.6 \times P_2$. (b) The $g_m/g_{ds}$ improvement of a transistor at IC = 1 (middle of the moderate inversion region) versus $V_{DD}$. The transistor shows the same $g_m/g_{ds}$ for $V_{DD}$ = 0.3 V and $V_{DD}$ = 0.5 V.

4.3.4 Impact of Forward Body Biasing on the Linearity of the Transistor

The linearity of a MOS transistor is also affected by FBB, and it is important to compare the nonlinear behaviour of a forward body biased transistor with that of a grounded-bulk transistor at the same current consumption. The simplified Taylor series representation of the drain

current is given by

$$i_{ds} = g_m v_{gs} + g_{ds} v_{ds} + g_{mb} v_{bs} + \frac{g_m'}{2} v_{gs}^2 + \frac{g_{ds}'}{2} v_{ds}^2 + \frac{g_{mb}'}{2} v_{bs}^2 + \frac{g_m''}{6} v_{gs}^3 + \frac{g_{ds}''}{6} v_{ds}^3 + \frac{g_{mb}''}{6} v_{bs}^3, \tag{4.5}$$

where the Taylor coefficients can be derived from

$$g_m^{k} = \frac{1}{k!}\frac{\partial^k I_{DS}}{\partial V_{GS}^k}; \quad g_{ds}^{k} = \frac{1}{k!}\frac{\partial^k I_{DS}}{\partial V_{DS}^k}; \quad g_{mb}^{k} = \frac{1}{k!}\frac{\partial^k I_{DS}}{\partial V_{BS}^k}. \tag{4.6}$$

In this analysis the cross terms have been ignored for simplicity.

Figure 4-8: The simulation results for the extended biasing metric variation with FBB, for $V_{DS}$ = 0.6 V and 0.2 V and for $V_{sub}$ = 0 V and 0.5 V.

Three main nonlinearity sources are considered for a MOS transistor: the second- and third-order terms associated with $g_m$, $g_{ds}$ and $g_{mb}$. It is known that $g_m''$ is the strongest contributor to the third-order distortion [ ]. A grounded-bulk transistor and a forward body biased transistor have almost identical $g_m'$ and $g_m''$ profiles, since both transistors are assumed to carry the same current and the simulation results presented earlier showed that the $g_m/I_D$ of a transistor is unchanged by FBB. As a result, we can assume that the nonlinearity contributed by $g_m'$ and $g_m''$ is the same in both cases. The second source of distortion is the $g_{ds}$ of the device. Fig. 4-9(a) and (b) show the behaviour of $g_{ds}$ and $g_{ds}''$, respectively, versus $V_{DS}$ for two $V_{sub}$ values at an IC of 1 (for the

same drain currents). As can be seen in the figure (and as discussed earlier), FBB improves $g_{ds}$. However, its impact on $g_{ds}''$ is negligible, which means that this type of distortion is the same in a grounded-bulk transistor and in a forward body biased transistor. On the other hand, FBB creates a new source of distortion through the bulk transconductance. This term adds to the second- and third-order distortion terms and slightly degrades the linearity of a forward body biased common-source transistor.

Figure 4-9: The variation of (a) $g_{ds}$ and (b) $g_{ds}''$ of a (40-µm/120-nm) NMOS transistor for two $V_{sub}$ values versus drain-source voltage at IC = 1. It is shown that by applying FBB, $g_{ds}$ improves while its third-order distortion coefficient, $g_{ds}''$, is almost unchanged.
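The coefficients in (4.5) can be extracted from a simulated or measured I-V sweep by numerical differentiation, which is one way to compare a grounded-bulk and a forward body biased device at the same current. The following is a minimal sketch, assuming central finite differences and a toy EKV-style drain-current model. The model, its parameters and the function names are illustrative assumptions and do not represent the device data behind Fig. 4-9.

```python
import math


def drain_current(v_gs, v_th=0.35, n=1.3, u_t=0.0259, i_spec=1e-4):
    """Toy EKV-style interpolation from weak to strong inversion.

    Assumed model for illustration only: I_D = I_spec * ln^2(1 + exp(x)),
    with x = (V_GS - V_TH) / (2*n*U_T).
    """
    x = (v_gs - v_th) / (2.0 * n * u_t)
    return i_spec * math.log1p(math.exp(x)) ** 2


def derivatives(f, x0, h=1e-3):
    """First, second and third derivatives of f at x0 (central differences)."""
    f_m2, f_m1, f_0, f_p1, f_p2 = (f(x0 + k * h) for k in (-2, -1, 0, 1, 2))
    d1 = (f_p1 - f_m1) / (2.0 * h)
    d2 = (f_p1 - 2.0 * f_0 + f_m1) / h ** 2
    d3 = (f_p2 - 2.0 * f_p1 + 2.0 * f_m1 - f_m2) / (2.0 * h ** 3)
    return d1, d2, d3


if __name__ == "__main__":
    # g_m, g_m' and g_m'' at an assumed bias point; in (4.5) the last two
    # are divided by 2 and 6 before multiplying v_gs^2 and v_gs^3.
    g_m, g_m1, g_m2 = derivatives(drain_current, 0.35)
    print(f"g_m = {g_m:.3e} S, g_m' = {g_m1:.3e} S/V, g_m'' = {g_m2:.3e} S/V^2")
```

Applying the same routine to sweeps over $V_{DS}$ or $V_{BS}$ would give the $g_{ds}$ and $g_{mb}$ terms. The step size `h` trades truncation error against numerical noise and would need tuning for real sweep data.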

4.4 Wideband Matching and Bandwidth Enhancement Techniques

4.4.1 Shunt-Feedback

One of the main challenges in the design of ultra-low power and ultra-low voltage low noise amplifiers is to provide wideband input matching within the available power budget. Cascaded resonant circuits can provide wideband matching with very little additional noise, but they can consume significant chip area [88]. A common-gate topology can also provide a wideband input match; however, a current of at least 1.7 mA is required to provide 50-Ω matching in the chosen 130-nm CMOS technology if the amplifier is in moderate inversion (with a $g_m/I_D$ of 12 V$^{-1}$ in this region). This leads to an overall power consumption greater than 1.7 mW with $V_{dd}$ = 1 V and 0.85 mW with $V_{dd}$ = 0.5 V. Hence, design techniques are required that enable low power input matching, preferably without increasing the noise figure.

Figure 4-10: Shunt-feedback in (a) a common-source amplifier and (b) a common-gate stage.

Shunt-feedback has shown great potential for providing wideband low power input matching [22, 106, 119, 148]. Fig. 4-10 illustrates possible shunt-feedback implementations for both common-source and common-gate gain stages. Fig. 4-10(a) shows the conventional common-source resistive shunt-feedback amplifier. The input resistance of this architecture is given by

$$R_{in} = \frac{R_O + R_f}{1 + g_m R_O}, \tag{4.7}$$

where $R_O$ is the output resistance of the NMOS in parallel with that of the PMOS, and $g_m = g_{m1} + g_{m2}$. Since the feedback is resistive, this results in a wideband and low power solution. Fig. 4-10(b) outlines another viable option for low power wideband matching, in which shunt-feedback around a common-gate amplifier is used [106, 119, 125, 149]. The effect of feedback on the input resistance of the amplifier is given by

$$R_{in} = \frac{1}{g_m (1 + A)}, \tag{4.8}$$

where $A$ is the feedback factor. Since the input resistance is reduced by the gain of the feedback path, a wideband 50-Ω input match can be achieved while still maintaining very low power consumption. The details of the resistive and active shunt-feedback techniques in the implementation of ultra-low power LNAs will be discussed in the next chapters.

4.4.2 Bandwidth Extension With $g_m$-boosting

Ultra-low power and ultra-low voltage design criteria reduce the available design options and impose restrictions on the operating region of the transistors, which makes it difficult to achieve a high $f_T$ in the devices. An ultra-low power transconductance boosting scheme helps overcome this challenge, and becomes necessary when operating at high frequencies. Two different techniques have been developed in this work, one for common-gate LNAs and one for resistive shunt-feedback LNAs. This section discusses $g_m$-boosting for common-gate LNAs; the other bandwidth extension technique is discussed in the next chapter. Transconductance boosting can be achieved without any additional power consumption by adding an inductor at the gate of a common-gate transistor to boost its $g_m$ at the resonance frequency of the inductor and the parasitic gate-source capacitance [27]. The schematic of this technique is shown in Fig. 4-11. As can be seen in this figure, the parasitic gate-source capacitance creates a resonant circuit with the inductor added at the gate.
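The effect of this resonance can be previewed numerically before the derivation. The sketch below evaluates the effective transconductance of a common-gate stage with and without a gate inductor, using the expressions that appear as (4.9) and (4.10) in this section. The element values ($g_m$, $C_{gs}$, $R_s$, $L_g$) and the function names are assumptions picked for illustration; they are not the design values of the LNA.

```python
import math


def gm_eff(freq_hz, g_m, c_gs, r_s, l_g=0.0):
    """Effective transconductance of a common-gate stage.

    l_g = 0 gives (4.9); l_g > 0 adds the -w^2*L_g*C_gs term of (4.10).
    """
    w = 2.0 * math.pi * freq_hz
    denom = 1.0 + g_m * r_s + 1j * w * c_gs * r_s - w ** 2 * l_g * c_gs
    return g_m / denom


def resonance_hz(l_g, c_gs):
    """Resonance frequency of the gate inductor with C_gs."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_g * c_gs))


if __name__ == "__main__":
    # Assumed example values: 10 mS, 100 fF, 50-ohm source, 5 nH inductor.
    g_m, c_gs, r_s, l_g = 10e-3, 100e-15, 50.0, 5e-9
    print(f"resonance near {resonance_hz(l_g, c_gs) / 1e9:.2f} GHz")
    for f_ghz in (0.1, 1.0, 3.0, 5.0, 7.0):
        plain = abs(gm_eff(f_ghz * 1e9, g_m, c_gs, r_s))
        boosted = abs(gm_eff(f_ghz * 1e9, g_m, c_gs, r_s, l_g))
        print(f"{f_ghz:4.1f} GHz: {plain * 1e3:.2f} mS -> {boosted * 1e3:.2f} mS")
```

At low frequency both cases reduce to $g_m/(1 + g_m R_s)$, so any difference between the two printed columns comes from the $\omega^2 L_g C_{gs}$ term alone.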
To verify this scheme, the effective transconductance ($G_{m,eff}$) of a common-gate transistor is found with and without an inductor at its gate. The $G_{m,eff}$ of a common-gate

transistor can be found by deriving the output current of the device with respect to the input voltage, with the parasitic capacitances of the device and the source resistance included. The exact $G_{m,eff}$ of a common-gate transistor and its high frequency approximation are given by

$$G_{m,eff} = \frac{g_m}{1 + g_m R_s + j\omega C_{gs} R_s} \approx \frac{\omega_T}{\omega} \cdot \frac{1}{jR_s}. \tag{4.9}$$

Figure 4-11: The schematic of the proposed inductive $g_m$-boosting technique for common-gate transistors and the elements included in the calculation of $G_{m,eff}$.

It can be observed from (4.9) that $G_{m,eff}$ decreases with frequency. However, placing an inductor at the gate of a common-gate transistor creates a feedback network that boosts the $v_{gs}$ of the transistor by $1/(1 - \omega^2 C_{gs} L_g)$. This gives rise to an effective $g_m$ of

$$G_{m,eff} = \frac{g_m}{1 + g_m R_s + j\omega C_{gs} R_s - \omega^2 L_g C_{gs}}. \tag{4.10}$$

At high frequencies, (4.10) simplifies to

$$G_{m,eff} \approx \frac{\omega_T}{\omega} \cdot \frac{1}{jR_s - \omega L_g}. \tag{4.11}$$

Therefore, the additional inductor at the gate reduces the denominator of $G_{m,eff}$ up to the resonance frequency of $L_g$ and $C_{gs}$. These simplified equations are modelled in MATLAB and plotted in Fig. 4-12 using the design values for this LNA. The plot shows the case where no $g_m$-boosting is implemented and the cases where $g_m$-boosting is implemented with different inductance values. As can be seen, $G_{m,eff}$ decreases with frequency for a

conventional common-gate transistor, whereas the inductive $g_m$-boosting technique increases $G_{m,eff}$ until the resonant frequency of $L_g$ and $C_{gs}$ is reached.

Figure 4-12: The effective $G_m$ of a common-gate transistor versus frequency, with inductive $g_m$-boosting for multiple inductor values and without inductive $g_m$-boosting.

4.5 Summary

In this chapter, multiple design techniques were proposed to improve the device characteristics and circuit performance under ultra-low power and ultra-low voltage conditions. A new biasing technique for ULP and ULV circuits was proposed. The current-reuse technique was presented as a vital scheme to reduce power consumption. Moreover, it was shown that forward body biasing boosts the intrinsic gain of an ultra-low voltage circuit to the levels that would be obtained by using a higher supply voltage. Finally, wideband input matching techniques and bandwidth extension schemes were presented. Table 4-1 summarizes all the proposed design techniques. The proposed techniques target most aspects of ultra-low power and ultra-low voltage design. Their main results are bandwidth extension without any increase in power consumption, higher intrinsic gain with no additional power consumption, improved input impedance matching, higher current efficiency, and a new bias point for an individual transistor.

Table 4-1: Ultra-low power and ultra-low voltage design techniques

| ULP and ULV design techniques | Impact on the design |
| ULP and ULV biasing scheme ($FOM_{ULP,ULV}$) | New bias point for an individual device; suitable for ULP, ULV design |
| Current-reuse | Improved current efficiency; suitable for ULP design |
| Forward body biasing | Reduced threshold voltage; increased intrinsic gain; no power consumption penalty |
| Shunt-feedback | Ultra-low power input impedance matching |
| Inductive $g_m$-boosting | Bandwidth extension; no power consumption penalty |

CHAPTER 5
A Sub-mW 0.5-V Resistive Shunt-Feedback Low Noise Amplifier

5.1 Introduction

A low noise amplifier is the first active component in the front-end of the receiver, and is generally considered one of the most power hungry blocks. The high power consumption stems from the fact that an LNA must simultaneously provide wideband matching, high gain, low noise and high linearity, all of which typically require high power and high supply voltages. These combined specifications have made the design of low power and low voltage UWB LNAs a challenging research topic. The resistive shunt feedback architecture is a viable solution for wideband LNA design. It involves placing a feedback resistor around a common-source amplifier to realize a wideband 50-Ω input match [ ]. However, there is a trade-off between input matching and the NF of the LNA, and satisfying both criteria simultaneously generally leads to increased power consumption [ ]. As a result, novel circuit design techniques are required to lower the power consumption. In this chapter, a combination of circuit design techniques suitable for ultra-low power and ultra-low voltage designs is exploited to realize a broadband resistive-feedback LNA in a TSMC 90-nm CMOS technology. The performance of the LNA is measured and compared to state-of-the-art works. The principles of the low voltage and low power design methodology presented here can be readily adapted and applied to other RF circuits.

5.2 Low Voltage Shunt-Feedback Low Noise Amplifier Design Limitations

Resistive shunt-feedback is a viable option for ultra-wideband LNA design. It provides wideband input matching with the aid of a feedback network. However, low voltage and low power design impose severe restrictions on the design options. Table 5-1 summarizes

the design equations for input matching, gain and noise factor at low frequencies for a resistive shunt feedback LNA with a passive load, as shown in Fig. 5-1(a). In these equations, $R_O = R_L \parallel r_{o1}$, $\gamma$ is the thermal noise coefficient, and $\alpha = g_m/g_{d0}$. As can be seen in the equations, the input impedance, voltage gain and noise factor of a shunt feedback LNA are functions of the output resistance, $R_O$, and the feedback resistor, $R_f$. As explained in the previous chapters, the low supply voltage in nanometer CMOS technologies limits the achievable $g_m$ and $r_o$. Therefore, it is necessary to examine the effects of both $R_f$ and $R_O$ on the $g_m$ required for input matching, gain and noise figure.

Table 5-1: Design equations for a resistive shunt feedback LNA.

| Resistive shunt feedback amplifier | Expression |
| Input impedance | $\dfrac{R_O + R_f}{1 + g_m R_O}$ |
| Voltage gain | $\dfrac{R_O (1 - g_m R_f)}{R_O + R_f}$ |
| Noise factor | $NF \approx 1 + \dfrac{4\gamma g_m}{\alpha R_S}\left[\dfrac{(R_O + R_f)(R_S + R_f)}{(1 - g_m R_f)\left(R_O(1 + g_m R_S) + (R_S + R_f)\right)}\right]^2$ |

Figure 5-1: Three possible topologies to implement a resistive shunt feedback LNA: (a) with resistive load, (b) with active load, (c) with current reuse.

5.2.1 The Choice of the Feedback Resistor

The value of the feedback resistor is a determining factor in finding the optimum design parameters, such as $g_m$, that meet the design specifications. Fig. 5-2 shows the required

$g_m$ versus $R_f$ to get an $S_{11}$ of −18 dB, a voltage gain of 14 dB and a NF of 6 dB, using the equations shown in Table 5-1. The $S_{11}$ of −18 dB is chosen to give an 8-dB safety margin to account for the parasitics of the circuit elements that are not modelled by the design equations, and for process and temperature variations. To obtain these $g_m$ values, $R_O$ is set to 300 Ω, based on the $g_{ds}$ of the transistors simulated in the middle of the moderate inversion region.

Figure 5-2: The required $g_m$ versus $R_f$ to achieve an $S_{11}$ of −18 dB, an $S_{21}$ of 14 dB and a NF of 6 dB.

Fig. 5-2 shows that the $g_m$ required to satisfy the voltage gain criterion is dominant below $R_f$ = 350 Ω. For $R_f$ > 350 Ω, the $g_m$ required for input matching dominates, and the $g_m$ required to satisfy the NF of 6 dB is well below the $g_m$ required by the two other design specifications. In other words, the choice of $R_f$ determines whether the voltage gain or the input matching criterion is dominant in choosing the $g_m$ of the transistor. This plot also clearly shows the trade-off between voltage gain/NF and input matching in resistive shunt-feedback LNAs. The optimum value for $R_f$ is the intersection of the curves showing the $g_m$ required to satisfy the $S_{21}$ and $S_{11}$ targets, since this yields the lowest $g_m$ that meets both specifications. As can be seen in the figure, this minimum occurs at around $R_f$ = 350 Ω.
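The construction behind Fig. 5-2 can be approximated by inverting the low-frequency expressions for input impedance and voltage gain in Table 5-1 for $g_m$. The following sketch makes two assumptions that the text does not spell out: the $S_{11}$ target is mapped to the largest purely resistive $R_{in}$ that still meets it in a 50-Ω system, and the gain target is applied to the magnitude of the Table 5-1 voltage gain. Function names are illustrative, and the noise constraint is left out because the text reports that it is not the binding one.

```python
def max_rin_for_s11(s11_db, z0=50.0):
    """Largest resistive R_in whose reflection coefficient equals s11_db."""
    gamma = 10.0 ** (s11_db / 20.0)
    return z0 * (1.0 + gamma) / (1.0 - gamma)


def gm_for_match(r_f, r_o, r_in):
    """Invert R_in = (R_O + R_f) / (1 + g_m * R_O)."""
    return ((r_o + r_f) / r_in - 1.0) / r_o


def gm_for_gain(r_f, r_o, gain_mag):
    """Invert |A_v| = R_O * (g_m * R_f - 1) / (R_O + R_f)."""
    return (gain_mag * (r_o + r_f) / r_o + 1.0) / r_f


def optimum_rf(r_o, s11_db, gain_db, rf_values):
    """R_f that minimises the g_m needed to meet both targets."""
    r_in = max_rin_for_s11(s11_db)
    gain_mag = 10.0 ** (gain_db / 20.0)
    return min(
        rf_values,
        key=lambda r_f: max(gm_for_match(r_f, r_o, r_in),
                            gm_for_gain(r_f, r_o, gain_mag)),
    )


if __name__ == "__main__":
    # R_O = 300 ohm and the two targets are the values quoted in the text.
    best = optimum_rf(300.0, -18.0, 14.0, range(100, 1001, 10))
    print(f"lowest-g_m feedback resistor on this grid: {best} ohm")
```

The crossover found this way depends on those two assumptions, so it is best compared with Fig. 5-2 qualitatively: the gain-limited $g_m$ falls as $R_f$ grows, the match-limited $g_m$ rises, and the smallest usable $g_m$ sits where the two cross.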

Figure 5-3: The required $g_m$ versus $R_O$ to achieve an $S_{11}$ of −18 dB, an $S_{21}$ of 14 dB and a NF of 6 dB.

5.2.2 The Effect of the Output Resistance

As discussed earlier, the output resistance of a MOS transistor, $r_o$, is not high enough to be ignored when in parallel with $R_L$; hence its effects on gain, input matching and NF must be taken into account. Fig. 5-3 shows the $g_m$ required to get an $S_{11}$ of −18 dB, a voltage gain of 14 dB and a NF of 6 dB using the equations shown in Table 5-1, with $R_f$ set to 350 Ω, as found in Section 5.2.1. As can be seen in this figure, the $g_m$ required for input matching, voltage gain and NF increases sharply as the output resistance of the shunt feedback LNA decreases. The $g_m$ required for the NF target is noticeably lower than the $g_m$ needed to achieve the voltage gain and input matching targets. For this set of specifications and the chosen value of $R_f$, the voltage gain specification is the determining factor for selecting the appropriate $g_m$ at low values of $R_O$.

5.2.3 Low Power, Low Voltage Resistive Shunt Feedback Architecture

Having set the feedback resistor value and studied the effect of $R_O$ on the performance of a resistive shunt feedback LNA, we now discuss the appropriate architecture for ULP and ULV LNA design. Fig. 5-1 shows the three possible solutions for implementing a

resistive shunt feedback LNA. Fig. 5-1(a) has a resistive load, Fig. 5-1(b) uses an active load to provide a wideband load while requiring low voltage headroom, and Fig. 5-1(c) is based on a current-reuse inverter-type architecture.

A resistive load has been widely used in resistive shunt feedback topologies [112, 114, 116]. While resistive loads do provide advantages in terms of bandwidth and NF over active loads, they are not optimal for ULV designs because of the low voltage headroom available. This is best shown through a design example. First assume that the $g_m$ required for input matching is 20 mS. Using the proposed FOM [22], the optimal IC is 1, which leads to a transconductance efficiency of 14 (Fig. 3-2). In this case, the current required for a $g_m$ of 20 mS is roughly 1.5 mA. Supposing that the available supply voltage is 0.5 V, the $R_L$ needed to set the output voltage to mid-rail is 333 Ω, and this leads to a $V_{DS}$ of 0.25 V for the transistor. At an IC of 1, this gives an approximate intrinsic gain of 6.5 (from Fig. 3-4), which corresponds to a $g_{ds}$ of 3.1 mS and an $r_o$ of 325 Ω. The total resistance at the output is this $r_o$ in parallel with 333 Ω, which gives $R_O$ = 163 Ω. In this case, Fig. 5-3 shows that the $g_m$ required for a 50-Ω match is 45 mS, and not the 20 mS originally assumed. This requires a higher current, which lowers the output resistance even further and necessitates an even higher $g_m$. The end result of this loop is that the specifications are never satisfied. Consequently, the resistive load is not a feasible option for an optimal low voltage and low power design.

Resistive shunt feedback with an active load, shown in Fig. 5-1(b), is another option [117]. Although an active load entails a constant voltage drop, the achieved output conductance is a function of the drain current.
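This current dependence is also what defeats the resistive load in the design example above, since both $r_o$ and the load resistance shrink as the bias current grows. The following is a minimal sketch of that loop. It assumes that the intrinsic gain and $g_m/I_D$ stay fixed at the values quoted in the text, that $R_L$ is rescaled inversely with current to keep its DC drop constant, and that the match target is the largest $R_{in}$ meeting $S_{11}$ = −18 dB in a 50-Ω system (about 64.4 Ω). The function names are illustrative.

```python
def parallel(r_a, r_b):
    """Resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def gm_for_match(r_out, r_f, r_in):
    """Invert R_in = (R_O + R_f) / (1 + g_m * R_O) from Table 5-1."""
    return ((r_out + r_f) / r_in - 1.0) / r_out


def iterate_resistive_load(gm_start=20e-3, r_load_start=333.0,
                           intrinsic_gain=6.5, r_f=350.0, r_in=64.4,
                           steps=4):
    """Replay the design loop; returns (g_m, R_O, required g_m) per pass."""
    history = []
    gm = gm_start
    for _ in range(steps):
        r_load = r_load_start * gm_start / gm  # assumed: constant DC drop
        r_o = intrinsic_gain / gm              # r_o = (g_m/g_ds) / g_m
        r_out = parallel(r_load, r_o)
        gm_needed = gm_for_match(r_out, r_f, r_in)
        history.append((gm, r_out, gm_needed))
        gm = gm_needed
    return history


def input_resistance_floor(gm_start=20e-3, r_load_start=333.0,
                           intrinsic_gain=6.5, r_f=350.0):
    """Lower bound on R_in when g_m * R_O is pinned by the scaling above."""
    gm_r_out = parallel(r_load_start * gm_start, intrinsic_gain)
    return r_f / (1.0 + gm_r_out)


if __name__ == "__main__":
    for gm, r_out, gm_needed in iterate_resistive_load():
        print(f"g_m={gm * 1e3:6.1f} mS  R_O={r_out:6.1f} ohm  "
              f"needs {gm_needed * 1e3:6.1f} mS")
    print(f"R_in cannot drop below {input_resistance_floor():.1f} ohm")
```

Under these assumptions the product $g_m R_O$ does not change from pass to pass, so $R_{in} = (R_O + R_f)/(1 + g_m R_O)$ can never fall below $R_f/(1 + g_m R_O)$. Each pass therefore asks for more $g_m$ than the last, which is the non-converging loop described in the text.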
When the drain current is increased to enhance $g_m$, the output resistance drops. Fig. 5-4 shows the achievable $g_m$ versus $R_O$ ($R_O = r_{o1} \parallel r_{o2}$) for an NMOS with an active load biased at the new recommended inversion coefficient, for two different supply voltages. These curves are plotted together with the $g_m$ required for input matching; the intersection of a $g_m$ vs. $R_O$ curve with the input matching curve corresponds to a solution with an $S_{11}$ of −18 dB. When operating at a supply voltage

of 1.2 V, the two curves meet at an $R_O$ of 100 Ω with a $g_m$ of 60 mS. Consequently, a solution exists at a $V_{DD}$ of 1.2 V, though at the cost of high power consumption. As can be seen from the plot, this topology has no solution for a supply voltage of 0.5 V.

Figure 5-4: The $g_m$ vs. $R_O$ plots for a resistive shunt feedback LNA with active load, for two different supply voltages (1.2 V and 0.5 V), along with the plot of the required $g_m$ vs. $R_O$ for $S_{11}$ = −18 dB.

The third option is the current-reuse inverter-type input shown in Fig. 5-1(c). In this architecture, the current-reuse scheme doubles the effective $g_m$ without any extra power consumption or deterioration of the output conductance compared to the active load structure. The $g_{mt}$ versus $R_O$ plots for the current-reuse architecture are shown in Fig. 5-5, where $g_{mt} = g_{m1} + g_{m2}$. As illustrated in the figure, the current-reuse structure provides a solution even at a low supply voltage (0.5 V). The intersection occurs at an $R_O$ of 210 Ω and a $g_m$ of 37 mS. As a result, the inverter-type input with current reuse is the best solution for low voltage and low power applications.

5.3 Circuit Analysis of the Proposed ULP, ULV Shunt Feedback LNA

To achieve an ultra-low power, ultra-low voltage and broadband LNA, a combination of techniques has been used to enhance the performance of a shunt-feedback amplifier. As

discussed in Section 5.2.3, the current-reuse architecture shows the best performance for low power and low voltage applications. As a result, this technique is employed to reduce power consumption and at the same time to improve the gain and noise performance. Input matching is achieved using the standard resistive shunt-feedback technique. Furthermore, inductive series peaking in the feedback loop is exploited to cancel the parasitic gate-source capacitance, $C_{gs}$, and the Miller effect of the parasitic gate-drain capacitance, $C_{gd}$, in order to extend the input matching and bandwidth without additional power consumption. In this section, the proposed resistive feedback LNA is discussed in detail, with a focus on the use of inductive series peaking to enhance the gain, input matching and noise performance.

Figure 5-5: The $g_m$ vs. $R_O$ plots for a current-reuse inverter-type architecture for two different supply voltages (0.5 V and 1.2 V), along with the plot of the required $g_m$ vs. $R_O$ for $S_{11}$ = −18 dB.

5.3.1 Inductive Series Peaking in the Feedback Path

Low supply voltages impose several restrictions on the circuit topologies that can be used. One common technique in LNA design is to use a cascode transistor for bandwidth and output resistance enhancement. However, the voltage drop needed by this transistor

makes it impractical at low supply voltages. Consequently, other approaches must be used to extend the bandwidth of the amplifier, preferably without increasing the power consumption.

Figure 5-6: Resistive shunt feedback LNA (a) without bandwidth extension technique, (b) with inductive series peaking at the input, (c) with inductive series peaking in the feedback path.

A conventional technique to extend the bandwidth without additional power consumption is to use inductors to resonate with the parasitic capacitances of the transistors. Fig. 5-6(b) and (c) show circuits with the inductor placed at the input of the LNA [113, 150] and inside the feedback loop [116, 151], respectively. To compare all three topologies in Fig. 5-6, the effective $g_m$ is calculated for each. The gate-drain capacitance, $C_{gd}$, is ignored in this analysis for simplicity. The effective $g_m$ for the circuit shown in Fig. 5-6(a) is given by

$$g_{m,eff} = \frac{g_m}{1 + j\omega C_{gs} R_s + \frac{R_s}{R_f^2}\left(R_f + R_1 (g_m R_f - 1)\right)}, \tag{5.1}$$

where $R_1 = R_L \parallel R_f$. At high frequencies, the equation simplifies to

$$g_{m,eff} = \frac{\omega_T}{\omega}\,\frac{1}{jR_s}. \tag{5.2}$$

The effective transconductance for the circuit shown in Fig. 5-6(b), with the inductor at the input, can be found by

$$g_{m,eff} = \frac{g_m}{\alpha + j\omega C_{gs} R_s + \frac{\beta}{R_f^2}\left(R_f + R_1 (g_m R_f - 1)\right)}, \tag{5.3}$$

where $\alpha = 1 - \omega^2 L_1 C_{gs}$ and $\beta = R_s + j\omega L_1$. The effective transconductance for the LNA with inductive series peaking in the feedback path can be found by

$$g_{m,eff} = \frac{g_m}{\alpha\left(1 + \frac{R_s}{R_f} + \frac{R_s}{R_f^2} R_1 (g_m R_f - \alpha) + \frac{j\omega C_{gs} R_s}{\alpha}\right)}. \tag{5.4}$$

Clearly, by adding the inductor inside the feedback loop, the denominator is left with only the term $j\omega C_{gs} R_s$ at the resonance frequency (where $\alpha = 0$) and, as a result, the $g_m$ is boosted. However, for the case where the inductor is placed at the input (eq. 5.3), the last term in the denominator remains at the resonant frequency and dampens the response, so no $g_m$ boosting occurs. Fig. 5-7 shows MATLAB simulation results for the effective $g_m$ of the three circuits. The bandwidth enhancement for series peaking in the feedback loop is clearly higher than for inductive peaking at the input, and it is the approach adopted in the design presented here.

The core of the LNA in this work is shown in Fig. 5-8(a), in which two series peaking inductors are employed at the gates of the NMOS and PMOS transistors. The ac equivalent-circuit model is shown in Fig. 5-8(b). As can be seen in the equivalent ac model, placing the two inductors inside the feedback loop has two advantages. Firstly, the inductors split the MOS $C_{gs}$ and $C_{gd}$ from the pad capacitance at the input of the amplifier and facilitate bandwidth extension [152]. Secondly, $R_f$ and $C_{gd}$ are no longer in parallel; therefore, the bandwidth of the feedback loop is also broadened.

From a pole-zero perspective, this technique pushes the dominant poles of the circuit to higher frequencies. Fig. 5-9 illustrates the effect of the inductor values on the position of the dominant poles of the circuit shown in Fig. 5-8(a). As can be seen, the conjugate poles are pushed further into the left half-plane as the value of the inductors increases. However, beyond a certain limit, the distance of the dominant poles from the origin is reduced, leading to lower bandwidth.
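The comparison between expressions (5.1), (5.3) and (5.4) can be sketched numerically. The Python fragment below is a minimal illustration of one reading of those expressions, not the MATLAB script used for Fig. 5-7. The function names are hypothetical, and the values of $C_{gs}$ and $R_L$ are assumptions chosen only to place the resonance near 4.5-GHz; $R_f$ = 335-Ω, $L_1$ = 3.1-nH and $g_m$ = 37-mS are taken from the text.

```python
import math

def gm_eff_plain(f, gm, cgs, rs, rf, rl):
    """Eq. (5.1): shunt-feedback stage with no peaking inductor."""
    w = 2 * math.pi * f
    r1 = rl * rf / (rl + rf)  # R1 = RL || Rf
    den = 1 + 1j * w * cgs * rs + rs / rf**2 * (rf + r1 * (gm * rf - 1))
    return gm / den

def gm_eff_input_l(f, gm, cgs, rs, rf, rl, l1):
    """Eq. (5.3): inductor in series with the source, outside the loop."""
    w = 2 * math.pi * f
    r1 = rl * rf / (rl + rf)
    alpha = 1 - w**2 * l1 * cgs
    beta = rs + 1j * w * l1
    den = alpha + 1j * w * cgs * rs + beta / rf**2 * (rf + r1 * (gm * rf - 1))
    return gm / den

def gm_eff_feedback_l(f, gm, cgs, rs, rf, rl, l1):
    """Eq. (5.4): inductor at the gate, inside the feedback loop.

    The factor alpha is multiplied through so that the expression stays
    finite at resonance, where alpha = 0.
    """
    w = 2 * math.pi * f
    r1 = rl * rf / (rl + rf)
    alpha = 1 - w**2 * l1 * cgs
    den = (alpha * (1 + rs / rf + rs / rf**2 * r1 * (gm * rf - alpha))
           + 1j * w * cgs * rs)
    return gm / den

# Illustrative values: CGS and RL are assumptions, not design values.
GM, CGS, RS, RF, RL, L1 = 37e-3, 0.4e-12, 50.0, 335.0, 210.0, 3.1e-9
F0 = 1 / (2 * math.pi * math.sqrt(L1 * CGS))  # resonance of L1 with Cgs

if __name__ == "__main__":
    for f in (1e9, 0.5 * F0, F0, 1.5 * F0):
        a = abs(gm_eff_plain(f, GM, CGS, RS, RF, RL))
        b = abs(gm_eff_input_l(f, GM, CGS, RS, RF, RL, L1))
        c = abs(gm_eff_feedback_l(f, GM, CGS, RS, RF, RL, L1))
        print(f"{f / 1e9:5.2f} GHz  (a) {a:.4f}  (b) {b:.4f}  (c) {c:.4f} A/V")
```

With $L_1$ = 0 the three functions coincide, which is a convenient sanity check on the expressions before sweeping the inductor value.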

Figure 5-7: The effective transconductance of the circuits shown in Fig. 5-6.

Figure 5-8: (a) Current reuse resistive shunt feedback LNA with inductive series peaking in the feedback loop, (b) its equivalent circuit model.

The exact equations for the pole-zero locations are very complex to solve; hence, simulations are employed to find the optimum values of the inductors. Fig. 5-10 shows voltage gain simulation results for various values of $L_1$ with $L_2$ shorted. Clearly, the bandwidth of the amplifier increases as $L_1$ increases up to a certain value; beyond that value, the bandwidth degrades due to high peaking in the gain.

Figure 5-9: The effect of inductive series peaking in the feedback loop on the dominant poles of the circuit (real vs. imaginary part of the pole frequency, in GHz, for increasing $L_1$ and $L_2$, with the chosen value $L_1$ = $L_2$ = 3.1-nH marked).

Input matching of the amplifier is also affected by the inductive peaking in the feedback path, and hence it should be taken into consideration when choosing the inductor values. Staggering the resonance frequencies of the two feedback loops further broadens the bandwidth and limits the amount of peaking in the response of the amplifier. After several simulations, the values of $L_1$ and $L_2$ were chosen to be 3.1-nH to maximize the bandwidth while achieving an $S_{11}$ better than -10 dB over the whole frequency band. The inductors used in this design are single-ended spiral inductors in the top metal, with 4 turns and a Q-factor of 15 at 4-GHz.

Input Matching

At low frequencies, the input impedance of the circuit in Fig. 5-8(a) can be found by the equation in Table 5-1, where $g_m$ is replaced by the total $g_m$ of the two transistors, $g_{mt}$. At high frequencies the parasitic capacitances degrade the input match, but this is compensated for by the addition of the inductors described in the previous section. The

overall input impedance of the proposed LNA can be found by

$$Z_{in} = Z_1 \parallel Z_2 \parallel \frac{R_f + (r_{o1} \parallel r_{o2})}{1 + \frac{g_{m1}(r_{o1} \parallel r_{o2})}{j\omega C_{gs1} Z_1} + \frac{g_{m2}(r_{o1} \parallel r_{o2})}{j\omega C_{gs2} Z_2}}, \tag{5.5}$$

where $Z_1$ and $Z_2$ are $(j\omega L_1 + 1/j\omega C_{gs1})$ and $(j\omega L_2 + 1/j\omega C_{gs2})$, respectively.

Figure 5-10: The voltage gain of the circuit in Fig. 5-8(a) for different values of $L_1$ ($L_1$ = 1.5-nH, 2.7-nH, 3.5-nH, 4.2-nH, 4.9-nH and 5.7-nH).

Voltage Gain

At low frequencies, the voltage gain can be found by the equation provided in Table 5-1 with $g_m$ replaced by $g_{mt}$. At high frequencies the inductive series peaking in the feedback path creates a resonance circuit with the parasitic capacitances to boost the voltage gain. The overall voltage gain of the LNA is given by

$$A_v = \frac{\frac{(r_{o1} \parallel r_{o2})}{R_f + (r_{o1} \parallel r_{o2})}\left(1 - \frac{g_{m1} R_f}{j\omega C_{gs1} Z_1} - \frac{g_{m2} R_f}{j\omega C_{gs2} Z_2}\right)}{1 + \frac{R_s}{Z_1 \parallel Z_2}}. \tag{5.6}$$
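As a rough numerical companion to the input-impedance expression (5.5), the sketch below evaluates $Z_{in}$ and the resulting $S_{11}$ over frequency. It is only an illustration under stated assumptions: the function names are hypothetical, the split of $g_{mt}$ = 37-mS between the two transistors and the $C_{gs}$ values are assumed, and the 210-Ω value quoted earlier for $R_O$ is used here as $r_{o1} \parallel r_{o2}$.

```python
import math

def parallel(*impedances):
    """Parallel combination of any number of (complex) impedances."""
    return 1 / sum(1 / z for z in impedances)

def input_impedance(f, gm1, gm2, cgs1, cgs2, l1, l2, rf, ro):
    """Eq. (5.5); ro stands for ro1 || ro2."""
    w = 2 * math.pi * f
    z1 = 1j * w * l1 + 1 / (1j * w * cgs1)
    z2 = 1j * w * l2 + 1 / (1j * w * cgs2)
    # Transconductance seen from the input node through each L-Cgs divider.
    k = gm1 / (1j * w * cgs1 * z1) + gm2 / (1j * w * cgs2 * z2)
    z_loop = (rf + ro) / (1 + ro * k)
    return parallel(z1, z2, z_loop)

def s11_db(zin, rs=50.0):
    """Input reflection coefficient in dB for a source impedance rs."""
    gamma = (zin - rs) / (zin + rs)
    return 20 * math.log10(abs(gamma))

# Assumed example values (only RF and the inductances come from Table 5-2).
PARAMS = dict(gm1=20e-3, gm2=17e-3, cgs1=0.2e-12, cgs2=0.2e-12,
              l1=3.1e-9, l2=3.1e-9, rf=335.0, ro=210.0)

if __name__ == "__main__":
    for f_ghz in (0.1, 1, 2, 4, 6):
        zin = input_impedance(f_ghz * 1e9, **PARAMS)
        print(f"{f_ghz:4.1f} GHz  Zin = {zin.real:7.2f} {zin.imag:+7.2f}j ohm"
              f"  S11 = {s11_db(zin):6.2f} dB")
```

At low frequency the two L-C branches are effectively open and the expression collapses to $(R_f + r_o)/(1 + g_{mt} r_o)$, which the sketch reproduces.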

Noise Figure

The main noise sources in this LNA are the channel noise of M1 and M2 and the thermal noise of the feedback resistor, $R_f$. The noise factor of the LNA can be expanded as

$$NF \approx 1 + \frac{R_f}{R_s}\left(\frac{1 + g_{mt} R_s}{1 - g_{mt} R_f}\right)^2 + \frac{\gamma g_{mt}}{\alpha R_s}\left(\frac{R_s + R_f}{1 - g_{mt} R_f}\right)^2, \tag{5.7}$$

in which the second term is due to the feedback resistor, $R_f$, and the third term is the noise contribution of the transistors. It should also be noted that the effective $g_m$ of the transistors is boosted by the inductive series peaking in the feedback path, giving rise to a lower NF at the resonance frequency.

Linearity

The linearity of the LNA is mainly limited by its low supply voltage, which is a direct consequence of the low power design and the migration to lower supply voltages. However, employing a conventional inverter-type structure as the input causes the second-order transconductance nonlinearities of the NMOS and PMOS transistors, which have opposite signs, to cancel each other. This leads to second-order distortion cancellation in the LNA [98]. Additionally, as discussed in section [22], biasing the transistor close to the optimum point in the ULP ULV biasing metric, IC = 1, causes the third-order distortion to be small, leading to higher linearity. As discussed earlier, the non-linear $g_{ds}$ is also a dominant contributor to the non-linearity. The role of negative feedback in distortion suppression [153] is another advantage of this topology.

Comparison between Different Bandwidth Extension Techniques

To verify the advantages of using inductors in the feedback loop as a bandwidth extension technique, the three circuits in Fig. 5-11 are simulated in a 90-nm CMOS technology. The circuits draw the same amount of current from a 0.5-V supply voltage, and the results for voltage gain, input matching and NF are provided in Figs. 5-12, 5-13 and 5-14, respectively. As expected, it can be seen in the figures that placing the peaking inductor

inside the feedback loop leads to a 55% improvement in $S_{21}$ bandwidth, an 80% increase in $S_{11}$ bandwidth, and a lower NF, especially at high frequencies. As discussed earlier, placing the inductors in the feedback loop creates a pair of complex poles in the response, which leads to an under-damped response. As a result, the transient response has overshoots and undershoots and requires a settling time to reach its steady state.

Figure 5-11: Resistive shunt feedback LNAs with current reuse scheme (a) without any bandwidth extension, (b) with inductive series peaking at the input, (c) with inductive series peaking in the feedback loop.

Circuit Design

Fig. 5-15 shows the complete schematic of the proposed ULP, ULV LNA with the test buffer. The target specifications for the LNA are: 1) current budget of 1.5-mA; 2) supply voltage of 0.5-V; 3) bandwidth > 7-GHz; 4) voltage gain of 12-dB; 5) NF < 5-dB; 6) $IIP_3$ > -10 dBm. These specifications are competitive with other LNAs in the literature in terms of bandwidth, gain, noise figure, and linearity [106, 114, 117, 118, 152, ] while targeting a lower supply voltage and power consumption. The inversion coefficients, and thus the effective voltages, are determined using eq. 4.1 and Fig. The $g_{mt}$ is set by the input matching criteria with regard to Table 5-1 and Fig. The final values for the device sizes are summarized in Table 5-2. It should be noted that the threshold voltage of an NMOS transistor in this technology with a drain-source voltage of 0.25-V is about 370-mV.
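The NF target above can be checked against a first-order estimate from the noise-factor expression (5.7). The sketch below is a hedged illustration only: the function names are hypothetical, $\gamma$ = $\alpha$ = 1 is an assumption (the text does not give these values for moderate inversion), and $g_{mt}$ = 37-mS and $R_f$ = 335-Ω are taken from the text. It ignores the frequency dependence introduced by the peaking inductors.

```python
import math

def noise_factor(gmt, rf, rs=50.0, gamma=1.0, alpha=1.0):
    """First-order noise factor per eq. (5.7).

    gmt   : total transconductance gm1 + gm2 (S)
    rf    : feedback resistor (ohm)
    gamma : channel-noise coefficient (assumed value)
    alpha : gm/gd0 ratio (assumed value)
    """
    loop = 1 - gmt * rf
    f_rf = rf / rs * ((1 + gmt * rs) / loop) ** 2                   # feedback resistor
    f_mos = gamma * gmt / (alpha * rs) * ((rs + rf) / loop) ** 2    # transistors
    return 1 + f_rf + f_mos

def noise_figure_db(gmt, rf, **kwargs):
    """Noise figure in dB from the noise factor."""
    return 10 * math.log10(noise_factor(gmt, rf, **kwargs))

if __name__ == "__main__":
    for gmt in (20e-3, 37e-3, 50e-3):
        print(f"gmt = {gmt * 1e3:4.0f} mS -> NF = {noise_figure_db(gmt, 335.0):.2f} dB")
```

Sweeping `gmt` in this way shows how the noise estimate trades against the current budget, since a larger $g_{mt}$ requires more bias current.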

Figure 5-12: The $S_{21}$ simulation results for the circuits shown in Fig. 5-11.

Figure 5-13: The $S_{11}$ simulation results for the circuits shown in Fig. 5-11.

Figure 5-14: The NF simulation results for the circuits shown in Fig. 5-11.

Table 5-2: Device Dimensions

M1          (1.1µm/100nm) × 64
M2          (1.9µm/100nm) × 64
R_f         335Ω
C1, C3      3.2pF
C2          11pF
L1, L2      3.1nH

In this prototype design, the bias voltages $V_{b1}$ and $V_{b2}$ were used to tune the gate voltages of M1 and M2 such that their drain voltages are at mid-rail, and also to provide additional flexibility to control the characteristics of the LNA. In a self-contained version of the circuit, the drain voltage can be controlled through a DC feedback network to ensure that the drains of the transistors are biased at mid-rail.

Figure 5-15: The complete schematic of the proposed ultra-low power, ultra-low voltage LNA with buffer for measurement purposes.

Figure 5-16: The K-factor of the LNA.

The stability of the LNA is also verified. The presence of a feedback network and boosting inductors might cause the LNA to be potentially unstable. However, simulation results demonstrate that the LNA is unconditionally stable. Fig. 5-16 shows the K-factor of the LNA. The K-factor is always more than 40 while $|\Delta| = |S_{11} S_{22} - S_{12} S_{21}|$ is less than 0.07 in the frequency band of interest.
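For reference, the stability test quoted above can be reproduced from any set of two-port S-parameters with the standard Rollett criterion ($K > 1$ and $|\Delta| < 1$). The following Python sketch is an illustration only: the function names are hypothetical, and the S-parameter values in the example are made-up placeholders, not simulated or measured data from this LNA.

```python
def rollett_stability(s11, s12, s21, s22):
    """Return (K, |Delta|) for a two-port described by complex S-parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)

def is_unconditionally_stable(s11, s12, s21, s22):
    """Rollett criterion: K > 1 together with |Delta| < 1."""
    k, delta_mag = rollett_stability(s11, s12, s21, s22)
    return k > 1 and delta_mag < 1

if __name__ == "__main__":
    # Placeholder values in linear (not dB) form, chosen only for illustration.
    s11, s12, s21, s22 = 0.3, 0.01, 4.0, 0.35
    k, delta_mag = rollett_stability(s11, s12, s21, s22)
    print(f"K = {k:.2f}, |Delta| = {delta_mag:.3f}, "
          f"stable = {is_unconditionally_stable(s11, s12, s21, s22)}")
```

In practice the check would be repeated at every frequency point of the simulated or measured S-parameter sweep, since unconditional stability must hold across the whole band.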

Figure 5-17: The die micrograph of the proposed LNA, showing the LNA, the buffer and the test structures.

5.4 Measurement Results

The fabricated circuit was tested using a probe station, and an on-chip buffer was used to drive the 50-Ω impedance of the measurement equipment. The buffer characteristics were measured separately using a stand-alone on-chip test structure, and were de-embedded from the $S_{21}$ and NF results presented here. The die photograph of the LNA is shown in Fig. 5-17. The active area is mm$^2$. The DC pads were bonded and the DC voltages were supplied using a custom PCB. The lengths of the bond wires were minimized to lower their parasitic inductance. It should be noted that the simulation results include bondwire inductances and parasitic capacitances where appropriate (e.g., on the DC biasing nodes).

Fig. 5-18 shows the measurement results, along with the simulation results, for the voltage gain and $S_{11}$. The fabricated LNA reaches a maximum voltage gain of 12.6-dB

at 4.5-GHz, with a 3-dB cut-off frequency of 7-GHz, and achieves an $S_{11}$ of less than -10 dB across the entire bandwidth. The gain roll-off seen in the measured voltage gain can be explained by a change in the resonant frequency of the inductors due to increased inductances and/or parasitic capacitances in the fabricated LNA.

Figure 5-18: Measurement and simulation results for the voltage gain and input matching.

Fig. 5-19 shows the $S_{22}$ of the buffer and the $S_{12}$ of the LNA and buffer. The $S_{22}$ is less than -9 dB across the entire bandwidth and $S_{12}$ is better than -23 dB in the band of interest. The parasitic capacitances of the LNA and the buffer, and the parasitic couplings between them, have deteriorated the $S_{12}$ at low frequencies compared to the simulations.

The NF of the LNA is shown in Fig. 5-20. The minimum measured NF is 5.5-dB, while the average NF across the entire band is 6-dB. The measured NF is higher than the simulation results. This is mainly due to inaccurate transistor noise modelling in moderate inversion, and to a reduced voltage gain compared to simulation results. The bond wires used in this test setup also contribute to noise injection, since there are insufficient on-chip decoupling capacitances at the DC biasing nodes to create a strong small-signal ground.

The IIP3 was measured using a two-tone test, with the tones spaced 100-MHz apart. The center frequency of the two tones was varied across the band, and the worst case was

observed for a center frequency of 5-GHz. A plot of the output power vs. the input power for this case is shown in Fig. 5-21, and the IIP3 was measured to be -9 dBm. The worst case 1-dB compression point is -18 dBm. The worst case IIP2 is +5-dBm, and was measured using a two-tone test with 1-GHz frequency spacing at 5-GHz.

Figure 5-19: The reverse isolation and output matching of the LNA (measurement and simulation).

Figure 5-20: The measurement and simulation results for the NF.

The power consumption of the LNA is only 0.75-mW from a 0.5-V supply voltage. A comparison of this work to other recently published state-of-the-art circuits is shown in

Table 5-3. Two figures of merit (FOM I and FOM II) [125] are used to compare the overall performance of the LNAs, and they are given by

$$FOM_I = 20\log_{10}\left(\frac{S_{21,av}[lin] \cdot BW[GHz]}{P_{dc}[mW] \cdot \left(F_{av}[lin] - 1\right)}\right), \tag{5.8}$$

$$FOM_{II} = 20\log_{10}\left(\frac{S_{21,av}[lin] \cdot BW[GHz] \cdot IIP_3[mW]}{P_{dc}[mW] \cdot \left(F_{av}[lin] - 1\right)}\right). \tag{5.9}$$

Figure 5-21: The measured input-output characteristics of the LNA ($IIP_3$ = -9 dBm).

The LNA presented here achieves one of the lowest power consumptions and employs the lowest supply voltage when compared to other works. It also offers comparable performance in terms of gain, noise figure, and linearity, and achieves a very high FOM I. Fig. 5-22 highlights the overall performance of this LNA based on FOM I versus power consumption and compares it with other works in the literature. When comparing the works using FOM II (which includes linearity), this LNA has the third highest FOM. Reduced linearity is expected due to the ultra-low supply voltage used for this LNA; however, it is interesting to note that the FOM achieved here is higher than that of LNAs operating from a supply voltage twice as large.

The measured performance of this LNA shows that this topology is suitable for ULP, ULV, broadband LNAs in deep sub-micron CMOS technologies. In summary, the designed LNA

outperforms the previously published works for the following reasons: 1) the right choice of $R_f$, 2) the right value of the bias voltages based on the extended ULP ULV biasing metric, 3) the use of a current reuse scheme, and 4) the use of inductive series peaking in the feedback path.

Figure 5-22: Comparison of the state-of-the-art works in the literature with the designed LNA based on FOM I (FOM I vs. power in mW).

5.5 Summary

In this chapter, an ultra-low power and ultra-low voltage wideband CMOS LNA was proposed and implemented in a TSMC 90-nm CMOS technology. An extended biasing metric for low power and low voltage circuit design was employed to reduce the power consumption of the LNA while maintaining its performance. A current-reuse scheme to lower the power consumption, along with inductive series peaking in the feedback path to increase the bandwidth without additional power consumption, were analysed and exploited in the LNA. The LNA operates over the bandwidth of GHz and achieves a voltage gain of 12.6-dB, a minimum NF of 5.5-dB, and a worst-case IIP3 of -9 dBm, while consuming only 0.75-mW from a 0.5-V supply voltage. The LNA presented here offers comparable performance in terms of gain, noise figure, and linearity, and achieves one of the best FOM I values in the literature while consuming very low power.

Table 5-3: Performance summary and comparison with state-of-the-art LNAs

Work        Publication      S11 (dB)    Technology
This Work                    < -10       90-nm
[99]        2010 TCAS-I      < -9        0.13-µm
[152]       2006 JSSC        < -6        0.18-µm
[118]       2011 E. Lett.    < -10       0.18-µm
[155]       2007 JSSC        < -9.9      0.13-µm
[157]       2011 MTT         < -9        90-nm
[125]       2008 JSSC        < -10       90-nm
[114]       2006 ISSCC       < -7        90-nm
[106]       2009 ISSCC       < -9*       0.18-µm
[117]       2009 MCL         < -9        90-nm
[158]       2010 MTT         < -10       90-nm

Other parameters compared in the table: 3-dB BW (GHz), Power (mW), Supply (V), Gain max (dB), NF (dB), IIP3 (dBm), Area (mm^2), FOM I, FOM II.

*Estimated from the curves.
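The two figures of merit defined in (5.8) and (5.9) are straightforward to evaluate once the quantities are converted from dB to linear units. The sketch below shows one way to do so; the function names are hypothetical. In the usage example, the 7-GHz bandwidth, 0.75-mW power, 6-dB average NF and -9 dBm IIP3 are the values reported in the text, whereas the 11-dB average gain is an assumed placeholder because only the maximum gain (12.6-dB) is stated.

```python
import math

def fom_i(s21_av_db, bw_ghz, pdc_mw, nf_av_db):
    """FOM I per eq. (5.8); gain and NF are given in dB and converted to linear."""
    s21_lin = 10 ** (s21_av_db / 20)   # voltage gain, linear
    f_lin = 10 ** (nf_av_db / 10)      # noise factor, linear
    return 20 * math.log10(s21_lin * bw_ghz / (pdc_mw * (f_lin - 1)))

def fom_ii(s21_av_db, bw_ghz, pdc_mw, nf_av_db, iip3_dbm):
    """FOM II per eq. (5.9); adds the IIP3 expressed in mW."""
    s21_lin = 10 ** (s21_av_db / 20)
    f_lin = 10 ** (nf_av_db / 10)
    iip3_mw = 10 ** (iip3_dbm / 10)
    return 20 * math.log10(s21_lin * bw_ghz * iip3_mw / (pdc_mw * (f_lin - 1)))

if __name__ == "__main__":
    # The 11 dB average gain is an assumption; the other inputs are from the text.
    args = dict(s21_av_db=11.0, bw_ghz=7.0, pdc_mw=0.75, nf_av_db=6.0)
    print(f"FOM I  = {fom_i(**args):.1f}")
    print(f"FOM II = {fom_ii(iip3_dbm=-9.0, **args):.1f}")
```

Because both expressions use 20 log10, FOM II differs from FOM I by exactly twice the IIP3 in dBm, which makes the linearity penalty of a low supply voltage easy to read off.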

CHAPTER 6

An Inductor-less Ultra-Low Power Tunable Active Shunt-Feedback LNA

6.1 Introduction

The need for wideband and ultra-low power (ULP) RF front-end circuits has been growing with the proliferation of portable wireless devices and wireless sensor networks. These applications impose strict restrictions on the power consumption of the RF front-end circuitry in order to extend battery lifetime. The low noise amplifier (LNA), as the first active block in the RF front-end of a receiver, has to provide simultaneous wideband matching, low noise, high gain, and modest linearity, all of which require high power consumption. However, the LNA specifications for ULP receivers differ from those of traditional LNA designs, mostly in terms of noise figure (NF). These receivers are typically used in low data rate applications such as wireless sensor networks, where a higher NF (and reduced sensitivity) is tolerated to achieve low power consumption [45, 54].

Given the aforementioned low power LNA design challenges, the design of ULP (<1-mW), wideband LNAs has been an active research topic, e.g. [27, ]. In this work we introduce a compact, inductorless, low voltage, ULP LNA employing complementary current-reuse and tunable active shunt-feedback through forward body biasing techniques.

The remainder of the chapter is organized as follows. First, ULP design techniques are introduced and analysed. After that, the circuit description of the proposed LNA is discussed, followed by the measurement results and discussions. Finally, the conclusion is presented.

6.2 Ultra-Low Power Design Techniques

In this section we describe the low power design techniques for the proposed inductorless ultra-low power LNA. Active shunt-feedback input matching will be discussed first, after

which a complementary current-reuse architecture to minimize current consumption will be described.

Figure 6-1: The block diagram of the shunt-feedback architecture employed in this design.

Figure 6-2: Inductorless common-gate complementary current-reuse architectures (a) NMOS-PMOS in conventional order, (b) flipped NMOS-PMOS to enable current-reuse in the feedback transistor.

Active Shunt-Feedback

With the available power budget, input impedance matching is one of the main challenges in the design of an ultra-low power LNA. Resistive shunt-feedback is a commonly used

technique for wideband input matching with no extra power consumption in the feedback stage [113, 114]. However, due to the trade-off between gain/NF and input matching in this architecture, the resultant solution generally leads to high power consumption [ ]. Active shunt-feedback is the other viable solution that can provide wideband input matching in both common-source and common-gate amplifiers [106, 119, 125, 149]. However, only common-gate architectures have shown good potential for providing low power input matching [106, 119]. On the other hand, the power consumed in the feedback stage has to be considered in ultra-low power designs.

In this work, a new ultra-low power LNA architecture based on active shunt-feedback is implemented. Fig. 6-1 conceptually illustrates the feedback mechanism in the proposed LNA. The shunt-feedback reduces the input impedance of the common-gate LNA by a factor of $(1 + A)$, which enables the use of a lower $g_m$ for M1, and hence reduces the power consumption of the LNA.

Complementary Current-reuse Inductorless Common-Gate LNA

Current-reuse is a very useful design technique for ULP LNAs and has been widely used in the literature [119, 136, 137] to reduce the overall power consumption and improve the current efficiency. In this method, the DC current is shared between transistors, while each contributes to the overall gain of the LNA. Moreover, complementary current-reuse [27, 98, 122, 138, 139], which employs both NMOS and PMOS transistors to take advantage of their complementary characteristics, has led to ULP designs with additional advantages including $g_m$-boosting and cancellation of distortion and noise. Many architectures use inductors to facilitate current-reuse [27, 122, 139] at low supply voltages; however, this work targets a compact solution, which means inductors are not a viable option. To implement an inductorless complementary current-reuse scheme with transistors in a common-gate configuration and with active shunt-feedback, the structures shown in Fig. 6-2 are considered. The schematic in Fig. 6-2(a) shows a conventional structure, while the schematic in Fig. 6-2(b) flips the NMOS and PMOS transistors in order to reuse the current of the feedback transistor in one of the input stage transistors. As shown in Fig. 6-2(a), if either an NMOS or a PMOS

(M3 or M4) is added as the shunt-feedback transistor to the input stage, its current will be in parallel with the input devices and cannot be reused. However, as can be seen in Fig. 6-2(b), the current of the feedback transistor is reused either in M1 or M2, depending on whether M3 or M4 is added. Hence, the architecture in Fig. 6-2(b) was chosen, since it lends itself better to current-reuse and is more suitable for ultra-low power applications.

6.3 Circuit Description

The circuit schematic of the ULP LNA proposed here is shown in Fig. 6-3. Transistors M1 and M2 form the input stage in a common-gate configuration and share the same DC current. They are both biased in the moderate inversion region to achieve good overall performance with low power consumption [22]. $R_{L1}$ and $R_{L2}$ form the load resistances, and are chosen over an active load to reduce the flicker noise contribution and to maintain a constant impedance at the output. Capacitor $C_1$ combines the output currents of the transistors. Transistor M3 is a common-source amplifier that creates the shunt-feedback network. Capacitor $C_2$ is used for DC blocking between the LNA and the buffer. To avoid the use of another decoupling capacitor and its associated area and parasitics, M3 is DC coupled to the output of M2 and is biased in weak inversion. This configuration allows the DC current of the feedback transistor to be chosen independently, which adds a degree of freedom compared to the architecture in [119]. Moreover, the DC current of the feedback transistor is also reused by M1, leading to higher current efficiency and lower power. Since the current of M1 is larger than that of M2, the load resistances must be chosen appropriately for proper biasing.

The following subsections present a detailed analysis of the tunable active shunt-feedback scheme, which is realized through forward body biasing, and the LNA design equations, including input matching, gain, NF and linearity.

Forward Body Biased Tunable Feedback

A forward body biasing scheme in M3 is used to control the feedback coefficient and to adjust the gain and input matching of the LNA. Increasing the voltage at $V_{sub}$ leads

to a lower threshold voltage and therefore to a higher feedback coefficient.

Figure 6-3: The complete schematic of the ULP LNA including the output buffer for testing purposes.

The following equations describe the impact of forward body biasing on $g_{m3}$, the transconductance of M3, which is biased in the weak inversion region. The $g_m$ of a transistor biased in weak inversion can be expressed as [131]

$$g_m = \frac{I_{D0}}{nU_T}\, e^{\left(\frac{V_{GS} - V_{TH}}{2nU_T}\right)}. \tag{6.1}$$

$V_{TH}$ is given by

$$V_{TH} = V_{TH0} + \alpha\left(\sqrt{2\phi_F - V_{BS}} - \sqrt{2\phi_F}\right), \tag{6.2}$$

where $\alpha$ is a process dependent body effect parameter and $\phi_F$ is the substrate Fermi potential with typical values of V$^{1/2}$. The variation of $g_m$ with $V_{TH}$ can be found by

$$\frac{\partial g_m}{\partial V_{TH}} = -\frac{I_{D0}}{(nU_T)^2}\, e^{\left(\frac{V_{GS} - V_{TH}}{2nU_T}\right)}. \tag{6.3}$$

As can be seen in (6.3), $g_m$ varies exponentially with respect to $V_{TH}$ in the weak inversion region. Simulation results reveal that, by varying the substrate voltage between V,

$g_{m3}$ can be tuned by 50% around 2-mS. The effect of the $V_{sub}$ variation on the S11 and S21 of the LNA is illustrated in Fig. 6-4. As can be seen, by increasing the substrate voltage from 0-V to 0.4-V, S11 improves by 5-dB while S21 at 1-GHz is reduced by only 1.5-dB. Additionally, $V_{sub}$ has negligible impact on the NF, causing an increase of only 0.05-dB over the entire range. Hence, this tuning ability can be employed to compensate for variations in the input matching across process corners. It should be noted that the bias current variation in M3 is designed to be in a range that does not disturb the desired operating point of the input stage.

Figure 6-4: Variation of S11 and S21 of the LNA with the substrate voltage of M3 ($V_{sub}$ = 0-V, 0.1-V, 0.2-V, 0.3-V and 0.4-V).

Input Impedance

Active shunt feedback plays an important role in the ability to match the input impedance of the LNA to 50-Ω while minimizing the current consumption. By considering the impedance of capacitor $C_1$ to be negligible, the input impedance of the LNA can be expressed as

$$Z_{in} = \frac{1}{sC_{gs}} \parallel \frac{(R_{L1} \parallel R_{L2})(g_{o1} + g_{o2}) + 1}{(G_1 + G_2)(1 + A)}, \tag{6.4}$$

where $g_{oi}$ is the output conductance of transistor $M_i$, $C_{gsi}$ is the gate-source capacitance of transistor $M_i$, $A = g_{m3}(R_{L1} \parallel R_{L2})$, $G_1 = g_{m1} + g_{o1}$, $G_2 = g_{m2} + g_{o2}$, and $C_{gs} = C_{gs1} + C_{gs2}$.

Figure 6-5: Input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{o1} + g_{o2}$ (1-µS, 1-mS, 2-mS, 3-mS, 4-mS and 5-mS) at 1-GHz.

To investigate the $g_{m3}$ needed to provide an input impedance close to 50-Ω, Fig. 6-5 plots the input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{o1} + g_{o2}$ at 1-GHz, based on the design parameters used in this work. For instance, $g_{m1} + g_{m2}$ is taken to be 10-mS, which is achievable with very low power consumption. It is interesting to note the impact of the output conductance of the transistors on the input impedance. The output conductance has become an important circuit design parameter, since the low voltage headroom caused by supply voltage reduction and the migration towards nanometer CMOS technology nodes both degrade it [22]. As shown in the figure, a higher output conductance means that a higher $g_{m3}$ is required to achieve good input matching. Fig. 6-5 also shows that an input impedance match can be achieved with a $g_{m3}$ below 5-mS, which means that a low power input matching network is feasible. In this design, $g_{m3}$ is chosen to be 2-mS while consuming only 100-µA. At low frequencies, and ignoring the output conductance

for simplicity, the input matching condition simplifies to

$$R_{in} = \frac{1}{(g_{m1} + g_{m2})(1 + A)}. \tag{6.5}$$

Equation (6.5) clearly shows that the feedback network boosts the effective $g_m$ by a factor of $(1 + A)$, which allows the circuit to achieve good input matching with low power consumption.

It is instructive to investigate the impact of the impedance of capacitor $C_1$ on the input matching. As noted earlier, the function of $C_1$ is to combine the small signal output currents of the input transistors M1 and M2 without affecting their bias points. At low frequencies, where $j\omega C_1 \ll 1$, the input impedance of the LNA can be found by

$$R_{in} = \frac{(1 + g_{o1} R_{L1})(1 + g_{o2} R_{L2})}{(1 + g_{o2} R_{L2})\,G_1 + G_2 (1 + g_{o1} R_{L1})(1 + g_{m3} R_{L2})}. \tag{6.6}$$

Comparing (6.4) with (6.6) shows that the feedback coefficient at low frequencies increases to $A_{low\,freq} = g_{m3} R_{L2}$, which is almost 2 times larger than $A$ at high frequencies. Moreover, $R_{L1}$ and $R_{L2}$ are no longer in parallel at low frequencies, since $C_1$ is an open circuit and $V_2$ and $V_3$ are not ac shorted together. Ignoring the output conductances and assuming $g_{m1} \approx 2g_{m2}$ and $A_{low\,freq} = 2A$, the equation simplifies to

$$R_{in} = \frac{1}{g_{m1}(1.5 + A)}. \tag{6.7}$$

Comparing (6.5) and (6.7) reveals that the combining capacitor $C_1$ reduces the input impedance of the LNA and improves the impedance matching, which is suitable for ultra-low power input matching. Fig. 6-6 shows the input impedance matching of the LNA for the two cases of $C_1$ = 0 (physically, $C_1$ does not exist and there is an open circuit between $V_2$ and $V_3$) and $C_1$ = 3.3-pF. As shown in the figure, the S11 at low frequencies is almost equal in both cases, as expected. However, at high frequencies the advantage of the combining capacitor is evident.
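The effect of the combining capacitor on the input resistance can be illustrated with a few lines of Python based on one reading of expressions (6.4) and (6.6). This is a sketch under stated assumptions rather than the design procedure: the function names are hypothetical, the 1-kΩ load resistors and the 2:1 split of the 10-mS input transconductance are assumed for illustration, and only $g_{m1} + g_{m2}$ = 10-mS and $g_{m3}$ = 2-mS come from the text.

```python
import math

def par(a, b):
    """Parallel combination of two resistances."""
    return a * b / (a + b)

def zin_c1_short(f, gm1, gm2, gm3, go1, go2, rl1, rl2, cgs):
    """Eq. (6.4): C1 treated as an ac short between the two drains."""
    rl = par(rl1, rl2)
    a = gm3 * rl                              # feedback coefficient A
    g_total = (gm1 + go1) + (gm2 + go2)       # G1 + G2
    z_fb = (1 + rl * (go1 + go2)) / (g_total * (1 + a))
    y = 1j * 2 * math.pi * f * cgs + 1 / z_fb
    return 1 / y

def rin_c1_open(gm1, gm2, gm3, go1, go2, rl1, rl2):
    """Eq. (6.6): low-frequency input resistance with C1 open."""
    g1, g2 = gm1 + go1, gm2 + go2
    num = (1 + go1 * rl1) * (1 + go2 * rl2)
    den = (1 + go2 * rl2) * g1 + g2 * (1 + go1 * rl1) * (1 + gm3 * rl2)
    return num / den

# Assumed example: gm1 = 2*gm2, equal 1 kohm loads, output conductances ignored.
GM1, GM2, GM3 = 20e-3 / 3, 10e-3 / 3, 2e-3
RL1 = RL2 = 1000.0

if __name__ == "__main__":
    shorted = zin_c1_short(0.0, GM1, GM2, GM3, 0.0, 0.0, RL1, RL2, 0.0)
    opened = rin_c1_open(GM1, GM2, GM3, 0.0, 0.0, RL1, RL2)
    print(f"C1 shorting the drains: Rin = {shorted.real:.1f} ohm")
    print(f"C1 open:                Rin = {opened:.1f} ohm")
```

With the output conductances set to zero, the two functions reduce to (6.5) and (6.7) respectively, so the sketch also serves as a check of the simplification step between (6.6) and (6.7).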

Figure 6-6: The variation of S21 and S11 of the LNA for two $C_1$ values ($C_1 = 0$ and $C_1 = 3.3$-pF).

Voltage Gain

Assuming that $V_1$ and $V_2$ are ideally ac coupled together through $C_1$, all of the transistors, M1, M2, M3, and load resistors, $R_{L1}$, $R_{L2}$, contribute to the voltage gain of the LNA. The gain of the proposed LNA with this assumption can be expressed as

$$A_v = \frac{Z_L\,G}{R_s\,G\,(1 + A) + \left(1 + Z_L(g_{o1} + g_{o2})\right)(1 + sC_{gs})}, \tag{6.8}$$

where $G = g_{m1} + g_{m2} + g_{o1} + g_{o2}$, $Z_L = Z_{L1} \parallel Z_{L2}$, and $Z_{L1}$ and $Z_{L2}$ are the impedances at nodes $V_2$ and $V_3$, respectively. Equations (6.4) and (6.8) highlight the fundamental trade-off between good input matching and high gain. Increasing $g_{m3}$ reduces the input impedance, but at the cost of lower voltage gain. It should be noted that the current-reuse scheme in the feedback stage leads to an increase in $g_{m1}$ as $g_{m3}$ is increased, which improves both input matching and voltage gain. At low frequencies, and ignoring the output conductances of M1 and M2 for simplicity, the voltage gain of the LNA can be simplified to

$$A_v = \frac{(R_{L1} \parallel R_{L2})(g_{m1} + g_{m2})}{R_s\,(g_{m1} + g_{m2})(1 + A) + 1}. \tag{6.9}$$
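The matching-versus-gain trade-off in (6.9) can be illustrated with a short sweep of $g_{m3}$. This is a hedged Python sketch under assumed values (10-mS total input transconductance, 1-kΩ and 1.1-kΩ loads, 50-Ω source); the helper name `av_low_freq` is hypothetical, and the simplified expression ignores output conductances and capacitances, so it is not expected to reproduce the measured S21.

```python
import math


def av_low_freq(gm_sum, gm3, rl1, rl2, rs=50.0):
    """Low-frequency voltage gain per (6.9), output conductances ignored."""
    rl_par = rl1 * rl2 / (rl1 + rl2)   # R_L1 || R_L2
    a = gm3 * rl_par                   # feedback coefficient A
    return rl_par * gm_sum / (rs * gm_sum * (1.0 + a) + 1.0)


# Assumed example values (illustrative only).
GM_SUM = 10e-3          # g_m1 + g_m2 in siemens
RL1, RL2 = 1.0e3, 1.1e3

gains = {gm3: av_low_freq(GM_SUM, gm3, RL1, RL2) for gm3 in (0.0, 2e-3, 4e-3)}
for gm3, av in gains.items():
    print(f"g_m3 = {gm3 * 1e3:.0f} mS -> A_v = {av:.2f} ({20 * math.log10(av):.1f} dB)")
```

In this simplified model a larger $g_{m3}$ always lowers the gain, because the sketch holds $g_{m1}$ fixed and therefore does not capture the current-reuse effect mentioned above.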

A more realistic design perspective is achieved by considering the impedance of capacitor $C_1$. This capacitor has a large impact on the bandwidth and gain of the LNA. By taking into account the impedance of $C_1$, the voltages at nodes $V_1$ and $V_2$ are related to each other by

$$V_2 = V_1\,\frac{j\omega C_1 (g_{m1} + g_{m2}) + \dfrac{g_{m2}}{R_{L1}}}{j\omega C_1 (g_{m1} + g_{m2}) + \dfrac{g_{m1}}{R_{L2}}}. \tag{6.10}$$

At low frequencies, where $j\omega C_1 \ll 1$, this equation simplifies to $V_2 = V_1\,(g_{m2}R_{L2})/(g_{m1}R_{L1})$, and at high frequencies, where $j\omega C_1 \gg 1$, $V_2$ equals $V_1$. Therefore, the voltage gain of the LNA is expressed by

$$A_v = R_{L1}\,\frac{\dfrac{g_{m1}}{R_{L1} + R_{L2}} + F(\omega)}{\dfrac{1 + R_s\left((g_{m1} + g_{m2}) + g_{m3}g_{m2}R_{L2}\right)}{R_{L1} + R_{L2}} + G(\omega)}. \tag{6.11}$$

$$F(\omega) = j\omega C_1 (g_{m1} + g_{m2}). \tag{6.12}$$

$$G(\omega) = j\omega C_1 \left(1 + R_s (g_{m1} + g_{m2})(1 + A)\right). \tag{6.13}$$

At high frequencies, where $j\omega C_1 \gg 1$, (6.11) simplifies to (6.8), and at low frequencies, where $j\omega C_1 \ll 1$, the two functions $F(\omega) \approx 0$ and $G(\omega) \approx 0$ and the voltage gain is expressed by

$$A_v = \frac{R_{L1}\,g_{m1}}{1 + R_s\left((g_{m1} + g_{m2}) + g_{m3}g_{m2}R_{L2}\right)}. \tag{6.14}$$

The impact of capacitor $C_1$ on the voltage gain can be understood by comparing (6.14) and (6.9). When $C_1$ is chosen large enough, the numerator of the voltage gain is determined by $(g_{m1} + g_{m2})(R_{L1} \parallel R_{L2})$. However, when $j\omega C_1 \ll 1$ the numerator is formed by $R_{L1}g_{m1}$. Assuming that $g_{m1} \approx 2g_{m2}$ and $R_{L1} \approx R_{L2}$, the numerator in (6.14) is larger than that in (6.9) by a factor of 4/3 (the numerator in (6.9) is 3/4 of that in (6.14)), whereas the denominator in (6.14) is only slightly larger than that in (6.9). Thus, the overall gain is larger once the combining capacitor is removed. Fig. 6-6 shows the S21 plots for the two cases of $C_1 = 0$ and $C_1 = 3.3$-pF. As expected, at low frequencies the two S21 plots match each other. However, at higher frequencies the lower value of the numerator and the parasitic capacitance of the MIM capacitor reduce the gain and limit the

bandwidth. Hence, it can be deduced that the combining capacitor does not improve the voltage gain and reduces the bandwidth.

Noise Figure

The main noise sources in the proposed LNA are the drain current noise of transistors M1, M2 and M3, $\overline{i_n^2}/\Delta f = 4kT(\gamma/\alpha)\,g_m$, and the thermal noise of the load resistors, $\overline{i_n^2}/\Delta f = 4kT/R$. Here $\gamma$ is the thermal noise coefficient and $\alpha$ is $g_m/g_{d0}$. In this analysis, the output impedance of the transistors is assumed to be infinite, and the gate resistance ($R_g$) noise of the transistors is ignored since it can be minimized by proper layout techniques.

First, it is necessary to discuss the noise mechanisms in the LNA. Consider Fig. 6-7, which shows the proposed LNA with the noise contributed by M1 only, for simplicity. As shown in the figure, this noise current creates a noise voltage at $V_2$ and a correlated noise voltage with a smaller amplitude and opposite phase at $V_1$. Accordingly, the noise voltage at $V_1$ is amplified by M2 and appears at node $V_3$. If the impedance between $V_2$ and $V_3$ is small enough, the noise voltages on $V_2$ and $V_3$ (which have opposite phases) are added together and part of the noise from M1 is cancelled. The noise cancellation mechanism also applies to the noise generated by M2. On the contrary, if $V_2$ and $V_3$ are not ac coupled together, in addition to all of the M1 noise that appears on $V_2$, part of the M2 noise is also added without being cancelled. To fully underline the impact of $C_1$ on the NF, the noise analysis is discussed for two cases: 1) $j\omega C_1 \gg 1$ and 2) $j\omega C_1 \ll 1$.

$j\omega C_1 \gg 1$

In this case $V_2$ and $V_3$ are ideally ac coupled to each other. Therefore, the noise factor of the LNA can be found by

$$NF = 1 + \frac{\gamma}{\alpha}\,\frac{1}{(g_{m1} + g_{m2})R_s} + \frac{\gamma}{\alpha}\,g_{m3}R_s + \frac{\left[1 + (g_{m1} + g_{m2})R_s\right]^2}{R_s\,(g_{m1} + g_{m2})^2\,(R_{L1} \parallel R_{L2})}. \tag{6.15}$$

The second term in the equation is due to the noise contribution of M1 and M2. The third term is from the feedback transistor, and the last term highlights the noise contribution

Figure 6-7: The noise mechanisms in the LNA. Only the noise contribution of M1 is highlighted for simplicity.

of the loads. Equation (6.15) can be simplified at the input matching condition to

$$NF = 1 + \frac{\gamma}{\alpha}(1 + A) + \frac{\gamma}{\alpha}\,g_{m3}R_s + \frac{R_s}{R_{L1} \parallel R_{L2}}\,(2 + A)^2. \tag{6.16}$$

To better understand the noise contribution of the load resistors, Fig. 6-8 models the NF of the LNA due only to the load resistors, in MATLAB. As shown in the figure, increasing the load resistance decreases the noise figure. The voltage drop on the load resistors and the available voltage headroom are also important factors in determining the load resistance value. On the other hand, as shown in the figure, a higher $(g_{m1} + g_{m2})$ decreases the NF, so lower load resistance values can be tolerated to achieve the same noise figure.
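The load-resistor trend described above can be reproduced with a few lines of Python in place of MATLAB. This is a sketch under stated assumptions: it reads the last term of (6.15) as $[1 + (g_{m1}+g_{m2})R_s]^2 / [R_s (g_{m1}+g_{m2})^2 (R_{L1} \parallel R_{L2})]$, keeps only that term plus the source contribution, and uses illustrative values; the function names are hypothetical.

```python
import math


def load_noise_term(gm_sum, rl_par, rs=50.0):
    """Excess noise factor from the load resistors (assumed last term of (6.15))."""
    return (1.0 + gm_sum * rs) ** 2 / (rs * gm_sum ** 2 * rl_par)


def nf_load_only_db(gm_sum, rl_par, rs=50.0):
    """Noise figure in dB when only the load resistors contribute noise."""
    return 10.0 * math.log10(1.0 + load_noise_term(gm_sum, rl_par, rs))


# Sweep assumed values of g_m1 + g_m2 and of R_L1 || R_L2.
for gm_sum in (5e-3, 10e-3, 20e-3):
    row = ", ".join(
        f"{nf_load_only_db(gm_sum, rl):.2f} dB @ {rl:.0f} ohm"
        for rl in (250.0, 500.0, 1000.0)
    )
    print(f"g_m1+g_m2 = {gm_sum * 1e3:.0f} mS: {row}")
```

Both trends follow directly from the expression: the term scales as $1/(R_{L1} \parallel R_{L2})$, and it decreases monotonically with $(g_{m1}+g_{m2})$ towards the floor $R_s/(R_{L1} \parallel R_{L2})$.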

Figure 6-8: The NF of the LNA due only to the load resistances, modelled in MATLAB, versus $R_{L1} \parallel R_{L2}$ (kΩ) for $g_{m1}+g_{m2}$ = 5-mS, 8-mS, 11-mS, 20-mS, and 40-mS.

$j\omega C_1 \ll 1$

In this case, $V_2$ and $V_3$ are not ac coupled to each other and, as discussed earlier, each node has a separate noise voltage due to $\overline{i_{n1}^2}$. These noise voltages due to M1 are given by

$$V_{2,n} = -R_{L1}\sqrt{4kT\frac{\gamma}{\alpha}g_{m1}}\;\frac{1 + R_s g_{m2}(1 + R_{L2}g_{m3})}{1 + R_s(g_{m1} + g_{m2}) + R_s R_{L2}\,g_{m2}g_{m3}}, \tag{6.17}$$

$$V_{3,n} = +\sqrt{4kT\frac{\gamma}{\alpha}g_{m1}}\;\frac{R_{L2}R_s\,g_{m2}}{1 + R_s(g_{m1} + g_{m2}) + R_s R_{L2}\,g_{m2}g_{m3}}. \tag{6.18}$$

The same equations can be developed for the noise contribution of M2. As a result, not only does $V_2$ have a large noise contribution from $\overline{i_{n1}^2}$ that is not cancelled by the correlated noise at $V_3$, but it also contains a noise contribution from $\overline{i_{n2}^2}$, as can be deduced from (6.18). Fig. 6-9(a) highlights the dependence of the noise figure on the value of $C_1$. As can be seen in the figure and as explained above, when there is no capacitor between $V_2$ and $V_3$ the NF is very high. Once the capacitor is added, increasing the value of $C_1$ improves the NF, especially at low frequencies. On the other hand, as discussed previously, increasing $C_1$ reduces the 3-dB bandwidth due to the larger parasitic capacitances at $V_2$ and $V_3$. Fig. 6-9(b) highlights

Figure 6-9: The impact of capacitor $C_1$ on (a) the NF of the LNA versus frequency, for $C_1$ = 0, 2.1-pF, 3.5-pF, 4.8-pF, 7.4-pF, and 11-pF, and (b) the 3-dB bandwidth of the LNA and the NF value at 100-MHz versus $C_1$.

the trade-off between the NF at low frequencies (100-MHz) and the 3-dB bandwidth of the LNA, and provides a good guideline for determining the optimum value of $C_1$ based on the target specifications.

It should be noted that the noise cancellation mechanism discussed for this LNA is different from the conventional noise cancellation technique [100]. In the conventional noise cancellation scheme the noise of the matching device is fully cancelled through an extra

gain stage that does not help with input impedance matching. In that case, the NF of the LNA is determined by the NF of the extra gain stage, whereas in this design the noise of the input matching devices (M1 and M2) is partially cancelled through $C_1$, making the noise contribution of M1 and M2 as good as that of a single common-gate transistor with $g_m$ equal to $(g_{m1} + g_{m2})$.

Nonlinearity

The nonlinearity of the LNA is analysed using a Taylor series. The nonlinearity of the output conductance of the transistor is also included in the analysis, since a low drain-source voltage makes the output conductance one of the largest contributors to the overall nonlinearity of the LNA, specifically in deep sub-micron technologies [22, 145, 159]. At high frequencies, where $j\omega C_1 \gg 1$, the nonlinear source-drain current of the NMOS transistor in the common-gate configuration shown in Fig. 6-3 can be modelled for small-signal operation using the Taylor series

$$i_{sd1} = -\left(g_{m1}v_{gs} + g_{ds1}v_{sd} + \frac{g'_{m1}}{2}v_{gs}^2 + \frac{g'_{ds1}}{2}v_{sd}^2 + \frac{g''_{m1}}{6}v_{gs}^3 + \frac{g''_{ds1}}{6}v_{sd}^3\right)$$

$$v_{gs} = -v_1 \;\;\&\;\; v_{ds} = v_{21}$$

$$i_{sd1} = g_{m1}v_1 + g_{ds1}v_{21} - \frac{g'_{m1}}{2}v_1^2 + \frac{g'_{ds1}}{2}v_{21}^2 + \frac{g''_{m1}}{6}v_1^3 + \frac{g''_{ds1}}{6}v_{21}^3, \tag{6.19}$$

where $g'_m$ and $g''_m$ are the first and second order derivatives of $g_m$ with respect to $V_{GS}$, and $g'_{ds}$ and $g''_{ds}$ are the first and second order derivatives of $g_{ds}$ with respect to $V_{DS}$. In (6.19) the cross terms are ignored for simplicity. The PMOS has the same current characteristics, except that its opposite carrier type has to be taken into consideration, which leads to

$$i_{sd2} = g_{m2}v_1 + g_{ds2}v_{21} + \frac{g'_{m2}}{2}v_1^2 + \frac{g'_{ds2}}{2}v_{21}^2 + \frac{g''_{m2}}{6}v_1^3 + \frac{g''_{ds2}}{6}v_{21}^3.$$

Hence, the output current of the input stage without the transistor M3 can be found by

$$i_{out} = a_1 v_1 + b_1 v_{21} + a_2 v_1^2 + b_2 v_{21}^2 + a_3 v_1^3 + b_3 v_{21}^3, \tag{6.20}$$

where $a_1 = (g_{m1} + g_{m2})$, $b_1 = (g_{ds1} + g_{ds2})$, $a_2 = (g'_{m2} - g'_{m1})/2$, $b_2 = (g'_{ds2} + g'_{ds1})/2$, $a_3 = (g''_{m1} + g''_{m2})/6$, and $b_3 = (g''_{ds1} + g''_{ds2})/6$. As can be seen, the complementary characteristics of the NMOS and PMOS devices can lead to cancellation of the second-order distortion due to $g'_m$. To fully capture the nonlinear behaviour of the LNA, M3 has to be considered. The feedback transistor can be treated as a linear network, since its $g_m$ is very small. Hence, the overall output current can be simplified to [160]

$$i_{out} = \frac{a_1}{(1 + B)}v_1 + \frac{b_1}{(1 + B)}v_{21} + \frac{a_2}{(1 + B)^3}v_1^2 + \frac{b_2}{(1 + B)^3}v_{21}^2 + \frac{a_3(1 + B) - 2(a_2 + b_2)^2 A}{(1 + B)^5}v_1^3 + \frac{b_3(1 + B) - 2(a_2 + b_2)^2 A}{(1 + B)^5}v_{21}^3. \tag{6.21}$$

In this work, $B \approx 1$, and theoretically the feedback reduces the second- and third-order distortion terms by $(1 + B)^3 \approx 8$ and $(1 + B)^4 \approx 16$, respectively. The use of complementary current-reuse leads to a reduction of the second-order distortion due to $g'_m$, as shown in (6.20). However, since the currents in M1 and M2 are not equal in this design, and since the distortion of the output conductance still exists, complete cancellation of the second-order terms $a_2$ and $b_2$ does not occur. On the other hand, it is clear that at low frequencies, where $j\omega C_1 \ll 1$, $V_2$ and $V_3$ are not shorted together and hence the second-order distortion cancellation does not occur completely.

Stability

It is important to investigate the stability of the LNA, considering the use of feedback. At high frequencies, the stability is examined using the k-factor. Unconditional stability is achieved if $k > 1$ and $|\Delta| < 1$ for the frequency band of interest. The k-factor and $\Delta$ can be found by [161]

$$k = \frac{1 - |S_{11}|^2 - |S_{22}|^2 + |S_{11}S_{22} - S_{12}S_{21}|^2}{2\,|S_{12}S_{21}|} \tag{6.22}$$

$$\Delta = S_{11}S_{22} - S_{12}S_{21}. \tag{6.23}$$

The most likely element to cause instability in the LNA is M3. As a result, to demonstrate the effect of the tunable feedback coefficient on the k-factor of the proposed LNA, the body bias

Figure 6-10: The k-factor of the LNA for multiple forward body biasing voltages of transistor M3 ($V_{sub}$ = 0, 0.2-V, and 0.4-V).

of M3 is swept from 0 to 0.4-V and the simulation results are provided in Fig. 6-10. Also, $|\Delta|$ is less than 0.05 for all the bias ranges. Therefore, the LNA is unconditionally stable for the entire range of operation.

Employing the aforementioned design equations and techniques, the ultra-low power inductorless LNA was implemented. The values of the bias voltages are determined using the extended biasing metric for ultra-low power and ultra-low voltage designs [22]. The component values and transistor sizes of the proposed LNA are provided in Table 6-1.

6.4 Measurement Results and Discussions

The expected theoretical results are verified by implementing the circuit in a 0.13-µm IBM CMOS technology. Fig. 6-11 shows the die micrograph of the LNA. As highlighted on the photo, the total area of the LNA is only 70-µm × 75-µm. The chip is bonded onto a PCB (chip-on-board technology) to provide DC biasing, and the RF functionality is tested using a probe station. The chip-on-board technology provides more flexibility in controlling the length of the bond wires and hence in reducing the associated parasitics. The measurement setup of this LNA is shown in Fig. 6-12. As shown in the figure, GSG 150-µm pitch probes were used to measure the input and output characteristics.
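For reference, the unconditional-stability test of (6.22) and (6.23) is straightforward to evaluate on a set of two-port S-parameters, whether simulated or exported from a network analyzer. The Python sketch below is illustrative only: the function name is hypothetical, and the S-parameter values are assumed round numbers in the general range of an LNA of this kind, not measured data from this work.

```python
def rollett_stability(s11, s12, s21, s22):
    """Return (k, |delta|) for one frequency point, per (6.22) and (6.23)."""
    delta = s11 * s22 - s12 * s21
    k = (1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (
        2.0 * abs(s12 * s21)
    )
    return k, abs(delta)


# Assumed linear-magnitude S-parameters at a single frequency (hypothetical).
S11 = complex(0.30, 0.0)    # about -10.5 dB
S22 = complex(0.30, 0.0)
S21 = complex(4.0, 0.0)     # about 12 dB
S12 = complex(0.018, 0.0)   # about -35 dB

k, delta_mag = rollett_stability(S11, S12, S21, S22)
unconditionally_stable = k > 1.0 and delta_mag < 1.0
print(f"k = {k:.2f}, |delta| = {delta_mag:.3f}, stable = {unconditionally_stable}")
```

In practice the test is repeated at every frequency point in the band of interest and for every bias setting, as done for the $V_{sub}$ sweep above.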

Table 6-1: Device Dimensions and Component Values

M1    | (55-µm/120-nm)
M2    | (70-µm/120-nm)
M3    | (20-µm/120-nm)
R1    | 1-kΩ
R2    | 1.1-kΩ
C1    | 3.3-pF
Rsub  | 1.1-kΩ

Figure 6-11: The die micrograph of the LNA (70-µm × 75-µm core, with GSG input and output pads, the buffer, and the $V_{sub}$, $V_{bp}$ and $V_{bn}$ bias pads). The DC bias pads are placed on the side of the die, and some of them are highlighted on the photo. A common-source buffer is used to drive the 50-Ω resistance of the measurement equipment, and its characteristics were de-embedded from the S21 of the LNA.

Figure 6-12: The measurement setup of the chip-on-board LNA with GSG probes.

Figure 6-13: The post-layout simulation and measured results for S21 and S11 of the LNA.

The measured and post-layout simulation (PLS) results of S21 and S11 are plotted in Fig. 6-13. The LNA achieves an S21 of 12.3-dB with a 3-dB BW of GHz. The S11 is better than -9-dB in the band of interest, and can be tuned by 5-dB using the body bias of M3. In all cases, there is good agreement between the measured and post-layout simulation results. Fig. 6-14 highlights the S22 and S12 of the LNA. The buffer provides

Figure 6-14: The measured S12 and S22 of the LNA.

an average S22 of below -10-dB, while the reverse isolation is around 35-dB in the band of interest. The measured S12 depends strongly on the buffer characteristics and also on the isolation calibration of the VNA.

The NF of the LNA is plotted in Fig. 6-15. As can be seen, the NF is below 5-dB between GHz. However, as discussed in section 6.3.4, at low frequencies the coupling between $V_2$ and $V_3$ weakens, hence the noise cancellation is degraded. As a result, the NF increases up to 6-dB.

The linearity of the LNA is mostly limited by the low voltage headroom available for each component when using a 1-V supply voltage. The input-output characteristics of the LNA are shown in Fig. 6-16. The measured IIP3 is -10-dBm at 2-GHz with a two-tone spacing of 100-MHz. The 1-dB compression point is at -20-dBm. The IIP2 at 2-GHz is 5-dBm.

The performance of the LNA is simulated under process corner and temperature variations. The performance variation of the LNA in gain, NF and input matching can be compensated by adjusting the feedback coefficient and the biasing voltages of the transistors. For that reason, ultra-low power DC bias control loops can be designed and implemented to adjust these parameters automatically.
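As a reminder of how an IIP3 value is extracted from a two-tone test, the sketch below applies the standard extrapolation $\mathrm{IIP3} = P_{in} + (P_{fund} - P_{IM3})/2$, which assumes the fundamental and IM3 products follow their ideal 1 dB/dB and 3 dB/dB slopes. The function name and the example power levels are hypothetical and are not readings taken from the measurement in this chapter.

```python
def iip3_dbm(p_in_dbm, p_fund_out_dbm, p_im3_out_dbm):
    """Two-tone IIP3 extrapolation from a single input power level.

    Valid only in the small-signal region, where the fundamental rises
    1 dB/dB and the third-order intermodulation product rises 3 dB/dB.
    """
    return p_in_dbm + (p_fund_out_dbm - p_im3_out_dbm) / 2.0


# Hypothetical readings from one two-tone measurement point.
P_IN = -30.0      # input power per tone, dBm
P_FUND = -18.0    # output fundamental, dBm
P_IM3 = -58.0     # output IM3 product, dBm

print(f"Extrapolated IIP3 = {iip3_dbm(P_IN, P_FUND, P_IM3):.1f} dBm")
```

Repeating the calculation at several input powers and checking that the result stays constant is a simple way to confirm that the chosen points lie in the small-signal region.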

Figure 6-15: The post-layout simulation and measurement results for the NF of the proposed LNA.

Figure 6-16: The measured and simulated input-output characteristics of the LNA (IIP3 = -10-dBm).

The performance of the LNA is summarized in Table 6-2 and is compared with state-of-the-art works in the literature. Two figures of merit, FoM I and FoM II, defined in (5.8) and (5.9), are used to compare the overall performance of the LNAs. The LNA presented here achieves ultra-low power consumption while offering comparable performance in terms of gain, noise figure, and linearity, which leads to one of the highest FoM I values and the highest FoM II (which includes area and linearity), when

Figure 6-17: Comparison of the state-of-the-art works in the literature with the designed LNA, versus power consumption (mW), based on (a) FoM I and (b) FoM II.

compared to other works. Fig. 6-17 highlights the overall performance of this LNA based on FoM I and FoM II versus power consumption and compares it with the other works in the literature.

6.5 Summary

In this work, an ultra-low power, very compact inductorless LNA was proposed, analysed and implemented. A complementary current-reuse technique was employed to reduce the overall DC current in the amplifier, and a tunable active shunt-feedback scheme provides ultra-low power input matching. A forward body biasing scheme was used to control the $g_m$ of the feedback transistor, and provides a means of tuning the input matching across process corners. Moreover, the complementary characteristics of the input stage lead to partial second-order distortion cancellation. The LNA was fabricated in an IBM 0.13-µm 1P8M

CMOS technology, and the measured performance showed a 12.3-dB gain, a GHz bandwidth, a minimum NF of 4.9-dB, an IIP3 of -10-dBm and a power consumption of 400-µW from a 1-V supply. The LNA presented here offers comparable performance in terms of gain, noise figure, and linearity, and achieves one of the best FoM I and FoM II values in the literature while consuming very low power.

Table 6-2: Performance summary and comparison with state-of-the-art LNAs

Parameter      | This Work | [120] 2006 JSSC | [148] 2012 JSSC | [118] 2011 E. Lett. | [123] 2013 RFIC | [99] 2010 TCAS I | [121] 2014 E. Lett. | [22] 2014 TVLSI
3-dB BW (GHz)  |
Power (mW)     |
Supply (V)     |
Gain max (dB)  | * 12.6
min NF (dB)    |
IIP3 (dBm)     | NA NA 6 9
Technology     | 0.13-µm | 0.13-µm | 0.13-µm | 0.18-µm | 0.18-µm | 0.13-µm | 0.13-µm | 90-nm
Area (mm^2)    |
FoM I          |
FoM II         | NA NA 6

*Estimated from the curves.

CHAPTER 7

A 0.5-V 250-µW Forward Body Bias Enhanced Complementary Current-reuse CMOS LNA

7.1 Introduction

Reducing the supply voltage of CMOS circuitry has a dramatic impact on transistor performance metrics such as the intrinsic gain ($g_m/g_{ds}$), the transit frequency ($f_t$), the minimum noise figure, etc. [22]. Additionally, the requirements of ultra-low power (ULP) design ($P_{diss}$ < 0.5-mW) place strict restrictions on the design options and the overall speed of the circuits. These limitations motivate using a combination of circuit design techniques to improve the characteristics of the transistors under ULV conditions without additional power consumption, in order to realize ULP, ULV, and wideband solutions. The design of ULP and ULV wideband LNAs has been an active research topic, e.g. [22, 27, 118, 120, 123, 124, 148].

In this chapter, the use of forward body biasing as a method to mitigate short channel effects and improve the transistor's intrinsic characteristics in ULV environments, with no additional power consumption, is examined. This technique, along with complementary current-reuse and active shunt-feedback that is tunable through forward body biasing, is then used to implement an LNA tailored for ULP and ULV applications. The proposed LNA achieves the best figure of merit in the literature, to the best of the author's knowledge, while consuming only 250-µW of power from a 0.5-V supply voltage.

The remainder of the chapter is organized as follows. First, the circuit description of the proposed LNA is discussed, including the analysis of the tunable input impedance, voltage gain, inductive $g_m$ boosting and noise figure. Then measurement results and discussions are provided, followed by the conclusion.

Figure 7-1: Simplified block diagram of the LNA with the feedback mechanisms.

7.2 LNA Circuit Description

The forward body biasing technique discussed in section 4.3, along with other ultra-low power and ultra-low voltage techniques that will be discussed, is employed to realize a ULP and ULV LNA. Fig. 7-1 illustrates the block diagram of the proposed ULV ULP LNA. The input impedance of the common-gate input transistors is controlled by the two feedback loops composed of A1 and A2. Amplifier A1 denotes the tunable active shunt-feedback stage, whose gain is controlled by a low power DC bias generation digital to analog converter (DAC). Amplifier A2 represents the inductive $g_m$-boosting that is used to improve the input matching and gain at high frequencies. The main common-gate transistor (M1) is forward body biased to enhance the performance, as discussed earlier. Moreover, the combination of M4 and the unity gain amplifier, A3, forms a folded-cascode amplifier to enhance the gain and the bandwidth of the LNA.

Fig. 7-2 shows the schematic of the proposed LNA. Two parallel amplifiers are used: the common-gate amplifier (M1) and the common-gate input folded-cascode amplifier (M4 and M2). M3 forms the tunable active shunt-feedback loop, and inductors $L_1$ and $L_2$ provide zero-voltage-drop current sources. Inductive $g_m$-boosting is implemented by placing an inductor

Figure 7-2: The complete schematic of the proposed ULV and ULP LNA along with the buffer for measurement purposes.

$L_3$ at the gate of M1 [27]. This boosts the effective $g_m$ of the device at the resonant frequency of inductor $L_3$ and the gate-source capacitance of M1, and enhances the bandwidth, gain, and input matching of the LNA without additional power consumption or noise contribution. The feedback from transistor M3 reduces the current needed for a broadband 50-Ω match, at the expense of slightly reduced gain and higher NF. An additional feature of this LNA is the ability to tune the feedback coefficient by adjusting the body bias of M3 through a low power DC bias generating DAC. This enables tuning of the LNA's input matching in the presence of process variations.

The following subsections present a detailed analysis of this circuit, including design equations for input matching, gain, and NF. The ULP and ULV design techniques used to realize the wideband 250-µW LNA presented here are also described.

Figure 7-3: The small-signal schematic of the proposed LNA for input impedance, voltage gain and NF analysis.

Input Impedance

As discussed earlier, one of the main challenges in any ultra-low power and ultra-low voltage LNA design is to provide input matching within the available power budget. A common-gate topology requires at least 1.7-mA of current to provide 50-Ω matching in the chosen 0.13-µm CMOS technology, assuming the amplifier is in moderate inversion (with a $g_m/I_D$ of 12-V$^{-1}$ in this region). This criterion leads to an overall power consumption greater than 0.85-mW with $V_{DD}$ = 0.5-V. Active shunt feedback plays an important role in the input impedance matching of the proposed LNA, while minimizing the current consumption.

To find the input impedance, the small-signal schematic of the proposed LNA shown in Fig. 7-3 is considered, where multiple assumptions have been made. The output conductance of M3 and M4 is ignored, since the current in this branch is very low and hence the $g_{ds}$ is negligible. Moreover, at each node the parasitic capacitances are summed and the total capacitance is presented for simplicity. Furthermore, capacitor $C_1$ is chosen large enough to ac couple $V_1$ and $V_3$, thus its impedance can be ignored. Following the aforementioned assumptions, the

input impedance of the LNA is given by

$$Z_{in} = \frac{g_{m2}g_{o1} + Y_2(g_{o1} + g_{o2}) + Y_3 G_2 + Y_2 Y_3}{F_1 + F_2 + F_3} \tag{7.1}$$

$$F_1 = g_{m3}\left(g_{m2}(G_1 + g_{m4}) + g_{o2}(g_{m1} + g_{m4})\right) \tag{7.2}$$

$$F_2 = Y_1 g_{m2}g_{o1} + Y_2 F_G + Y_3\left(g_{m2}(G_1 + g_{m4}) + g_{o2}(g_{m1} + g_{m4})\right) \tag{7.3}$$

$$F_3 = Y_1 Y_2 (g_{o1} + g_{o2}) + Y_2 Y_3 (g_{m1} + g_{m4}) + Y_1 Y_3 G_2 + Y_1 Y_2 Y_3 \tag{7.4}$$

$$F_G = g_{m3}G_1 + g_{m4}(g_{o1} + g_{o2}) + g_{m1}g_{o2}. \tag{7.5}$$

where $g_{oi}$ is the output conductance of transistor $M_i$, $G_i = g_{mi} + g_{oi}$ for $i = 1, 2$, and $Y_1 = \left(1 - \omega^2 L_1 (C_{p1} + C_{p3})\right)/j\omega L_1$ is the admittance of $L_1$ in parallel with the parasitic capacitances $C_{p1}$ and $C_{p3}$ at node $V_3$. $Y_2 = \left(1 - \omega^2 L_2 C_{p2}\right)/j\omega L_2$ is the admittance of $L_2$ in parallel with the parasitic capacitance $C_{p2}$ at node $V_2$, and $Y_3 = j\omega C_{p4}$ is the admittance of the parasitic capacitance at node $V_O$. At medium frequencies, where $Y_1 \approx 1/j\omega L_1$, $Y_2 \approx 1/j\omega L_2$ and $Y_3 \approx 0$, (7.1) can be simplified to

$$Z_{in} \approx \frac{g_{m2}g_{o1} + (g_{o1} + g_{o2})/(j\omega L_2)}{F_1 + (g_{m2}g_{o1})/(j\omega L_1) + F_G/(j\omega L_2)}, \tag{7.6}$$

The $g_{m3}$ required to provide an input impedance close to 50-Ω is investigated by applying the design parameters of the LNA in (7.6). Fig. 7-4 models the real part of the input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{o1,2}$ at 1-GHz using MATLAB. Low voltage design leads to low drain-source voltages, which degrade the output conductance of MOS transistors [22]. Thus, it is interesting to note the impact of output conductance on the input impedance, and how forward body biasing can lead to low power input matching. As shown in the figure, a higher output conductance means a higher $g_{m3}$ is required to achieve good input matching. Moreover, Fig. 7-4 shows that an input impedance match can be achieved with a $g_{m3}$ below 4-mS, which means that a low power input matching network is feasible. In this design, $g_{m3}$ is chosen to be 2-mS.
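The common-gate power budget quoted at the start of this subsection follows from simple arithmetic, sketched below in Python. It assumes the usual first-order relation $Z_{in} \approx 1/g_m$ for a plain common-gate stage without feedback; the function names are hypothetical. The result lands slightly below the quoted 1.7-mA and 0.85-mW figures, which are rounded up.

```python
def cg_bias_current(z_in_ohm, gm_over_id_per_volt):
    """Bias current for a plain common-gate stage with Z_in ~ 1/g_m."""
    gm_required = 1.0 / z_in_ohm              # siemens
    return gm_required / gm_over_id_per_volt  # amperes


def dc_power(v_dd, i_d):
    """DC power drawn from the supply by that bias current."""
    return v_dd * i_d


I_D = cg_bias_current(50.0, 12.0)   # 50-ohm match at g_m/I_D = 12 1/V
P_DC = dc_power(0.5, I_D)           # 0.5-V supply

print(f"Required g_m  = {1e3 / 50.0:.0f} mS")
print(f"Bias current  = {I_D * 1e3:.2f} mA")
print(f"DC power      = {P_DC * 1e3:.2f} mW")
```

Comparing this budget with the 2-mS feedback transconductance chosen above gives a sense of how much current the active shunt-feedback saves for the same 50-Ω target.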

It is interesting to consider the input impedance of the LNA at the resonant frequency of admittances $Y_1$ and $Y_2$, where $Y_1 \approx 0$ and $Y_2 \approx 0$. Note that $Y_1$ and $Y_2$ are designed to have close resonant frequencies, which are placed in the middle of the operating band. Hence, at these frequencies, and ignoring $Y_3$ for simplicity, (7.1) simplifies to

$$Z_{in} = \frac{1}{g_{m3}r_{o1}\left((G_1 + g_{m4}) + \dfrac{g_{o2}}{g_{m2}}(g_{m1} + g_{m4})\right)} \approx \frac{1}{A\,(G_1 + g_{m4})} \tag{7.7}$$

where $r_{o1}$ is the output resistance of M1 and $A = g_{m3}r_{o1}$ is the feedback coefficient. Equation (7.7) clearly shows that the feedback network boosts the effective $g_m$ by a factor of $A$, which allows the circuit to achieve good input matching with low power consumption.

It is enlightening to compare the impact of the feedback coefficient $A$ on the input impedance in this design with the one presented in [149], where there is an explicit load impedance. Fig. 7-5 illustrates the two circuits, with and without the explicit load impedance. To study the low frequency input impedance of both structures, a general equation describing both architectures is given by

$$Z_{in} = \frac{1}{G_1\left(1 + g_{m,FB}(Z_L \parallel r_{o1}) - (Z_L \parallel r_{o1})\,g_{o1}\right)} \tag{7.8}$$

It can be seen in (7.8) that if $Z_L \to \infty$ (which is the case in Fig. 7-5 (a)), the last term in the denominator of (7.8) cancels the first term and the input impedance is only a function of $A = g_{m,FB}\,r_{o1}$. Whereas when there is an explicit load, (7.8) simplifies to

$$Z_{in} = \frac{1}{G_1\left(\dfrac{r_{o1}}{r_{o1} + Z_L} + g_{m,FB}(Z_L \parallel r_{o1})\right)} \;\xrightarrow{\;r_{o1} \to \infty\;}\; Z_{in} \approx \frac{1}{g_{m1}(1 + A)} \tag{7.9}$$

Thus, in the case that an explicit load exists, the feedback coefficient has a more pronounced impact on the input impedance. Table 7-1 summarizes the input impedance values along with the feedback coefficients for Fig. 7-5 (a) and (b) for comparison.

Tunable Feedback Coefficient

As discussed in section (6.3.1), this LNA also has an additional feature to tune the feedback coefficient by adjusting the body bias of M3 through a DC bias generating DAC.

Figure 7-4: The real part of the input impedance of the LNA versus $g_{m3}$ for multiple values of $g_{oi}$ (0.1-mS, 0.5-mS, 1-mS, 2-mS, and 3-mS) at 1-GHz, with the desired range highlighted.

Figure 7-5: Simplified schematic of LNAs with active shunt-feedback (a) without explicit load impedance (this work) (b) with explicit load impedance [149].

This tuning mechanism can be used to optimize the input matching of the LNA in the presence of process variations. Increasing the substrate voltage of M3 through the DAC

Table 7-1: Comparison between shunt-feedback LNAs with and without explicit load

Architecture                          | Low freq. input impedance    | FB coefficient
Without explicit load (Fig. 7-5 (a))  | $1/(A(g_{m1} + g_{o1}))$     | $g_{m,FB}\,r_{o1}$
With explicit load (Fig. 7-5 (b))     | $1/((1+A)(g_{m1} + g_{o1}))$ | $g_{m,FB}(Z_L \parallel r_{o1})$

leads to a lower $V_{TH}$, and therefore to a higher feedback coefficient. The following equations describe the impact of FBB on the $g_{m3}$, which is biased in the weak inversion region. First, the $g_m$ of a transistor biased in weak inversion is represented by [131]

$$g_m = \frac{I_{D0}}{nU_T}\left(e^{\frac{V_{GS} - V_{TH}}{2nU_T}}\right)^2. \tag{7.10}$$

The derivative of the $g_m$ with respect to $V_{TH}$ can be found by

$$\frac{\partial g_m}{\partial V_{TH}} = -\frac{I_{D0}}{(nU_T)^2}\left(e^{\frac{V_{GS} - V_{TH}}{2nU_T}}\right)^2. \tag{7.11}$$

Equation (7.11) shows that $g_m$ varies exponentially with respect to $V_{TH}$ in the weak inversion region. This is confirmed by simulation results, which show that varying the substrate voltage between V can be used to tune $g_{m3}$ by 40% around 2-mS. This directly impacts the input impedance, as predicted by (7.6) and shown in Fig. 7-4. Fig. 7-6 illustrates how the substrate bias of M3, and thus $g_{m3}$, affects the S11 and S21 of the LNA. As can be seen, varying the DAC output voltage from 0 to 0.3-V results in an 8-dB improvement in S11, while the maximum S21 is altered by only 2-dB. Simulation results also reveal that this feedback coefficient variation has negligible impact on the NF. Consequently, this tuning capability can be exploited to compensate for variations in the input matching across process corners.

Voltage Gain

Assuming that $V_1$ and $V_2$ are ideally ac coupled to each other, transistors M1, M2, M3, and M4 all contribute to the voltage gain of the LNA. The gain of the proposed LNA with

Figure 7-6: The S21 and S11 variations with the feedback coefficient tuned through the substrate voltage of M3 ($V_{sub}$ = 0, 0.1-V, 0.2-V, and 0.3-V).

this assumption can be expressed as

$$A_v = \frac{g_{m2}(G_1 + g_{m4}) + g_{o2}(g_{m1} + g_{m4}) + Y_2 G_1}{F_{A1} + F_{A2} + F_{A3} + Y_1 Y_2 Y_3 R_s}, \tag{7.12}$$

where

$$F_{A1} = g_{o1}G_2 + R_s\,g_{m3}\,G_2\,(G_1 + g_{m4}) \tag{7.13}$$

$$F_{A2} = Y_1 R_s\,g_{o1}G_2 + Y_2\left(g_{o1} + g_{o2} + R_s\left(G_1(g_{m3} + g_{o2}) + g_{m4}(g_{o1} + g_{o2})\right)\right) + Y_3\left(G_2 + R_s(g_{m1} + g_{m4})G_2\right) \tag{7.14}$$

$$F_{A3} = Y_2 Y_3\left(1 + R_s(G_1 + g_{m4})\right) + Y_1 Y_2 R_s (g_{o1} + g_{o2}) + Y_1 Y_3 R_s G_2 \tag{7.15}$$

Equations (7.6) and (7.12) highlight the fundamental trade-off between good input matching and high gain. Increasing $g_{m3}$ reduces the input impedance, but at the cost of lower voltage gain. It is informative to find the voltage gain at the resonant frequency of $Y_1$ and $Y_2$, where

$Y_1 \approx Y_2 \approx 0$, and ignoring $Y_3$ for simplicity. Then, (7.12) simplifies to

$$A_v = \frac{r_{o1}(G_1+g_{m4}) + \frac{g_{o2}}{g_{m2}}\, r_{o1}\,(g_{m1}+g_{m4})}{\left(1+\frac{g_{o2}}{g_{m2}}\right) + R_s\, g_{m3}\, r_{o1}\left(1+\frac{g_{o2}}{g_{m2}}\right)(G_1+g_{m4})} \approx \frac{r_{o1}(G_1+g_{m4})}{1 + R_s A\,(G_1+g_{m4})}. \tag{7.16}$$

It is interesting to note that in this architecture the voltage gain also varies with $A$ rather than $(A+1)$, as discussed before for the input impedance.

Inductive $g_m$-boosting Technique

As discussed in Section 4.4.2, ultra-low power and ultra-low voltage design criteria reduce the possible design options and impose restrictions on the operating region of the transistors, which makes it difficult to achieve a high $f_T$ in the devices. A ULP transconductance boosting scheme helps overcome this challenge, and becomes necessary when operating at high frequencies. Transconductance boosting can be achieved without any additional power consumption by adding an inductor at the gate of a common-gate transistor to boost its $g_m$ at the resonance frequency of the inductor and the parasitic gate-source capacitance [27]. Placing an inductor at the gate of a common-gate transistor forms a feedback network which boosts the $v_{gs}$ of the transistor by $1/(1-\omega^2 C_{gs1} L_3)$. This gives rise to an effective $g_m$ of

$$G_{m,eff} = \frac{g_m}{1 + g_m R_s + j\omega C_{gs1} R_s - \omega^2 L_3 C_{gs1}}. \tag{7.17}$$

At high frequencies, (7.17) simplifies to

$$G_{m,eff} \approx \frac{\omega_T}{\omega}\,\frac{1}{jR_s - \omega L_3}. \tag{7.18}$$

Therefore, the additional inductor at the gate reduces the denominator of $G_{m,eff}$ up to the resonance frequency of $L_3$ and $C_{gs1}$. The improvement obtained by applying this inductive $g_m$-boosting technique to transistor M1 in the proposed LNA is simulated and highlighted in Fig. 7-7. It is clear from the figure that the S11 is improved at high frequencies by more than 6-dB, while the bandwidth of the LNA is extended by over 600-MHz.
It is important to note that these improvements are achieved without any additional power consumption in the LNA, and the only penalty is the extra area for the spiral inductor.
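As a rough numerical illustration of (7.17), the sketch below evaluates the magnitude of $G_{m,eff}$ with and without the gate inductor. The values of `GM` and `C_GS` are hypothetical and were chosen only for illustration; they are not taken from the fabricated design. The 5-nH inductance is the value annotated in Fig. 7-7.

```python
import math


def gm_eff(freq_hz, gm, r_s, c_gs, l_gate):
    """Effective transconductance of the common-gate stage, following (7.17)."""
    w = 2.0 * math.pi * freq_hz
    denom = 1.0 + gm * r_s + 1j * w * c_gs * r_s - (w ** 2) * l_gate * c_gs
    return gm / denom


# Illustrative values only (assumptions, not design data).
GM = 10e-3     # transconductance in siemens, hypothetical
R_S = 50.0     # source resistance in ohms
C_GS = 50e-15  # gate-source capacitance in farads, hypothetical
L3 = 5e-9      # gate inductance in henries (value annotated in Fig. 7-7)

if __name__ == "__main__":
    for f_ghz in (1.0, 2.0, 4.0):
        f = f_ghz * 1e9
        plain = abs(gm_eff(f, GM, R_S, C_GS, 0.0))
        boosted = abs(gm_eff(f, GM, R_S, C_GS, L3))
        print(f"{f_ghz:.1f} GHz: {plain * 1e3:.3f} mS without L3, "
              f"{boosted * 1e3:.3f} mS with L3")
```

With these assumed values the boost grows with frequency as $\omega^2 L_3 C_{gs1}$ eats into the real part of the denominator, which is the behaviour the text describes below the resonance of $L_3$ and $C_{gs1}$.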

Figure 7-7: The S21 and S11 variations (in dB, versus frequency in GHz) with the inductive peaking caused by $L_3$ at the gate of M1 (curves shown with $L_3$ = 5-nH and without it).

Noise Figure

The main noise sources in the proposed LNA are the drain current noise of transistors M1, M2, M3 and M4, which can be expressed as $\overline{i_n^2}/\Delta f = 4kT\,(\gamma/\alpha)\, g_m$, where $\gamma$ is the thermal noise coefficient and $\alpha$ is $g_m/g_{d0}$. In this analysis, the output resistance of M3 and M4 is assumed to be infinite, and the gate noise ($R_g$) of the transistors is ignored since it can be minimized by proper layout techniques. Due to frequency dependent components like the current source inductors ($L_1$, $L_2$), the noise contribution of the transistors is also frequency dependent. The detailed noise contribution of each transistor to the overall noise factor of the LNA can be expressed as

$$NF_{M1} = \frac{\gamma}{\alpha}\,\frac{g_{m1}}{R_s}\left(\frac{(G_2+Y_2)(1+R_s Y_1) + Y_2 R_s g_{m4}}{G_2(g_{m1}+g_{m4}) + g_{m2} g_{o1} + Y_2 G_1}\right)^2, \tag{7.19}$$

$$NF_{M2} = \frac{\gamma}{\alpha}\,\frac{g_{m2}}{R_s}\left(\frac{Y_2\big(1+R_s(G_1+g_{m4})\big) + R_s Y_1 Y_2}{G_2(g_{m1}+g_{m4}) + g_{m2} g_{o1} + Y_2 G_1}\right)^2, \tag{7.20}$$

$$NF_{M3} \approx \frac{\gamma}{\alpha}\, g_{m3} R_s, \tag{7.21}$$

$$NF_{M4} = \frac{\gamma}{\alpha}\,\frac{g_{m4}}{R_s}\left(\frac{G_2 + Y_2 R_s G_2 + Y_1 R_s G_1}{G_2(g_{m1}+g_{m4}) + g_{m2} g_{o1} + Y_2 G_1}\right)^2. \tag{7.22}$$

From (7.20) it can be seen that the noise from the cascode transistor M2 has a strong frequency dependency, as the numerator in (7.20) will theoretically approach 0 if $Y_2 \to 0$. This is because a high impedance at the source of M2 (at the resonant frequency of $L_2$ and $C_{p2}$) causes the noise current of M2 to circulate within the device itself, so that it does not contribute to the overall noise factor. Note that the resonant frequency of $Y_2$ was designed to be in the middle of the operating band to reduce the noise contribution of M2. On the other hand, transistors M1, M3 and M4 always contribute noise to the output, and their contribution varies only slightly with frequency. Fig. 7-8 shows the simulation results for the noise contribution of each transistor. It can be seen that the noise contribution of M2 is relatively large at low frequencies and approaches zero at the resonant frequency of $Y_2$, which agrees with the analysis. Moreover, it can be seen that M1 has the largest noise contribution at medium frequencies. It is interesting to note that the noise penalty of the feedback transistor (M3) is very low, and hence including it is a viable technique for ULP LNA designs.

It is instructive to study the NF of the LNA at the resonant frequency of $Y_1$ and $Y_2$, which are both designed to be in the middle of the operating band. In this band, the frequency dependent terms can be ignored and the overall noise factor simplifies to

$$NF = 1 + \frac{\gamma}{\alpha}\,\frac{1}{R_s\,(g_{m1}+g_{m4})} + \frac{\gamma}{\alpha}\, g_{m3} R_s. \tag{7.23}$$

The second term in the equation is due to the noise contribution of the common-gate transistors (M1 and M4), and the third term is from the feedback transistor. M2 has no impact on the overall noise figure at the resonance frequency of $Y_2$, as discussed before.

Ultra-Low Voltage Design Techniques

Self Forward Body Biasing

In summary, ultra-low voltage operation of the LNA is made feasible by employing several techniques.
The first is to mitigate the short channel effect through the FBB described in section 4.3, which is implemented by adding $R_{sub}$ between the bodies of M1 and M2 [142]. Fig. 7-9 illustrates the details of this scheme. The connection

of $R_{sub}$ between the bodies of M1 and M2 creates a ULP self-bias loop that includes the source-body diodes of M1 and M2. The current in this loop is controlled through $R_{sub}$ and is on the order of a few microamperes. The supply voltage is divided equally across the substrate diodes since $I_{sub}$ is very low. As discussed earlier, this technique improves the $r_{out}$ of both transistors and reduces the $V_{TH}$, which in turn reduces the gate-source voltage required to keep M1 and M2 in the moderate inversion region. It should be noted that the main advantage of this method of forward biasing the bulk is that it eliminates the need for a separate bias voltage.

Figure 7-8: The noise contribution (%) of transistors M1, M2, M3 and M4 to the noise factor of the LNA at multiple operating frequencies.

Folded-Cascode

Another key technique for ultra-low voltage operation is the folded-cascode architecture created using transistors M4 and M2 with a zero-voltage drop current source. This scheme provides ULV isolation between the feedback summation node and the output, which is necessary to achieve flat gain and good input matching.
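Returning to the self forward body biasing loop, the statement that the supply divides equally across the two substrate diodes can be checked with a short calculation. This is only a sketch: it assumes the two diodes drop identical voltages, and the loop current `I_SUB` is a hypothetical value standing in for "a few microamperes". $R_{sub}$ is the value listed in Table 7-2.

```python
def body_bias(v_dd, r_sub, i_sub):
    """Forward bias seen by each source-body diode in the self-bias loop.

    Assumes both diodes drop the same voltage, so each one sees half of the
    supply that remains after the I*R drop across R_sub.
    """
    return (v_dd - i_sub * r_sub) / 2.0


V_DD = 0.5     # supply voltage in volts
R_SUB = 1.1e3  # ohms, from Table 7-2
I_SUB = 3e-6   # amperes, hypothetical value for "a few microamperes"

if __name__ == "__main__":
    v_bs = body_bias(V_DD, R_SUB, I_SUB)
    loop_power_uw = V_DD * I_SUB * 1e6
    print(f"Body bias per device: {v_bs * 1e3:.1f} mV")
    print(f"Static power in the bias loop: {loop_power_uw:.2f} uW")
```

Under these assumptions the drop across $R_{sub}$ is only a few millivolts, so each body sits close to half the supply, and the loop power is small compared with the total consumption reported for the LNA.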

Figure 7-9: The schematic of the self forward biasing scheme with the equivalent circuit.

Ultra-Low Power Design Techniques

Active Shunt-Feedback

As discussed in section 7.2.1, active shunt-feedback facilitates ultra-low power input matching at the cost of a slight voltage gain and NF penalty. Hence, this technique plays a crucial role in the realization of the low power LNA.

Current-reuse

Another key technique in the implementation of ULP circuits is current reuse, as mentioned earlier in Section 4.2.2. In the proposed LNA, the input stage transistors M1 and M4 share DC current with the cascode device M2 and the feedback transistor M3, respectively. This leads to an improvement in the DC current efficiency of the LNA. It should be noted that M1 and M2 are biased in the middle of moderate inversion, while M3 and M4 are biased in the weak inversion region.

Employing the aforementioned design equations and techniques, a ULP current-reuse LNA was implemented. The values of the bias voltages are determined using the extended

biasing metric for ULP and ULV designs [22]. The component values and transistor sizes of the proposed LNA are provided in Table 7-2.

Table 7-2: Device Dimensions and Component Values

M1 | M2 | M3 | M4 | L1 | L2 | C1 | Rsub
50-µm/120-nm | 77-µm/120-nm | 40-µm/120-nm | 80-µm/120-nm | 12-nH | 12-nH | 4.2-pF | 1.1-kΩ

7.3 Measurement Results and Discussions

The proposed LNA is implemented in a 1P8M 0.13-µm IBM CMOS technology. Fig. 7-10 shows the die micrograph of the LNA. Spiral inductors are used to implement $L_1$, $L_2$ and $L_3$. A metal-insulator-metal (MIM) capacitor is used for $C_1$ due to its high area efficiency. As highlighted on the photo, the total area of the LNA excluding the pads is 600-µm × 660-µm. The chip is bonded onto a PCB (chip-on-board technology) to provide DC biasing, and the RF functionality is tested using a probe station. The chip-on-board approach provides more flexibility in controlling the length of the bond wires, which is used to reduce the associated parasitics. A common-source buffer (shown in Fig. 7-2) is used to drive the 50-Ω resistance of the measurement equipment, and its characteristics were de-embedded from the results.

The measured and post-layout simulation (PLS) results of S21 and S11 are plotted in Fig. 7-11 for two supply voltages ($V_{DD}$ = 0.5-V and $V_{DD}$ = 0.4-V). The LNA achieves an S21

Figure 7-10: The die micrograph of the LNA.

Figure 7-11: The post-layout simulation and measured results for S21 and S11 (in dB, versus frequency in GHz) of the LNA for two supply voltages.

of 14-dB with a 3-dB BW of GHz, for $V_{sub}$ = 0.25-V and $V_{DD}$ = 0.5-V. For the case of $V_{DD}$ = 0.4-V, the maximum S21 is 13-dB with a 3-dB BW of GHz. For both supply voltages, the S11 is better than −9-dB in the band of interest, and can be tuned by 6-dB using the body bias of M3. In all cases, there is a good agreement between the measured and post-layout simulation results. It should also be noted that the DAC that controls the body bias of M3 is implemented off-chip in this work for greater control.

The NF of the LNA is plotted in Fig. 7-12. As can be seen, for $V_{DD}$ = 0.5-V the minimum NF is 4-dB at 2.5-GHz. However, as discussed in section 7.2.5, the NF increases at low frequencies due to the low impedance of $L_1$ and $L_2$ and hence the higher noise contribution from M2. With $V_{DD}$ = 0.4-V the minimum NF is 4.5-dB at 2.5-GHz, and it increases at lower frequencies.

Figure 7-12: The post-layout simulation and measurement results for the NF (in dB, versus frequency in GHz) of the proposed LNA for two supply voltages.

The linearity of the LNA is mostly limited by the low voltage headroom available for the transistors. The input-output characteristics of the LNA are shown in Fig. 7-13. The measured IIP3 is −10-dBm at 2-GHz with a two-tone spacing of 100-MHz for $V_{DD}$ = 0.5-V. The 1-dB compression point is at −20-dBm. The IIP3 is improved compared to the simulation results since the voltage gain of the fabricated LNA is slightly lower than simulated. The

measured IIP3 for $V_{DD}$ = 0.4-V is degraded by 2-dB, to −12-dBm. The power consumption of the LNA operating from 0.5-V and 0.4-V supplies is 250-µW and 160-µW, respectively. As can be seen from the measured results, the supply voltage can be dropped to 0.4-V while maintaining good overall performance.

Figure 7-13: The measurement results for the input and output characteristics of the LNA (output power versus input power in dBm, measured and simulated fundamental and IM3 tones; IIP3 = −10-dBm).

The performance of the LNA is summarized in Table 7-3 and is compared with state-of-the-art works in the literature. A figure of merit (FoM) introduced in 5.8 [148] is used to compare the overall performance of the LNAs, and it is given by

$$FoM = 20\log_{10}\left(\frac{S21_{av}[\mathrm{lin}] \cdot BW[\mathrm{GHz}]}{P_{dc}[\mathrm{mW}] \cdot \left(F_{av}[\mathrm{lin}] - 1\right)}\right). \tag{7.24}$$

The proposed LNA achieves the best FoM among the compared works in the literature, and it also offers the lowest overall power consumption with ultra-low supply voltages of 0.4-V and 0.5-V. This is illustrated in Fig. 7-14, where the FoM of this LNA is compared to the FoM obtained from other works in the literature.

7.4 Summary

In this chapter, the use of forward body biasing to mitigate short channel effects with no additional power consumption was examined. This technique along with a complementary

current-reuse scheme, tunable forward body biased active shunt-feedback, inductive $g_m$-boosting, and a folded-cascode architecture were used to realize an ultra-low power and ultra-low voltage wideband LNA. The proposed LNA was fabricated in an IBM 0.13-µm 1P8M CMOS technology. The measured LNA has a 14-dB gain, 4-dB minimum noise figure, IIP3 of −10-dBm and GHz bandwidth, while consuming only 250-µW of power from a 0.5-V supply. The LNA can operate with supply voltages as low as 0.4-V while maintaining good performance. The chip area occupied by the LNA is 0.39-mm². To the best of the author's knowledge, the proposed LNA achieves the best FoM in the literature while consuming very low power from a 0.5-V supply voltage.

Figure 7-14: The FoM versus power consumption (mW) comparison of the proposed LNA (at $V_{DD}$ = 0.4-V and $V_{DD}$ = 0.5-V) with recent works from the literature.
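The figure of merit in (7.24), used for the comparison above, can be evaluated with a few lines of code. The sketch below is an illustration only: the gain, noise figure and power are the measured values quoted in the text for the 0.5-V supply, but (7.24) calls for average values whereas the text quotes the maximum gain and minimum NF, and `BW_GHZ_ASSUMED` is a hypothetical placeholder because the measured bandwidth is not reproduced here. The printed number therefore should not be read as the FoM reported for this LNA.

```python
import math


def lna_fom(s21_db, bw_ghz, p_dc_mw, nf_db):
    """Figure of merit following (7.24), with gain and NF supplied in dB."""
    s21_lin = 10.0 ** (s21_db / 20.0)  # voltage gain as a linear ratio
    f_lin = 10.0 ** (nf_db / 10.0)     # noise factor as a linear ratio
    return 20.0 * math.log10(s21_lin * bw_ghz / (p_dc_mw * (f_lin - 1.0)))


S21_DB = 14.0         # dB, maximum measured gain at VDD = 0.5 V
NF_DB = 4.0           # dB, minimum measured NF at VDD = 0.5 V
P_DC_MW = 0.25        # mW, measured power at VDD = 0.5 V
BW_GHZ_ASSUMED = 1.0  # GHz, hypothetical placeholder, not a measured value

if __name__ == "__main__":
    fom = lna_fom(S21_DB, BW_GHZ_ASSUMED, P_DC_MW, NF_DB)
    print(f"FoM with the assumed bandwidth: {fom:.2f}")
```

Because the bandwidth and the power enter as a ratio inside the logarithm, halving the power or doubling the bandwidth each raise the FoM by the same amount, about 6 units.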

Table 7-3: Performance summary and comparison with state-of-the-art LNAs

Parameter | This Work | This Work | [120] JSSC 2006 | [148] JSSC 2012 | [118] E. Lett | [123] RFIC 2013 | [22] TVLSI 2014 | [124] E. Lett | [121] E. Lett
3-dB BW (GHz) |
Power (mW) |
Supply (V) |
Gain max (dB) |
min NF (dB) |
IIP3 (dBm) |
Technology | 0.13-µm | 0.13-µm | 0.13-µm | 0.13-µm | 0.18-µm | 0.18-µm | 90-nm | 0.18-µm | 0.13-µm
Area (mm²) |
FoM |

*Estimated from the curves

CHAPTER 8
Ultra-Low Power Injection Locked Clock Recovery Scheme for a Chirp FSK UWB Receiver

8.1 Introduction

In this chapter, a low power chirp-FSK IR-UWB receiver is designed, implemented and fabricated. Also, a new synchronization scheme based on injection locking, suitable for low power, low data rate chirp-FSK IR-UWB receivers, is proposed. The new clock recovery scheme eliminates the need for complex and high power PLLs/DLLs and external crystal references. Furthermore, a novel injection-locking-based phase shifter follows the clock recovery to select the clock edge that achieves the maximum SNR in the receiver. The RF front-end of a chirp-FSK UWB receiver, which consists of two separate frequency channels, is implemented in a 0.13-µm CMOS technology and achieves dB tuned voltage gain in each band with a NF of 7-dB.

The organization of the chapter is as follows: first, the chirp-FSK signalling and the link budget for the receiver are discussed. Next, the top-level view of the proposed IR-UWB receiver is presented. Then, the circuit blocks employed in the receiver are presented in detail. Finally, the proposed clock recovery scheme for chirp-FSK IR-UWB receivers, along with the phase shifting mechanism, is presented.

8.2 Chirp-FSK IR-UWB Signalling

Chirp UWB frequency shift keying (C-FSK) uses ultra-wideband pulses with different center frequencies for data transmission. Chirp-FSK modulation can trade pulse amplitude for pulse width to achieve an evenly spread spectrum over a 500-MHz bandwidth. Hence, this modulation scheme is voltage scalable with CMOS technology. Fig. 8-1(a) shows the chirp-FSK signalling and compares it with the OOK, PPM and S-OOK modulation formats shown in Fig. 8-1(b), (c) and (d), respectively. As can be seen in the figure, the main

advantage of the C-FSK UWB modulation scheme over OOK modulation is that, unlike OOK, it takes advantage of differential signalling to increase the noise, multipath and interference immunity. Moreover, the main advantage of C-FSK over PPM, transmitted reference and S-OOK modulation is that, unlike these schemes and similar to OOK modulation, it does not require a high active time to transmit and receive pulses. Therefore, it lends itself to a highly power efficient transceiver through heavy duty cycling. Furthermore, chirp-FSK signalling provides an opportunity for a low complexity, low power synchronization scheme based on injection locking, which is discussed in detail in this chapter. Due to these advantages of the chirp-FSK modulation format over other IR-UWB modulation schemes, this signalling scheme is adopted in this work to realize a low power IR-UWB receiver.

8.3 Link Budget

One of the main differences between IR-UWB transceivers and conventional narrowband communication links is the fact that the data rate, $R_b$, and the bandwidth, $B$, of the transmitted signal are independent of each other [45]. This stems from the fact that in conventional narrowband radios the pulse period and the symbol duration are usually the same, whereas in IR-UWB receivers the pulse duration is a small fraction of the symbol period. Furthermore, while in IR-UWB links the maximum transmitted pulse energy is controlled by the FCC regulation, in conventional radios it is the maximum transmitted power that is enforced by the regulations. These differences have a strong impact on the link budget of IR-UWB receivers. For example, in impulse radios the FCC regulation limits the maximum allowable power spectral density (PSD), $PSD_{FCC}$, to −41.3-dBm/MHz in the GHz band, which introduces the following constraint

$$E_{pTX} \le \frac{PSD_{FCC} \cdot B}{R_b}, \tag{8.1}$$

where $R_b$ is the data rate of the receiver. In this design, with a bandwidth of 500-MHz and a data rate of 1-Mb/s, $E_{pTX}$ = −74.3-dBm (that is, −41.3-dBm/MHz + 10log(500) − 10log(10⁶) ≈ −74.3-dBm). On the other hand, the pulse energy $E_{bRX}$ at the receiver is related to $E_{pTX}$ by

$$E_{bRX} = \frac{E_{pTX}}{PathLoss}, \tag{8.2}$$

where $PathLoss$ is the free space path loss between the receiver and the transmitter. Considering a 3-meter distance between the transmitter and the receiver, the received pulse energy at the receiver is dBm.

Figure 8-1: The comparison of the chirp-FSK modulation scheme shown in (a) with other common modulation formats used in IR-UWB transceivers: (b) on-off keying modulation (OOK), (c) pulse position modulation (PPM), (d) synchronized-OOK (S-OOK).

The modulation scheme used in this work sets the required

noise figure of the receiver. Hence, the maximum allowable noise figure of the receiver can be found by

$$NF \le E_{bRX} - \frac{E_b}{N_0} + 174\text{-dB}. \tag{8.3}$$

The next step after finding the required NF of the receiver is to find the sensitivity of the receiver. As mentioned earlier in section 2.3.3, the sensitivity of IR-UWB receivers is data rate dependent and can be found by

$$S_{in} = -174 + 10\log B + NF + SNR, \qquad SNR = 10\log\frac{R_b}{B} + \frac{E_b}{N_0} \;\;\Rightarrow\;\; S_{in} = -174 + 10\log R_b + NF + \frac{E_b}{N_0}, \tag{8.4}$$

which is equivalent to −74-dBm, given the design parameters.

8.4 Chirp-FSK IR-UWB Receiver Design

The block diagram of the proposed receiver is shown in Fig. 8-2. It is composed of an RF front-end, an analog baseband, and a clock recovery section. The RF front-end is built around an ultra-low power LNA which also performs single-ended to differential conversion. To recover the chirp-FSK signal, two parallel RF paths, tuned at the center frequencies of the chirp-UWB pulses, are used. Two bandpass filters (BPF1 and BPF2) tuned at the center frequencies of the C-FSK pulses perform the required RF filtering and RF amplification with low power consumption. A squarer, a low frequency VGA and a comparator follow the RF bandpass filters. The squarer detects the energy of the received pulses in each frequency band, and the resulting output voltage is amplified by the VGA. Then, the output of the VGA is continuously sliced by the comparator. The threshold voltage of the comparator is set by the other channel, which increases the immunity of the receiver to noise and interference.

The clock recovery block is composed of a summer block which sums the signals coming from both frequency bands and injects the pulses into a ring oscillator. The pulses from both frequency bands are used for injection locking to maximize the bandwidth of the injection locking. The center frequency of the ring oscillator is controlled through varactors

to bring it close enough to the frequency of the incoming pulses. The output data of the receiver is sampled using a flip-flop clocked by the recovered clock.

Figure 8-2: The block diagram of the proposed chirp-FSK IR-UWB receiver with injection-locking-based clock recovery.

In the following subsections, the main building blocks of the receiver are described. First, the low noise amplifier is described in detail and the measurement results for a breakout are presented. Then, the RF bandpass filters and amplifiers are presented. The measurement results of the RF front-end of the receiver are presented next. The details of the squarer circuit and the baseband amplifier follow.

Low Noise Amplifier

This section describes the operating principle of the proposed noise cancelling LNA. The current-reuse noise cancelling structure is presented first. Then, a $g_m$-boosting technique for ULP and ULV LNA design is described.

Noise Cancelling LNA with Current-reuse

In the conventional noise cancelling LNA topology shown in Fig. 8-3(a) [95], a CG transistor provides wideband input matching while a common-source (CS) transistor creates a path for noise cancellation of the CG transistor. The wideband input stage of the noise cancelling LNA can be combined with a wideband load to cover a broad band or by a

frequency tunable pass band load to operate only at the desired frequency. The benefit of a pass band load is the ability of the LNA to filter adjacent noise or interferers. In addition, this type of noise cancelling LNA architecture realizes single-ended to differential conversion of the input, removing the need for an external balun. This reduces the cost and allows for an improved overall receiver noise figure.

Figure 8-3: The schematic of (a) the conventional noise cancelling LNA and (b) the proposed ULV, ULP current-reuse noise cancelling LNA architecture.

To decrease the current consumption of the noise cancelling LNA, this work proposes a current reuse scheme that is utilized along with a new noise cancelling LNA architecture, shown in Fig. 8-3(b). The basic idea of the current reuse technique is introduced in [122], and it is altered here for use in a noise cancelling architecture. The current reuse technique allows the CG and CS branches to share the same current. M1 is a PMOS transistor in a CG configuration that provides wideband 50-Ω input matching, while M2 is a CS NMOS transistor which creates a feed-forward path to cancel the drain current noise of M1. The two branches are AC decoupled from each other by capacitor C1 such that node $V_X$ is at signal ground. Therefore, only the DC currents are shared between the two transistors.

The loads of both the CG and CS branches are implemented using a parallel RLC resonant circuit. A parallel RL combination was employed for two main reasons: using inductors

eliminates the voltage drop across the load so that a ULV LNA can be realized, and using resistors reduces the Q factor of the inductors to increase the bandwidth of operation at the cost of an increased NF. Consequently, inductor L1 resonates with the parasitic capacitance at the drain of M1 with a Q-factor that is determined by resistor R1. The same applies to the load at the drain of M2. Inductor L3 is chosen to be sufficiently large to act as a current source at the input of M1. Using an inductor instead of a resistor or a transistor for the current source allows the use of a lower supply voltage. The output balancing criterion is

$$g_{m1} Z_1 = g_{m2} Z_2, \tag{8.5}$$

where the left side of (8.5) is the gain of the CG stage and the right side is the gain of the CS stage. $Z_1$ represents the total impedance at the drain of the CG transistor and $Z_2$ is the total impedance at the drain of the CS transistor. There are three important factors that need to be considered in the design of the $Z_1$ and $Z_2$ impedances. The first is the difference in the parasitic capacitance of the PMOS and NMOS transistors. The second is the difference in the parasitic capacitance of the CG and CS configurations. The Miller effect in CS architectures increases the parasitic capacitance and leads to some imbalance, especially at high frequencies. Therefore, these parasitic capacitances must be taken into account when sizing L1 and L2. Furthermore, there is a slight difference in the transconductance of M1 and M2, which is due to the different sizing and mobility of the devices. It should be noted that a higher transconductance for M2 is desirable since it leads to a lower NF. Consequently, $Z_1$ is slightly higher than $Z_2$ to satisfy the output balancing criterion. Additionally, $g_{m1}$ has to satisfy the input matching condition $R_S = 1/g_{m1}$.
The impact of these differences on effective noise cancellation is studied and simulated over the process corners, and is discussed in the simulation results section. It will be seen that, in spite of a slight mismatch, especially at high frequencies, the noise cancellation is effective.

The noise factor of this LNA is calculated assuming, for simplicity, that the transistors have infinite output impedance and that the gate resistance is negligible. It is given by

$$F = 1 + \frac{(\gamma/\alpha)\, g_{m1}\,(Z_1 - g_{m2} Z_2 R_S)^2}{R_S A_V^2} + \frac{(\gamma/\alpha)\, g_{m2} Z_2^2\,(1 + g_{m1} R_S)^2}{R_S A_V^2} + \frac{(R_1 + R_2)(1 + g_{m1} R_S)^2}{R_S A_V^2}, \tag{8.6}$$

where $A_V = (g_{m1} Z_1 + g_{m2} Z_2)$, $\gamma$ is the MOSFET noise parameter and $\alpha = g_m/g_{d0}$. The second term is the contribution of the PMOS transistor, the third term is due to the NMOS transistor, and the last term comes from the load resistors.

Inductive $g_m$-boosting Technique

Because of the ULP and ULV design restrictions, and because the transistors are biased in the moderate inversion region, attaining a high intrinsic $f_T$ is not possible in this work. To overcome this challenge, the $g_m$-boosting technique on the common-gate transistor, as discussed earlier, is employed. Furthermore, the common-source transistor requires the same type of $g_m$-boosting. Hence, the same approach is taken for the CS transistor in the feedforward path: by adding an inductor in the gate, $g_m$-boosting for the CS transistor can be achieved. Using an analysis similar to the one employed for the CG transistor, the $G_{m,eff}$ of the CS transistor and its simplified high frequency approximation are given by

$$G_{m,eff} = \frac{g_m}{1 + j\omega C_{gs} R_s - \omega^2 L_g C_{gs}} \approx \frac{\omega_T}{\omega}\,\frac{1}{jR_s - \omega L_g}, \tag{8.7}$$

which is similar to that of the $g_m$-boosted CG transistor. The proposed $g_m$-boosting technique also improves the input matching condition, especially at high frequencies. The input impedance of the proposed circuit is given by

$$Z_{in} = \frac{(1 + g_{o1} Z_1)(1 - \omega^2 L_{g1} C_{gs1})}{g_{m1} + j\omega C_{gs1}(1 + g_{o1} Z_1) + g_{o1}(1 - \omega^2 L_{g1} C_{gs1})} \;\Big\|\; \frac{1 - \omega^2 L_{g2} C_{gs2}}{j\omega C_{gs2}}, \tag{8.8}$$

where $g_{o1}$ is the output conductance of the M1 transistor.
This equation shows that the effect of the parasitic capacitances at the input is reduced by adding inductors L4 and L5.
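To illustrate how the balancing condition (8.5) interacts with the noise factor (8.6), the sketch below evaluates the three excess-noise terms of (8.6) separately. It is a simplified illustration under stated assumptions: the load impedances are treated as real (as at resonance), $R_1$ and $R_2$ are set equal to $Z_1$ and $Z_2$, $\gamma/\alpha$ is set to 1, and the values of `GM2` and `Z2` are hypothetical rather than taken from the design.

```python
def noise_factor_terms(gm1, gm2, z1, z2, r1, r2, r_s, gamma_over_alpha=1.0):
    """Return the CG, CS and load-resistor excess-noise terms of (8.6)."""
    a_v = gm1 * z1 + gm2 * z2
    norm = r_s * a_v ** 2
    cg = gamma_over_alpha * gm1 * (z1 - gm2 * z2 * r_s) ** 2 / norm
    cs = gamma_over_alpha * gm2 * z2 ** 2 * (1.0 + gm1 * r_s) ** 2 / norm
    load = (r1 + r2) * (1.0 + gm1 * r_s) ** 2 / norm
    return cg, cs, load


R_S = 50.0       # source resistance in ohms
GM1 = 1.0 / R_S  # input matching condition R_S = 1/gm1
GM2 = 30e-3      # siemens, hypothetical
Z2 = 200.0       # ohms, hypothetical, assumed real at resonance
Z1_BALANCED = GM2 * Z2 / GM1  # output balancing condition (8.5)

if __name__ == "__main__":
    for label, z1 in (("balanced", Z1_BALANCED), ("unbalanced", 0.8 * Z1_BALANCED)):
        cg, cs, load = noise_factor_terms(GM1, GM2, z1, Z2, z1, Z2, R_S)
        print(f"{label}: CG term = {cg:.4f}, CS term = {cs:.4f}, "
              f"load term = {load:.4f}, F = {1.0 + cg + cs + load:.4f}")
```

When the input is matched and (8.5) holds, $Z_1 = g_{m2} Z_2 R_S$ and the CG term vanishes, which is the cancellation mechanism described above. Any imbalance between the two branches leaves a residual CG contribution.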

Figure 8-4: The circuit schematic of the proposed ultra-low voltage, low power current-reuse LNA with $g_m$-boosting.

The complete circuit schematic of the proposed LNA is illustrated in Fig. 8-4. The input signal is applied to the source of M1 and the gate of M2, which are biased in the moderate inversion region. Inductors L1, L2 and L5 are on-chip spiral inductors, while L3 and L4 are implemented using bond wire inductances to reduce the area of the LNA. Capacitor C1 is an on-chip capacitor.

Simulation Results

The circuit is designed in a TSMC 90-nm CMOS technology using BSIM4 models, and simulations are carried out using SpectreRF. The input matching (S11), reverse isolation (S12), output matching (S22) and voltage gain (Av) of the LNA are plotted in Fig. 8-5. The maximum gain of the LNA is 15-dB and its 3-dB bandwidth extends from 3.2-GHz to 10-GHz. The S11 is well below −10-dB in this band thanks to the resonance at the input and the inductive $g_m$-boosting. The S22 is below −10-dB as well, and the S12 is less than −35-dB. The noise figure of the proposed LNA is shown in Fig. 8-6. The NF varies between 4.5-dB and 5.3-dB across the bandwidth. The LNA consumes only 410-µW from a 0.4-V supply voltage.

Figure 8-5: The simulated voltage gain, S11, S12 and S22 (in dB, versus frequency in GHz) of the proposed LNA.

Figure 8-6: The simulated noise figure (in dB, versus frequency in GHz) of the proposed LNA.

It is important to examine the performance of the noise cancelling LNA at different process corners. Fig. 8-7 illustrates the contribution of the drain noise current of the CG and CS transistors in the bandwidth of operation. As can be seen, the noise contribution of the CG transistor is very small (less than 1.5%) between 3-GHz and 5-GHz. However, it gradually increases at high frequencies. This is due to a phase imbalance between the two outputs, which increases at high frequencies and reduces the drain current noise cancellation.

Figure 8-7: The simulated noise contribution (%) of the common-source and common-gate transistors of the proposed LNA versus frequency (GHz) at the TT, FF, SS, SF and FS process corners.

Figure 8-8: The schematic of the fabricated noise cancelling LNA.

Measurement Results

The LNA used in the chirp-FSK IR-UWB receiver is a simplified version of the LNA described in the previous section, with L4 and L5 removed. Also, L1 and L2 are implemented as a differential inductor. The reason for these changes is that the operating frequency for this receiver is only between 3-GHz and 5-GHz and does not require the bandwidth enhancement

Figure 8-9: The die micrograph of the fabricated LNA (0.8-mm × 0.5-mm).

Figure 8-10: The measured S21 (in dB, versus frequency in GHz) and post-layout simulation of the proposed LNA for both outputs.

techniques. As a result, the LNA shown in Fig. 8-8 is designed and implemented in an IBM 0.13-µm CMOS technology. The principle of operation and the noise cancelling mechanism are exactly the same as explained in the previous section, except that the bandwidth enhancement techniques are not employed here. Fig. 8-9 shows the die micrograph of the LNA. The total area of the LNA is 0.8-mm × 0.5-mm. Fig. 8-10 shows the S21 measurement and the post-layout simulation results of the LNA. As can be seen, there is a good agreement between the single-ended outputs. Fig. 8-11 highlights the input matching of the LNA compared with the post-layout simulation results. Due to the process shift in the fabricated die, the achieved $g_m$ is not sufficient for the input matching, and the measured result deviates from the post-layout simulation. The LNA draws 1.2-mA of current from a 0.5-V supply voltage. Also, it should be noted that, due to a lack of equipment, it was not possible to measure the differential gain and NF of the LNA.

Figure 8-11: The measured S11 (in dB, versus frequency in GHz) and post-layout simulation of the proposed LNA.

RF Amplifier

As discussed in Chapter 2, the RF amplifier and gain stages are among the main power consuming blocks in the receiver. Therefore, the main focus of the design in this work is to reduce the power consumption of the RF amplifier without impacting its performance. Fig. 8-12 shows the schematic of the designed RF amplifier. As discussed in the previous

section, the LNA provides single-ended to differential conversion; hence, the RF amplifier has to be differential. The RF amplifier is composed of two main inverters with separate biases for the NMOS and PMOS transistors, which enables them to be biased at the biasing sweet spot for ULP and ULV operation described in section [22], and a cross-coupled inverter cell that generates a negative $g_m$ to boost the gain of the amplifier with low power consumption. The inverters in the negative-$g_m$ section are smaller than the main inverters, and the value of the negative $g_m$ is tunable through the bias voltages of the PMOS and NMOS transistors to compensate for process variation.

Figure 8-12: The schematic of the designed RF gain stage tuned for two different bands.

The use of negative $g_m$ to boost the gain of amplifiers is a well-known technique [162]. The impact of this gain enhancement technique can be described by first considering the case where only the main inverters exist. The single-ended gain of the amplifier is then given by

$$A_v = \frac{g_{m1} + g_{m2}}{g_{o1} + g_{o2}}, \qquad (8.9)$$
where $g_{o1,2}$ are the output conductances of M1 and M2. By adding the negative-$g_m$ stage at the output, the overall voltage gain of the amplifier becomes

$$A_v = \frac{g_{m1} + g_{m2}}{g_{o1} + g_{o2} - (g_{m7} + g_{m8})}. \qquad (8.10)$$

The output conductances of M7 and M8 are ignored due to the small current flowing through them. In theory, by making $(g_{m7} + g_{m8}) = (g_{o1} + g_{o2})$, an infinite gain can be achieved with low power consumption.

To tune the response to the center frequency of the C-FSK pulses, an inductor and a varactor are used. The center frequencies of these RF bandpass filters are tunable by 500-MHz around 3.5-GHz and 4.5-GHz using the varactors. The RF amplifier operates from a 0.5-V supply and consumes only 270-µW. Fig. 8-13 highlights the simulation results of the RF gain stage. As can be seen in the figure, a gain of about 10-dB can be achieved in the 4.5-GHz frequency band with low power consumption. The bandwidth of the amplifier is more than 700-MHz. Moreover, the frequency tuning through the varactors, which compensates for process variation, is highlighted in the figure.

Figure 8-13: The voltage gain simulation result of the designed RF gain stage tuned at the 4.5-GHz frequency band with a frequency tunability of 400-MHz.
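As a rough numerical illustration of (8.9) and (8.10), the following Python sketch evaluates the single-ended gain with and without the negative-$g_m$ cell. The function names and all conductance values are hypothetical and chosen only for illustration; they are not taken from the fabricated design.

```python
import math


def stage_gain(gm_main, go_main, gm_neg=0.0):
    """Single-ended voltage gain following (8.9) and (8.10).

    gm_main: g_m1 + g_m2 of the main inverter, in siemens
    go_main: g_o1 + g_o2 of the main inverter, in siemens
    gm_neg:  g_m7 + g_m8 of the cross-coupled cell (0 reproduces (8.9))
    """
    g_out = go_main - gm_neg
    if g_out <= 0.0:
        # The net output conductance is cancelled or negative: the simple
        # gain expression no longer applies (the stage would be unstable).
        raise ValueError("negative g_m must stay below g_o1 + g_o2")
    return gm_main / g_out


def to_db(gain):
    """Convert a voltage gain to decibels."""
    return 20.0 * math.log10(gain)


# Hypothetical values, for illustration only.
GM_MAIN = 2e-3    # assumed g_m1 + g_m2
GO_MAIN = 0.5e-3  # assumed g_o1 + g_o2

for gm_neg in (0.0, 0.3e-3, 0.45e-3):
    gain = stage_gain(GM_MAIN, GO_MAIN, gm_neg)
    print(f"gm_neg = {gm_neg * 1e3:.2f} mS -> gain = {to_db(gain):.1f} dB")
```

With these assumed numbers the gain grows steeply as the negative $g_m$ approaches $g_{o1} + g_{o2}$, which is consistent with the negative $g_m$ being made tunable through the bias voltages.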

Figure 8-14: The placement of measurement buffers after the RF front-end of the receiver.

RF Front-End Measurement Results

The front-end section of the receiver is measured separately to verify its performance and to compare the results with simulation data. Two common-source buffers are placed at the outputs of the RF amplifiers, before the squarer circuits, for measurement purposes. These buffers are only turned ON for measurement of the RF front-end. Fig. 8-14 shows the location of the measurement buffers in the receiver chain.

The RF front-end of the receiver, along with part of the digital back-end, is implemented in an IBM 0.13-µm CMOS technology. Fig. 8-15 shows the die micrograph of the receiver. The total area of the receiver is 1.1-mm × 1.4-mm. The RF front-end operates from a 0.5-V supply and consumes only 1.75-mW of static power.

Figure 8-15: The die micrograph of the receiver.

The measured S11 of the receiver is plotted in Fig. 8-16 and is compared with the post-layout simulation results. The measured S11 is slightly degraded and is higher than -10-dB due to the process shift of the dies to the slow-slow corner, as discussed earlier in the LNA section.

Figure 8-16: The S11 measurement and post-layout simulation results of the receiver.

The S21 measurement and post-layout simulation results are illustrated in Fig. 8-17 for the two channels. As shown in the figure, there is good agreement between the simulation and measurement results. The center frequencies of the two channels are 3.5-GHz and 4.35-GHz. The gain of the RF front-end of the receiver is tunable by more than 10-dB through the negative-$g_m$ stage. Cross-talk between the two channels is visible in the plot and can be attributed to the coupling between the inductors of the RF amplifiers in the two channels.

Figure 8-17: The S21 measurement and post-layout simulation results of the receiver.

The S22 and S12 measurement results are highlighted in Fig. 8-18. The S22 of the buffers is around -10-dB, as predicted by the simulation results. The S12 of the receiver is better than -40-dB over the whole band. The NF simulation results of the RF front-end of the receiver are plotted in Fig. 8-19. Unfortunately, due to a lack of equipment, it was not possible to measure the differential NF of the RF front-end. However, as can be seen, the minimum simulated NF is 5-dB in the 3.5-GHz band and 6.5-dB in the 4.5-GHz band.

Squarer

This section presents the theory of operation of the proposed squarer and describes the circuit, which includes a capacitor cross-coupling circuit as a $g_m$-boosting stage.

Theory of squarer operation

The squaring operation proposed in this work uses the intrinsic MOS transistor characteristics to square the input signal. To explain how the MOS transistor characteristics can be used to implement a squaring operation, consider an NMOS transistor in a common-source configuration. Its small-signal drain current can be expanded into the following Taylor series in terms of the small-signal gate-source voltage, $v_{gs}$, around the bias point, where the drain
current dependence on the drain-source voltage has been neglected to simplify the analysis:

$$i_{ds} = g_m v_{gs} + g'_m v_{gs}^2 + g''_m v_{gs}^3, \qquad (8.11)$$

where $g_m$ is the transconductance of the transistor, and $g'_m$ and $g''_m$ denote its first- and second-order derivatives with respect to the gate-source voltage, respectively.

Figure 8-18: The S22 and S12 measurement results of the receiver.

Figure 8-19: The NF simulation results of the RF front-end of the receiver.

A squarer circuit can be realized by cancelling the $g_m$ and $g''_m$ terms in (8.11), and this can be accomplished using the circuit structure shown in Fig. 8-20. The drain currents of the transistors can be
described by the following equations:

$$i_{ds1} = g_{m1} v_{in} + g'_{m1} v_{in}^2 + g''_{m1} v_{in}^3, \qquad (8.12)$$

$$i_{ds2} = g_{m1} (-v_{in}) + g'_{m1} (-v_{in})^2 + g''_{m1} (-v_{in})^3. \qquad (8.13)$$

Assuming M1 and M2 are identical, the output current can be written as

$$i_{out} = i_{ds1} + i_{ds2} = 2 g'_m v_{in}^2. \qquad (8.14)$$

Figure 8-20: The structure utilized for cancelling odd-order terms in the Taylor series of the drain current of two MOSFETs.

This equation shows that the output current ideally depends only on the square of the input voltage and on $g'_m$. To maximize the gain of the squarer, $g'_m$ has to be maximized. Since $g'_m$ is a function of $V_{GS}$, the maximum value can be found by plotting $g'_m$ versus $V_{GS}$. The $g_m$ and $g'_m$ curves as a function of the gate-source voltage are depicted in Fig. 8-21. The curves are extracted using a TSMC 90-nm CMOS technology.

Conventionally, transistors are biased in the saturation region to get a high squaring gain from analog multipliers [128]. However, a closer look at the curves in Fig. 8-21 shows that there is a peak in $g'_m$ in the moderate inversion region that would also lead to a high squaring gain. This peaking is caused by the change in the drain current flow mechanism from diffusion to drift, which occurs as the transistor bias changes from weak inversion to strong inversion. The diffusion current can be described by an exponential function, while the drift current has a slightly less than square-law behaviour. Over a narrow bias range, the transistor operates in the moderate inversion region, and the current follows
a nearly ideal square-law behaviour [145]. In this work, the transistors are biased in the moderate inversion region at the point where $g'_m$ is maximized. This not only allows the circuit to achieve a very high squaring gain, but also keeps the current consumption very low, due to the operation in moderate inversion.

Figure 8-21: The $g_m$ and its derivative with respect to the gate-source voltage. The NMOS width and length are W = 40-µm and L = 100-µm, respectively, and Vds = 250-mV.

To gain more insight into the design procedure of the squarer, the short-channel model for the drain current of an NMOS transistor (neglecting channel length modulation for simplicity) is used to find $g'_m$ [145]:

$$i_{ds} = \frac{\mu_0 C_{ox} W}{2nL} \frac{X^2}{1 + \alpha X}, \qquad (8.15)$$

$$X = 2n\phi_t \ln\left(1 + e^{\frac{V_{gs} - V_{TH}}{2n\phi_t}}\right), \qquad (8.16)$$

where $C_{ox}$ is the gate oxide capacitance per unit area, $\mu_0$ is the mobility factor, W and L are the width and length of the device, respectively, $\phi_t = kT/q$ is the thermal voltage, $V_{TH}$ is the threshold voltage, and $V_{gs}$ is the gate-source voltage. The parameter n is the sub-threshold factor and describes the rate of the exponential increase of the drain current with $V_{gs}$ in the sub-threshold region. The parameter $\alpha$ models both the velocity saturation
and the mobility degradation effects, and is described by the equation

$$\alpha = \theta + \frac{\mu_0}{2 n v_{sat} L}, \qquad (8.17)$$

where $\theta$ is the mobility-reduction coefficient and $v_{sat}$ is the saturation velocity. Typical values for $\alpha$ are 23 V$^{-1}$ or higher in modern deep sub-micron devices, depending on the channel length and velocity saturation effects. Using the above model, $g_m$ and $g'_m$ can be found as follows:

$$g_m = \frac{K\sigma}{1 + \sigma} \, \frac{2X + \alpha X^2}{(1 + \alpha X)^2}, \qquad (8.18)$$

$$g'_m = \frac{K\sigma}{(1 + \alpha X)^2 (1 + \sigma)} \left( \frac{1}{2n\phi_t} \, \frac{2X + \alpha X^2}{1 + \sigma} + \frac{2\sigma}{(1 + \sigma)(1 + \alpha X)} \right), \qquad (8.19)$$

where

$$K = \frac{\mu_0 C_{ox} W}{2nL}, \qquad (8.20)$$

$$\sigma = e^{\frac{V_{GS} - V_{TH}}{2n\phi_t}}. \qquad (8.21)$$

Using ( ), it can be shown that at a fixed biasing point where $g'_m$ is at its peak, it is proportional to

$$g'_m \propto \frac{\mu_0 C_{ox} W}{2nL}. \qquad (8.22)$$

Therefore, the peak value of $g'_m$ can be increased by using wider devices, at the cost of increased power consumption.

Capacitor cross-coupling for $g_m$-boosting

As shown in (8.19), one possible way of increasing the peak value of $g'_m$ is to increase the width of the input transistors; however, this approach would lead to higher power consumption. To increase the peak value of $g'_m$ without additional power consumption, a capacitor cross-coupling scheme is introduced as a $g_m$-boosting technique. Fig. 8-22 illustrates the principle of this boosting scheme, which is accomplished by introducing an inverting gain stage, A, between the gate and source terminals of the transistors so that
$g_{meff} = (1 + A) g_m$ [163]. As the transconductance increases, the slope of the transconductance, $g'_m$, also increases, giving rise to a higher conversion gain in the squarer. Using capacitors to realize the inverting gain stage leads to an increase in the conversion gain without increasing the power consumption.

Figure 8-22: Basic gain stage with $g_m$-boosting amplifier.

Circuit schematic

The complete circuit schematic of the proposed squarer is illustrated in Fig. 8-23. The input signal is applied differentially to transistors M1 and M2, which are biased in the subthreshold region, at the point where the derivative of $g_m$ is at its peak. The cross-coupled capacitors, Cs, are used to create the negative gain stage by applying the input signal with opposite polarity to the source of each transistor. These capacitors and the inductors Ls are selected such that they resonate with each other at the centre of the UWB frequency band to minimize signal losses. A PMOS transistor (M3) is used as a current source to bias the circuit and as an active load to provide a high load resistance while consuming little voltage headroom. The bias voltage for the current source is provided by a constant-$g_m$ reference to account for process variations. The conversion gain of the circuit is found by

$$A = 2 g'_m v_{in}^2 R_{out}, \qquad (8.23)$$

where

$$R_{out} = r_{o1} \parallel r_{o2} \parallel r_{o3}, \qquad (8.24)$$

and $r_o$ is the output resistance of the transistor.
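The cancellation of the odd-order terms and the resulting output of the squarer can be checked numerically with the short Python sketch below. It uses the three-term expansion in the notation of (8.11) and the parallel output resistance of (8.24). The function names, the expansion coefficients and the resistance values are hypothetical and serve only as an illustration; they are not extracted from the design.

```python
def drain_current(v, gm, gm_p, gm_pp):
    """Small-signal drain current using the three-term expansion of (8.11).

    gm, gm_p and gm_pp stand for g_m, g'_m and g''_m as written in the text.
    """
    return gm * v + gm_p * v**2 + gm_pp * v**3


def squarer_output_current(v_in, gm, gm_p, gm_pp):
    """Sum of the drain currents of two identical devices driven by +v_in and -v_in."""
    i_ds1 = drain_current(v_in, gm, gm_p, gm_pp)
    i_ds2 = drain_current(-v_in, gm, gm_p, gm_pp)
    return i_ds1 + i_ds2  # odd-order terms cancel, leaving 2 * g'_m * v_in^2


def parallel(*resistances):
    """Equivalent resistance of resistors in parallel, as in (8.24)."""
    return 1.0 / sum(1.0 / r for r in resistances)


def squarer_output_voltage(v_in, gm, gm_p, gm_pp, r_o1, r_o2, r_o3):
    """Output voltage obtained by driving R_out with the squared current, as in (8.23)."""
    return squarer_output_current(v_in, gm, gm_p, gm_pp) * parallel(r_o1, r_o2, r_o3)


# Hypothetical coefficients: g_m = 1 mA/V, g'_m = 5 mA/V^2, g''_m = 20 mA/V^3.
i_out = squarer_output_current(0.01, 1e-3, 5e-3, 2e-2)
print(f"i_out = {i_out:.3e} A, 2*g'm*v^2 = {2 * 5e-3 * 0.01**2:.3e} A")
```

Changing the sign of the input leaves the output unchanged, which is the behaviour expected from a squarer.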

Figure 8-23: The complete schematic of the proposed squarer (the bias circuit is not shown).

Simulation Results

The circuit is designed in a TSMC 90-nm CMOS technology using BSIM4 models. The transistors are sized and biased to obtain the maximum gain while consuming very little power. The inductors are designed to be less than 1-nH, and they are implemented using the bond wire models in the design kit. The cross-coupled capacitors are implemented using MIM capacitors.

The input and output waveforms of the squarer are shown in Fig. 8-24. The input is the fourth-order derivative of a Gaussian pulse, which is a common waveform generated in UWB transmitters, and it has a 10-dB bandwidth of 7.2-GHz centered at 7-GHz. The output bandwidth of the squarer is limited by a pole around 1.5-GHz, which performs part of the integration function needed for energy detection.

The squarer gain is calculated using the RMS values of the input and output voltages (as opposed to the peak values), since this best indicates the output of the integrator. Fig. 8-25 plots the gain with respect to the input voltage. It can be observed that the squarer circuit is linear up to input levels of 80-mV, after which saturation occurs. The maximum conversion gain of the circuit is 16-dB for a 100-mV input, and the power consumption is only 160-µW from a 0.5-V supply.
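The RMS-based gain definition can be illustrated with an ideal square-law device driven by a fourth-order derivative of a Gaussian pulse. The Python sketch below is only an illustration under stated assumptions: the pulse width `sigma`, the number of samples and the square-law constant `k` are hypothetical, and the model ignores the output pole and the saturation of the real circuit.

```python
import math


def gaussian_4th_derivative(t, sigma):
    """Fourth derivative of exp(-t^2 / (2 sigma^2)), scaled to a peak of 1 at t = 0."""
    x = t / sigma
    return (x**4 - 6.0 * x**2 + 3.0) * math.exp(-0.5 * x * x) / 3.0


def make_pulse(amplitude, sigma=50e-12, span=6.0, n=2001):
    """Sample the pulse on [-span*sigma, +span*sigma] (sigma and n are assumed values)."""
    step = 2.0 * span * sigma / (n - 1)
    return [amplitude * gaussian_4th_derivative(-span * sigma + i * step, sigma)
            for i in range(n)]


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ideal_squarer(v_in, k):
    """Ideal square-law device, v_out = k * v_in^2, with k in 1/V (hypothetical)."""
    return [k * v * v for v in v_in]


def rms_gain_db(v_in, v_out):
    """Gain computed from RMS values rather than from peak values."""
    return 20.0 * math.log10(rms(v_out) / rms(v_in))


for amplitude in (0.010, 0.020, 0.040):
    v_in = make_pulse(amplitude)
    v_out = ideal_squarer(v_in, k=50.0)
    print(f"{amplitude * 1e3:.0f} mV peak -> RMS gain {rms_gain_db(v_in, v_out):.2f} dB")
```

For such an ideal device the RMS gain increases by about 6-dB each time the input amplitude doubles, so a departure from this trend at large inputs indicates saturation.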

Figure 8-24: (a) The fourth-order derivative of a Gaussian pulse used as an input pulse and (b) the output waveform of the squarer.

Figure 8-25: The RMS gain of the squarer vs. the input signal amplitude.

Since the magnitude of $g'_m$ is bias dependent, it is important to analyse the sensitivity of the squarer to the bias voltage of M1 and M2. Fig. 8-26 illustrates the results for an input signal of 10-mV. It can be seen that a bias voltage variation of 10% causes a gain variation of less than 1-dB. Monte Carlo simulation was used to verify the performance of the circuit against process variations and mismatch. The results, presented in Fig. 8-27, demonstrate that
the gain has a mean value of 15.5-dB and a standard deviation of 0.92-dB. These results indicate that the performance of this circuit will be acceptable when fabricated. The sensitivity of the squarer performance to the length of the bond wires was also examined by varying their length by 10%, which resulted in a small gain variation of only ±0.15-dB RMS.

Figure 8-26: Sensitivity of the squarer gain to the biasing voltage of the input transistors for an input signal of 10-mV.

While there are no specific metrics for comparing the performance of squarers, it is clear that gain, power consumption, and bandwidth are among the most important parameters. Table 8-1 presents a comparison of this work with squarer circuits in the literature. It can be seen that the proposed circuit consumes much less power than the other works, while simultaneously achieving a higher gain. Naturally, the passive squarer has lower power consumption, but this comes at the price of high signal losses.

8.5 Demodulation

In this work, a continuous-time slicing scheme [38] is used to enable the use of a simple and low power injection locking clock recovery scheme. The use of the continuous-time slicing technique in the demodulator decouples the demodulation from the synchronization and reduces the time required for synchronization. Hence, the timing information can be
extracted from the return-to-zero (RZ) baseband signals.

Figure 8-27: Monte Carlo simulation results for the RMS gain.

Table 8-1: Performance summary of the squarer and comparison with prior published works.
Columns: Parameter; This Work; JSSC 2010 [42]; ISCAS 2007 [126]; ICUWB 2010 [130]
RMS Gain (dB): 100-mV 10-mV NA NA NA
Peak Voltage Gain (dB): 10-mV input NA 150-mV input
Freq. Band: GHz NA GHz NA
Squarer Type: Passive; Active/Saturation; Active/Subthreshold; Active/Subthreshold
Supply (V): 0.5; NA; 1.8; NA
Technology: 90-nm CMOS; 90-nm CMOS; 180-nm CMOS; 130-nm CMOS
Power: 160-µW mW 1.5-mW

Fig. 8-28 shows the principle of the continuous-time slicing scheme. As can be seen in the figure, for the case of OOK modulation a threshold voltage is required in the comparator. The continuous time slicing
technique produces digital pulses from the received analog signal and preserves the timing information in the rising edges of the pulses.

Figure 8-28: The block diagram of the continuous time slicing scheme.

The demodulation scheme employed in this work is similar to the work presented in [38]. However, as shown in Fig. 8-2, the main difference is that a threshold voltage is not required here. The need for a separate threshold voltage is eliminated by using the output of each channel as the threshold for the other channel; hence, automatic threshold generation is achieved. With this scheme, the real characteristics of the channel are used for threshold generation, which reduces the sensitivity to interferers and noise in the channel.

8.6 Injection Locking Based Synchronization Scheme

As discussed in section 2.4, pulse-level synchronization is one of the challenges in the design of IR-UWB receivers. Determining the exact arrival time of UWB pulses is extremely hard, mostly due to the short duration of the transmitted impulses (which are on the order of nanoseconds). These challenges usually force the receiver to rely on an accurate off-chip crystal oscillator and a phase-locked loop (PLL) [47, 71] or a delay-locked loop (DLL) [41, 42] to generate accurate phases, which increases the cost, complexity and power consumption of the receiver.

Injection locking has become popular in many applications, such as clock distribution, clock recovery and frequency generation [50, 84, ]. It is considered a low power and low complexity alternative to PLLs in the clock recovery section, since the need for phase frequency
detector, divider and loop filter is eliminated. Most importantly, a crystal oscillator is not required as a reference in injection locking based clock recovery schemes, which further reduces the cost, area and power consumption. Also, in wireless sensor network applications, where stringent jitter requirements do not exist, a low power, low area injection-locked ring oscillator can deliver the required jitter performance.

Figure 8-29: The block diagram of the proposed low power clock recovery scheme based on injection locking.

In this section, a new ultra-low power injection locking based clock recovery and synchronization scheme is proposed for chirp-FSK IR-UWB receivers. The proposed clock recovery scheme generates pulses proportional to the random chirp-UWB data and injects them into the ring oscillator.

Clock Recovery Architecture and Circuits

The detailed block diagram of the clock recovery section is illustrated in Fig. 8-29. As can be seen in the figure, the clock recovery section is fed by the demodulated input bit streams from both frequency bands. The two bit streams are summed together to maximize the effective number of pulses used for injection. The next block is a pulse width adjustment, which increases the width of the received pulses to enable reliable injection locking. Fig. 8-30 highlights the details of the circuitry.
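The behaviour of the pulse width adjustment can be sketched on a sampled binary waveform, assuming the delay-and-OR structure of Fig. 8-30. The following Python functions are a simplified behavioural model with hypothetical names; the delay is expressed in samples, and no circuit-level effect (gate delay, finite edge rate) is modelled.

```python
def extend_pulse_width(samples, delay):
    """OR a binary waveform with a copy of itself delayed by `delay` samples."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    extended = []
    for i, s in enumerate(samples):
        # Output of the delay cell; assumed low before the first sample arrives.
        delayed = samples[i - delay] if i >= delay else 0
        extended.append(1 if (s or delayed) else 0)
    return extended


def pulse_widths(samples):
    """Lengths, in samples, of each run of consecutive ones."""
    widths, run = [], 0
    for s in samples:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


sliced = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]    # a 3-sample sliced pulse
print(extend_pulse_width(sliced, delay=2))  # one pulse, widened by the delay
print(extend_pulse_width(sliced, delay=4))  # delay too long: the pulse splits
```

In this model the rising edge of the output coincides with that of the input, so the timing information is preserved, and the delay has to stay at or below the input pulse width for the output to remain a single pulse.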

The pulse width adjustment block is composed of a delay cell and an OR gate. The input pulses and the extended-width output pulses are also illustrated in Fig. 8-30. The adjusted pulse is then injected into a ring oscillator.

Figure 8-30: The detailed block diagram of the pulse width adjustment block along with the input and output pulses.

The time-domain signals at different points of the receiver and the clock recovery block are illustrated in Fig. 8-31. As can be seen in the figure, the high frequency UWB pulse at $V_a$ is demodulated using the squarer. The signal $V_b$ is the time-domain signal after the squarer and the baseband gain stages. The continuous-time slicing demodulator allows for asynchronous detection, and the received pulse is sliced after amplification. Then, the width of the sliced pulse is increased to prepare it for reliable injection locking (signal $V_c$).

The center frequency of the ring oscillator is controlled through a varactor bank, which provides a reliable and low noise frequency/phase control mechanism. Fig. 8-32 illustrates the ring oscillator and the injection mechanism. A switch is used to short one of the ring stages to ground at the injection time. This method was found to be the optimum technique for this type of injection locking.

Injection Locking

The injection locking mechanism is similar to a first-order PLL, which has a finite bandwidth and reduces the phase noise of the VCO within the locking range. In this section
the phase noise and locking range of the proposed injection locking based clock recovery scheme are discussed.

Figure 8-31: The time domain signals at multiple points in the receiver.

Figure 8-32: The block diagram of the injection locking to the ring oscillator.

Phase Noise

The theory of injection locking used in this work is similar to the sub-harmonic injection locking proposed in [164]. Even though the injection frequency is the same as that of the VCO, the duty cycle of the injection pulse is less than 50%, so it is considered
to be a sub-harmonic injection. However, unlike the work in [84], the sub-harmonic injection locking ratio is not large. The sub-harmonic injection locking ratio in [84] is given by

$$N = \frac{1}{\alpha \, \beta \, DR_{inj} \, W_{pulse}}, \qquad (8.25)$$

where $\alpha$ is the probability of the transmitted data being 1, $\beta$ is the roll-off coefficient due to pulse shaping, $DR_{inj}$ is the data rate and $W_{pulse}$ is the width of the transmitted pulse. As can be seen, N is data rate dependent and increases at low data rates, which makes this scheme unsuitable for low data rate applications. With the proposed architecture, however, N becomes

$$N_{proposed} = \frac{1}{\beta \, W_{pulse} / T_{osc}}. \qquad (8.26)$$

In this work, $\alpha$, the probability of the transmitted data being 1, does not impact the injection locking ratio because the pulses coming from both frequency channels are used for injection locking.

It is well known that the phase noise of a free running VCO is shaped by the injection locking [164, 170]. Under fundamental injection locking, it can be shown that the phase noise of the VCO is suppressed and tracks that of the injected reference [170] within the locking range, $\omega_L$. In a sub-harmonic injection, the phase noise also tracks the injected clock, but with an offset that depends on the sub-harmonic injection ratio. Hence, the phase noise of the sub-harmonically injected VCO is given by $L_{inj}(\omega) + 20\log_{10} N$, where $L_{inj}(\omega)$ denotes the phase noise of the injection clock.

Fig. 8-33 shows the phase noise prediction of an injection locked VCO [164]. As can be seen in the figure, the phase noise response can be divided into three regions depending on the offset frequency. The first region is for offset frequencies smaller than the locking range, $\omega_L$; as discussed before, the phase noise of the VCO is suppressed by the injection clock in this region.
The second region is for offset frequencies between the locking range and the injection clock frequency. In this region, there is a competition between the injection locking and the free running VCO. Finally, the third region is for offset frequencies larger than the injection
clock frequency. In this region, the phase noise tracks the phase noise of a free running VCO.

Figure 8-33: The prediction of the phase noise of an injection locked VCO.

Locking Range

The injection locking mechanism is similar to a first-order PLL, which has a finite bandwidth and reduces the phase noise of the VCO within the locking range. The locking range is an important parameter of an injection locked VCO. The locking range, $\omega_L$, degrades as N increases. The lock range of full-rate injection locking is given by [164, 170]

$$\omega_L = \frac{\omega_{out}}{2Q} \, \frac{I_{inj}}{I_{osc}} \, \frac{1}{\sqrt{1 - \frac{I_{inj}^2}{I_{osc}^2}}}, \qquad (8.27)$$

where $I_{inj}$ and $I_{osc}$ are the averages of the large signal injection and oscillation currents, respectively, and Q is the quality factor of the tank. In sub-harmonic injection, the effective injection current is N times lower than $I_{inj}$, since injection occurs every N cycles. Consequently, the locking range becomes

$$\omega_L = \frac{\omega_{out}}{2Q} \, \frac{I_{inj}}{I_{osc}} \, \frac{1}{N} \, \frac{1}{\sqrt{1 - \frac{I_{inj}^2}{I_{osc}^2 N^2}}} \approx \frac{\omega_{out}}{2Q} \, \frac{I_{inj}}{I_{osc}} \, \frac{1}{N}. \qquad (8.28)$$

In sub-harmonic injection, the locking range, $\omega_L$, is thus reduced by a factor of N. Therefore, in this design where N is reduced by using injection pulses from both frequency