Lecture 13 CMOS Power Dissipation

EE 471: Transport Phenomena in Solid State Devices, Spring 2018
Lecture 13: CMOS Power Dissipation
Bryan Ackland, Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030
Adapted from Digital Integrated Circuits: A Design Perspective, Rabaey et al., 2003, and Lecture Notes, David Money Harris, CMOS VLSI Design

CMOS: a Low-Power Technology
- CMOS was developed in the 1970s as a low-power technology
  - (almost) no DC current when a gate is not switching
  - no static power dissipation
- CMOS replaced NMOS in the 1980s as the dominant digital technology
  - NMOS designs dissipated about 200 µW/gate
  - Power dissipation no longer an issue!
- CMOS process technology evolved to provide:
  - more transistors per chip (Moore's Law)
  - faster switching speed (a few MHz → hundreds of MHz)
- 1992: DEC announces the Alpha 64-bit microprocessor
  - triumph of high-speed CMOS digital design
  - first 200 MHz processor, 1.7M transistors
  - 30 W power dissipation
  - Power dissipation is once again an issue!

Why Power Matters: Package & System Cooling
- Need to remove heat from high-performance chips
  - max. operating temperature of silicon transistors: 150 to 200 °C
- A chip on a PC board can dissipate 2 to 3 watts
- With a suitable heatsink, maybe 10 watts
- With forced-air cooling (fans), up to 150 W
- With sophisticated liquid cooling, maybe 1000 W

Why Power Matters: Battery Size & Weight
- Today, we see more hand-held, battery-operated devices
- Unlike CMOS technology, battery technology has seen only modest improvements over the last few decades
- Expected battery lifetime increase over the next 5 years: 30 to 40%
[Figure: Mobile Computing Environment. Paradiso et al., Pervasive Computing, IEEE, 2005]

Why Power Matters: Power Distribution
- Power supply and ground design
  - If $V_{DD}$ = 1.0 V, a 100 W chip draws 100 amps!
  - Many package pins required
    - Virtex-6 1924-pin package: 220 power and 484 GND pins
  - On-chip wiring to distribute this current
    - Electro-migration issues
  - On-chip noise and system reliability
    - Large currents switched through package and PCB inductance
- Environmental concerns
  - Computers and consumer electronics account for 15% of residential energy consumption

Back to Basics: Power & Energy
Power is drawn from a voltage source attached to the $V_{DD}$ and GND pins of a chip.
- Instantaneous power (watts): $P(t) = I(t)\,V(t)$
- Energy (joules): $E = \int_0^T P(t)\,dt$
- Average power: $P_{avg} = \frac{E}{T} = \frac{1}{T}\int_0^T P(t)\,dt$
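The two integrals above can be approximated from sampled waveforms. The sketch below is only an illustration: the function names and the ramp waveform (constant 1.0 V supply, current rising linearly from 0 to 2 mA over 1 ns) are assumptions, not values from the lecture.

```python
def energy_joules(times, currents, voltages):
    """Trapezoidal estimate of E = integral of I(t) * V(t) dt over the samples."""
    energy = 0.0
    for k in range(1, len(times)):
        p_prev = currents[k - 1] * voltages[k - 1]
        p_curr = currents[k] * voltages[k]
        energy += 0.5 * (p_prev + p_curr) * (times[k] - times[k - 1])
    return energy


def average_power_watts(times, currents, voltages):
    """P_avg = E / T, with T taken as the span of the sampled window."""
    return energy_joules(times, currents, voltages) / (times[-1] - times[0])


# Hypothetical waveform (an assumption for illustration only)
N = 1000
T = 1e-9
times = [T * k / N for k in range(N + 1)]
currents = [2e-3 * t / T for t in times]   # linear ramp, 0 to 2 mA
voltages = [1.0] * (N + 1)                 # constant supply voltage

E = energy_joules(times, currents, voltages)
P_avg = average_power_watts(times, currents, voltages)
print(f"E = {E:.3e} J, P_avg = {P_avg:.3e} W")
```

For this ramp the average current is 1 mA, so the average power comes out at about 1 mW and the energy over 1 ns at about 1 pJ.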

Back to Basics: Power in Circuit Elements
- Power supply: $P_{VDD}(t) = I_{DD}(t)\,V_{DD}$
- Resistor: $P_R(t) = \frac{V_R^2(t)}{R} = I_R^2(t)\,R$
- Capacitor: capacitors don't dissipate power, but they do store energy:
  $E_C = \int_0^\infty I(t)\,V(t)\,dt = \int_0^\infty C\,\frac{dV}{dt}\,V(t)\,dt = C\int_0^{V_C} V\,dV = \frac{1}{2} C V_C^2$
[Figure: source $V_C$ charging capacitor C through resistor R from t = 0; capacitor voltage V(t)]

Power Dissipation in CMOS
$P_{total} = P_{dynamic} + P_{static}$
- Dynamic power: $P_{dynamic} = P_{switching} + P_{shortcircuit}$
  - Switching load capacitances
  - Short-circuit current
- Static power: $P_{static} = (I_{sub} + I_{gate} + I_{junct} + I_{contention})\,V_{DD}$
  - Subthreshold leakage
  - Gate leakage
  - Junction leakage
  - Contention current

Dynamic Power: Charging a Capacitor
- When the gate output rises from GND to $V_{DD}$:
  - Energy stored in the capacitor is $E_C = \frac{1}{2} C_L V_{DD}^2$
  - But energy drawn from the supply is
    $E_{VDD} = \int_0^\infty I(t)\,V_{DD}\,dt = \int_0^\infty C_L\,\frac{dV}{dt}\,V_{DD}\,dt = C_L V_{DD}\int_0^{V_{DD}} dV = C_L V_{DD}^2$
  - Half the energy from $V_{DD}$ is dissipated in the pMOS transistor as heat, the other half is stored in the capacitor
- When the gate output falls from $V_{DD}$ to GND:
  - Stored energy in the capacitor is dumped to GND
  - Dissipated as heat in the nMOS transistor
- Independent of the size of the transistors!
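The claim that the energy split is independent of transistor size can be checked numerically. The sketch below makes a strong simplifying assumption: the pMOS pull-up is modeled as a fixed linear resistor (a real transistor is nonlinear). The resistor values, step count and function name are hypothetical.

```python
def charge_through_resistor(r_ohms, c_farads=150e-15, vdd=1.0,
                            steps_per_tau=1000, n_tau=20):
    """Forward-Euler simulation of charging C from 0 V to vdd through r_ohms.

    Returns (energy drawn from supply, energy dissipated in R, energy stored in C).
    """
    tau = r_ohms * c_farads
    dt = tau / steps_per_tau
    v = 0.0
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps_per_tau * n_tau):
        i = (vdd - v) / r_ohms            # current through the pull-up
        e_supply += vdd * i * dt          # supply delivers V_DD * I
        e_resistor += i * i * r_ohms * dt # heat in the pull-up
        v += i * dt / c_farads            # capacitor voltage rises
    e_capacitor = 0.5 * c_farads * v * v
    return e_supply, e_resistor, e_capacitor


for r in (1e3, 10e3):  # hypothetical weak and strong pull-ups
    e_sup, e_res, e_cap = charge_through_resistor(r)
    print(f"R = {r:7.0f} ohm: supply {e_sup:.3e} J, "
          f"heat {e_res / e_sup:.3f}, stored {e_cap / e_sup:.3f}")
```

Under these assumptions both resistor values give the same supply energy, close to $C_L V_{DD}^2$, with roughly half lost as heat and half stored. Only the charging time changes with R.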

Switching Waveforms
Example: $V_{DD}$ = 1.0 V, $C_L$ = 150 fF, f = 1 GHz

Switching Waveforms
$P_{switching} = \frac{1}{T}\int_0^T i_{DD}(t)\,V_{DD}\,dt = \frac{V_{DD}}{T}\int_0^T i_{DD}(t)\,dt$
$= \frac{V_{DD}}{T}\,[\text{total charge drawn from power supply in time } T]$
$= \frac{V_{DD}}{T}\,[T f_{sw} C V_{DD}]$
$P_{switching} = C\,V_{DD}^2\,f_{sw}$
Note: $P_{switching}$ is independent of the drive strength of the nMOS and pMOS transistors
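The charge-counting step of the derivation can be written out directly. The sketch below uses the example values from the switching-waveform slide (150 fF, 1.0 V, 1 GHz); the function name and the 1 µs observation window are assumptions chosen for illustration.

```python
def switching_power_from_charge(c_farads, vdd, f_sw, window_s):
    """Average supply power over window_s, found by counting charge drawn from V_DD."""
    rising_edges = f_sw * window_s                 # 0->1 output transitions in the window
    total_charge = rising_edges * c_farads * vdd   # each rising edge draws Q = C * V_DD
    return vdd * total_charge / window_s


# Values from the switching-waveform example
C_L, VDD, F_SW = 150e-15, 1.0, 1e9

p_charge = switching_power_from_charge(C_L, VDD, F_SW, window_s=1e-6)
p_formula = C_L * VDD ** 2 * F_SW   # closed form C * V_DD^2 * f_sw
print(f"{p_charge * 1e6:.1f} uW (charge counting), {p_formula * 1e6:.1f} uW (formula)")
```

Both routes give about 150 µW for this node, and the length of the window cancels out, as the derivation predicts.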

Activity Factor
- Suppose the system clock frequency = f
- Most gates do not switch every clock cycle
- Let $f_{sw} = \alpha f$, where $\alpha$ = activity factor
  - $\alpha = P_{0 \to 1}$: probability that a signal switches from 0 to 1 in any clock cycle
  - If the signal is the system clock, $\alpha$ = 1
  - If the signal switches once per cycle, $\alpha$ = 0.5
  - If the signal is random (clocked) data, $\alpha$ = 0.25
  - Static CMOS logic has (empirically) $\alpha \approx 0.1$
- Dynamic power of a circuit (summing over all the nodes in the circuit):
  $P_{switching} = V_{DD}^2\,f \sum_i \alpha_i C_i$

Dynamic Power Example
- 1 billion transistor chip
  - 50M logic transistors
    - Average width: 12 λ
    - Activity factor = 0.1
  - 950M memory transistors
    - Average width: 4 λ
    - Activity factor = 0.02
- 65 nm, 1.0 V process (λ = 25 nm)
  - C = 1 fF/µm (gate) + 0.8 fF/µm (diffusion)
- Estimate dynamic power consumption @ 1 GHz. Neglect wire capacitance and short-circuit current.
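One way to work this estimate is to total the switched capacitance of each block and apply $P_{switching} = V_{DD}^2 f \sum_i \alpha_i C_i$. The sketch below follows that route using only the numbers given in the example; the variable and function names are illustrative.

```python
LAMBDA_UM = 0.025                 # lambda = 25 nm, expressed in micrometres
C_PER_UM = (1.0 + 0.8) * 1e-15    # gate + diffusion capacitance, farads per um of width
VDD = 1.0                         # volts
FREQ = 1e9                        # hertz


def block_capacitance(n_transistors, width_lambda):
    """Total capacitance of a block: count * width (um) * capacitance per um."""
    return n_transistors * width_lambda * LAMBDA_UM * C_PER_UM


c_logic = block_capacitance(50e6, 12)    # logic transistors, 12 lambda wide
c_mem = block_capacitance(950e6, 4)      # memory transistors, 4 lambda wide

# Weight each block's capacitance by its activity factor
p_dynamic = VDD ** 2 * FREQ * (0.1 * c_logic + 0.02 * c_mem)

print(f"C_logic = {c_logic * 1e9:.0f} nF, C_mem = {c_mem * 1e9:.0f} nF")
print(f"P_dynamic = {p_dynamic:.2f} W")
```

With these inputs the logic contributes 27 nF and the memory 171 nF, giving roughly 6.1 W of dynamic power at 1 GHz.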

Reducing Switching Power
$P_{switching} = \alpha C V_{DD}^2 f$
So try to minimize:
- Activity factor
- Capacitance
- Supply voltage
- Frequency

Activity Factor Estimation
- Let $P_i$ = probability (node i = 1) and $\bar{P}_i = (1 - P_i)$ = probability (node i = 0)
- $\alpha_i$ = probability that node i makes a transition from 0 to 1, so
  $\alpha_i = \bar{P}_i\,P_i = (1 - P_i)\,P_i$
[Figure: $\alpha_i$ versus $P_i$]

Activity Factor Estimation
- For random data, $\alpha$ = 0.5 × 0.5 = 0.25
- Data is often not completely random
  - e.g. upper 9 bits of a 16-bit word representing somebody's age
- Data propagating through ANDs and ORs has a lower activity factor

Example: Switching Probability of NOR2
- For NOR2, $P_Y = \bar{P}_A\,\bar{P}_B$
- $\bar{P}_Y = (1 - P_Y) = (1 - \bar{P}_A\,\bar{P}_B)$
- $\alpha_Y = \bar{P}_Y\,P_Y = (\bar{P}_A\,\bar{P}_B)(1 - \bar{P}_A\,\bar{P}_B)$

A  B  Y
0  0  1
0  1  0
1  0  0
1  1  0

If $P_A = P_B = 0.5$, then $P_Y = 0.25$ and $\alpha_Y = 3/16 \approx 0.19$
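The NOR2 numbers can be reproduced with exact fractions. This is a minimal sketch assuming independent inputs, as the slide does; the function names are illustrative.

```python
from fractions import Fraction


def activity(p_one):
    """alpha = P(node = 0) * P(node = 1) for a node with signal probability p_one."""
    return (1 - p_one) * p_one


def p_nor2(p_a, p_b):
    """NOR2 output is 1 only when both (independent) inputs are 0."""
    return (1 - p_a) * (1 - p_b)


half = Fraction(1, 2)
p_y = p_nor2(half, half)
alpha_y = activity(p_y)
print(p_y, alpha_y, float(alpha_y))   # 1/4 3/16 0.1875
```

The same two functions can be reused with other input probabilities, for example when the inputs themselves come from earlier gates whose outputs are not at 0.5.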

Switching Probabilities (Static Gates)
Remember: $\alpha_Y = \bar{P}_Y\,P_Y$

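Example: 4-input AND gate
Assume all inputs have P = 0.5
[Figure: three implementations of Y = ABCD, annotated with signal probability P and activity factor α at each node]
- Implementation 1 (A, B, C, D into a single 4-input stage): internal node P = 15/16, α = 15/256; output Y P = 1/16, α = 15/256
- Implementation 2 (A, B and C, D combined in pairs): each pair output P = 3/4, α = 3/16; output Y P = 1/16, α = 15/256
- Implementation 3 (chain combining A and B, then C, then D): P = 3/4, α = 3/16; P = 1/4, α = 3/16; P = 7/8, α = 7/64; P = 1/8, α = 7/64; P = 15/16, α = 15/256; output Y P = 1/16, α = 15/256
Which has the lowest power?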
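The node probabilities in this example can be recomputed, and the three structures compared, with a short script. The gate types below (NAND4 plus inverter; two NAND2s into a NOR2; a NAND2/inverter chain) are an inference from the probabilities shown, not something the extracted slide text states. Summing α over nodes also assumes every node has equal capacitance, which real gates do not, so the totals only indicate switching activity, not power.

```python
from fractions import Fraction


def alpha(p_one):
    """Activity factor of a node with signal probability p_one."""
    return p_one * (1 - p_one)


def nand(*probs):
    """Output is 0 only when all independent inputs are 1."""
    all_ones = Fraction(1)
    for p in probs:
        all_ones *= p
    return 1 - all_ones


def nor(*probs):
    """Output is 1 only when all independent inputs are 0."""
    all_zeros = Fraction(1)
    for p in probs:
        all_zeros *= (1 - p)
    return all_zeros


def inv(p):
    return 1 - p


h = Fraction(1, 2)

# Implementation 1 (assumed): NAND4 followed by an inverter
n1 = nand(h, h, h, h)
impl1 = [n1, inv(n1)]

# Implementation 2 (assumed): two NAND2 gates feeding a NOR2
n_ab, n_cd = nand(h, h), nand(h, h)
impl2 = [n_ab, n_cd, nor(n_ab, n_cd)]

# Implementation 3 (assumed): chain of NAND2 + inverter pairs
a = nand(h, h)
b = inv(a)
c = nand(b, h)
d = inv(c)
e = nand(d, h)
impl3 = [a, b, c, d, e, inv(e)]

totals = [sum(alpha(p) for p in impl) for impl in (impl1, impl2, impl3)]
for name, total in zip(("impl 1", "impl 2", "impl 3"), totals):
    print(name, total, float(total))
```

Under the equal-capacitance assumption the single-stage version has the lowest total activity (15/128), followed by the paired version (111/256) and the chain (91/128).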

Number of Stages vs. Power
- Power depends on activity and capacitance at each node
- Fewer stages usually mean less power
- Compare this to delay: stages are frequently added to improve delay
- Tradeoff between speed and power

Beware of Glitches!
- Extra transitions caused by finite propagation delay
[Figure: circuit with inputs A, B, C, D, internal nodes n3 to n7, and output Y]
- Suppose the input changes from ABCD = 1101 to 0111?
- Glitching occurs whenever a node makes more transitions than necessary to reach its final value
- Glitching can raise the activity factor of a gate to greater than 1!

Clock Gating
- Another way to reduce activity is to turn off the clock to registers in unused blocks
  - Saves clock activity (α = 1)
  - Eliminates all switching activity in the block
  - Requires determining if the block will be used

Capacitance
- Extra capacitance slows response and increases power
- Always try to reduce parasitic and wiring capacitance
  - Good floorplanning to keep high-activity communicating gates close to each other
  - Drive long wires with inverters or buffers rather than complex gates
- Gate sizing and number of stages
  - Designing a network for minimum delay will usually result in a high-power network
  - A small increase in delay (e.g. by reducing the number of stages) can give a large reduction in power
  - There are no closed-form solutions to determine gate sizes that minimize power under a delay constraint; this can be solved numerically
[Figure: energy versus delay]

Voltage
- Power dissipated in a gate is $P_{av} = \alpha f C_L V_{DD}^2$
- Energy per switching event* is $E_s = P_{av}/(2 \alpha f) = C_L V_{DD}^2 / 2$
- Power and energy can be significantly reduced by decreasing $V_{DD}$
- But the delay of a gate is $D = C_L V / I$
  - Decreasing $V_{DD}$ increases delay: $D \approx \frac{C_L V_{DD}}{(\beta/2)(V_{DD} - V_t)^2}$
- A circuit can be made (almost) arbitrarily low power at the expense of performance: not very useful
* A switching event is defined as a transition from 0→1 or 1→0

Energy-Delay Product
- Introduce a metric: energy-delay product (EDP) = (energy per switching event) × (gate delay)
  $EDP = E_s \cdot D = k\,\frac{C_L^2\,V_{DD}^3}{(V_{DD} - V_t)^2}$
[Figure: EDP (normalized units) versus $V_{DD}$, for $V_T$ = 0.4 V]
- Minimum EDP at $V_{DD} = 3 V_t$ (for a long-channel process)
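The location of the minimum follows from differentiating $EDP = k\,C_L^2 V_{DD}^3 / (V_{DD} - V_t)^2$ with respect to $V_{DD}$. The short derivation below is a sketch that assumes the long-channel (square-law) delay model and treats $k$, $C_L$ and $V_t$ as constants:

$$\frac{d(EDP)}{dV_{DD}} = k\,C_L^2\,\frac{3V_{DD}^2 (V_{DD} - V_t)^2 - 2V_{DD}^3 (V_{DD} - V_t)}{(V_{DD} - V_t)^4} = k\,C_L^2\,\frac{V_{DD}^2\,(V_{DD} - 3V_t)}{(V_{DD} - V_t)^3}$$

For $V_{DD} > V_t$ the derivative is negative when $V_{DD} < 3V_t$ and positive when $V_{DD} > 3V_t$, so the EDP is minimized at $V_{DD} = 3V_t$, where

$$EDP_{min} = k\,C_L^2\,\frac{27 V_t^3}{4 V_t^2} = \frac{27}{4}\,k\,C_L^2\,V_t$$

With $V_T$ = 0.4 V, this places the minimum at $V_{DD}$ = 1.2 V.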

Frequency
- Suppose we can do a task in T sec. on one processor
- Can we do it in T/2 sec. on two processors?
  - if the application has sufficient intrinsic parallelism
- How about doing it in T sec. on two processors running at half the clock frequency?
  - One processor: V volts, f Hz = P watts
  - Two processors: (V volts, f/2 Hz = P/2 watts) + (V volts, f/2 Hz = P/2 watts)
- This gives no net power savings.
- But speed $\propto (V_{DD} - V_T)^2 / V_{DD}$, so if we reduce the clock frequency, we can also reduce $V_{DD}$:

Reduced Frequency & Voltage
[Figure: relative speed versus $V_{DD}$ (volts), for $V_T$ = 0.5 V]
- In this example, reducing speed by 50% allows a voltage reduction of ~35%
  - One processor: V volts, f Hz = P watts
  - Two processors: (0.65V volts, f/2 Hz ≈ 0.2P watts) + (0.65V volts, f/2 Hz ≈ 0.2P watts)
- Parallelism with reduced $f$ and $V_{DD}$ leads to lower power
  - diminishing returns as $V_{DD}$ approaches $V_T$
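The ~35% and ≈0.2P figures can be sanity-checked with the speed model speed ∝ $(V_{DD} - V_T)^2 / V_{DD}$ and power ∝ $V_{DD}^2 f$. The nominal supply is not stated in the extracted text, so the 2.5 V used below is purely a hypothetical starting point; the function names are also illustrative.

```python
import math


def relative_speed(vdd, vt):
    """Speed model from the lecture: proportional to (V_DD - V_T)^2 / V_DD."""
    return (vdd - vt) ** 2 / vdd


def vdd_for_speed(target_speed, vt):
    """Solve (V - vt)^2 / V = target_speed for V (larger root of the quadratic)."""
    b = 2 * vt + target_speed
    return (b + math.sqrt(b * b - 4 * vt * vt)) / 2


def relative_power(v_scale, f_scale, n_processors=1):
    """Power relative to one processor at nominal V and f, using P ~ V^2 * f."""
    return n_processors * v_scale ** 2 * f_scale


VT = 0.5
VDD_NOMINAL = 2.5   # hypothetical nominal supply (assumption, not from the slides)

half_speed = 0.5 * relative_speed(VDD_NOMINAL, VT)
vdd_slow = vdd_for_speed(half_speed, VT)
v_scale = vdd_slow / VDD_NOMINAL
print(f"V_DD for half speed: {vdd_slow:.3f} V ({v_scale:.3f} of nominal)")

# Using the slide's rounded 0.65 voltage scale
single = relative_power(0.65, 0.5)
pair = relative_power(0.65, 0.5, n_processors=2)
print(f"one processor: {single:.3f} P, two processors: {pair:.3f} P")
```

With this assumed nominal supply, half speed is reached at about 0.66 of the nominal voltage, close to the slide's ~35% reduction. Each half-speed processor then uses about 0.21P, so the pair uses about 0.42P for the same throughput.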

Dynamic Power Dissipation Example
[Figure: NAND2 (inputs A, B; size 12) driving an inverter (size 36) with output Y and load 120]
A NAND2 gate of size (input capacitance) 12C is driving an inverter of size 36C, which in turn drives a load of 120C units of capacitance. Assume the inputs A, B are independent and uniformly distributed. What is the dynamic switching power dissipation of this gate if the gate capacitance C of a unit-sized transistor is 0.1 fF, $V_{DD}$ is 1.0 V and the operating frequency is 1 GHz?
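One possible reading of this problem is sketched below. It makes several assumptions that the slide does not state: only the load capacitance on each driven node is counted (parasitic diffusion capacitance is ignored), the input nodes A and B are charged by whatever drives them and are left out, and "uniformly distributed" is taken to mean each input is 1 with probability 0.5. The intended answer may differ.

```python
from fractions import Fraction

C_UNIT = 0.1e-15   # farads per unit of capacitance C
VDD = 1.0          # volts
FREQ = 1e9         # hertz


def alpha(p_one):
    """Activity factor of a node with signal probability p_one."""
    return float(p_one * (1 - p_one))


half = Fraction(1, 2)
p_nand = 1 - half * half   # NAND2 output is 0 only when A = B = 1
p_inv = 1 - p_nand         # inverter output Y

# (activity factor, load capacitance in farads) for each driven node
nand_node = (alpha(p_nand), 36 * C_UNIT)   # NAND2 output drives the 36C inverter
inv_node = (alpha(p_inv), 120 * C_UNIT)    # inverter output drives the 120C load

p_nand_only = VDD ** 2 * FREQ * nand_node[0] * nand_node[1]
p_both = p_nand_only + VDD ** 2 * FREQ * inv_node[0] * inv_node[1]

print(f"NAND2 output node only: {p_nand_only * 1e6:.3f} uW")
print(f"NAND2 + inverter output nodes: {p_both * 1e6:.3f} uW")
```

Both nodes have α = 3/16. Under these assumptions the NAND2 output node alone dissipates about 0.68 µW, and the two driven nodes together about 2.9 µW.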

Short-Circuit Power
- The finite slope of the input signal sets up a direct current path between $V_{DD}$ and GND for a short period during switching, when both the NMOS and PMOS devices are conducting.
[Figure: short-circuit current $I_{SC}$ during an input transition]
$E_{sc} \approx t_{sc}\,V_{DD}\,I_{SC}$
- Depends on:
  - the duration (slope) of the input transition, $t_{sc}$
  - $I_{SC}$, which is determined by:
    - the saturation current of the P and N transistors, which depends on sizes, process technology, temperature, etc.
    - the ratio between input and output slopes (a function of $C_L$)

Slope Engineering
- Small capacitive load (large $I_{SC}$)
  - Output fall time significantly shorter than input rise time
  - Output tracks input as per the DC transfer function
  - Large $I_{SC}$ when $V_{IN} \approx V_{SW}$
- Large capacitive load ($I_{SC} \approx 0$)
  - Output fall time significantly longer than input rise time
  - Output transition lags input
  - When $V_{IN} = V_{SW}$, $V_{dsp}$ is still very small, so $I_{SC}$ is small

Impact of $C_L$ on $I_{SC}$
[Figure: $I_{SC}$ versus time (×10⁻¹⁰ sec) for a 500 psec input slope, with $C_L$ = 20 fF, 100 fF and 500 fF]
- When $C_L$ is small, $I_{SC}$ is large!
- Short-circuit dissipation is minimized by matching the rise/fall times of the input and output signals: slope engineering.
- Typically less than 10% of dynamic power if rise/fall times are comparable for input and output

Static Power Dissipation
- Static power is consumed even when the chip is quiescent
  - i.e. powered up but not running
- Leakage consumes power from current passing through normally off devices
  - sub-threshold current
  - gate leakage current
  - diode junction leakage current

Leakage Sources
[Figure: junction leakage, gate leakage, and sub-threshold leakage paths in a transistor]
- Leakage currents are very small (on a per-transistor basis)
  - prior to 130 nm, not usually an issue (except in the sleep mode of battery-operated devices)
  - but when multiplied by hundreds of millions of nanometer devices, can account for as much as 1/3 of active power
- All increase exponentially with temperature

Sub-threshold Leakage
- The Shockley model assumes $I_{ds} = 0$ when $V_{gs} \le V_t$
- But in real transistors, $I_{ds} \approx 100\,\text{nA} \times (W/L)$ when $V_{gs} = V_t$
- For $V_{gs} < V_t$, $I_{ds}$ decreases exponentially with $V_{gs}$:
  $I_{ds} = I_0\,10^{(V_{gs} - V_t)/S}$
  where S is the sub-threshold slope, ≈ 100 mV/decade
- In nanometer processes, as we reduce $V_{DD}$, we also reduce $V_t$ to maintain good on-current
- But reducing $V_t$ increases the off-current
  - Max. on-current (gate at $V_{DD}$): $I_{sat} = \frac{\beta}{2m}\,(V_{DD} - V_t)^2$
  - Min. off-current (gate at GND): $I_{sub} = I_0\,10^{(0 - V_t)/S}$

Sub-threshold Leakage
- Tradeoff between on-current (performance) and off-current (static power dissipation) as we adjust $V_t$
- Typical values for off-current in 65 nm with $V_{DD}$ = 1 V:
  - $I_{off}$ = 100 nA/µm @ $V_t$ = 0.3 V
  - $I_{off}$ = 10 nA/µm @ $V_t$ = 0.4 V
  - $I_{off}$ = 1 nA/µm @ $V_t$ = 0.5 V
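The three typical values are consistent with the exponential off-current model and a sub-threshold slope of about 100 mV/decade. The sketch below anchors the model at the quoted 100 nA/µm for $V_t$ = 0.3 V and scales from there; using that point as the reference, and the function name itself, are assumptions made for illustration.

```python
def off_current_na_per_um(vt, i_ref=100.0, vt_ref=0.3, slope_v_per_decade=0.1):
    """Off-current in nA/um, scaled from a reference point by 10^(-(vt - vt_ref)/S)."""
    return i_ref * 10 ** (-(vt - vt_ref) / slope_v_per_decade)


for vt in (0.3, 0.4, 0.5):
    print(f"V_t = {vt:.1f} V -> I_off ~ {off_current_na_per_um(vt):.1f} nA/um")
```

Each 100 mV increase in $V_t$ cuts the off-current by a factor of ten, which is the tradeoff against on-current that the slide describes.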

Stack Effect
[Figure: two series nMOS transistors N2 (top) and N1 (bottom), both gates at 0, with intermediate node $V_x$]
- Series OFF transistors have less leakage
  - for N1 to have any leakage, $V_x$ > 0
  - so N2 has negative $V_{gs}$
  - leakage through a 2-stack reduces ~10x
  - leakage through a 3-stack reduces further
- Leakage and delay trade off
  - Aim for low leakage in sleep and low delay in active mode
- To reduce leakage:
  - Increase $V_t$: multiple $V_t$
    - Use low $V_t$ only in speed-critical circuits
  - Increase $V_s$: stack effect
    - Input vector control in sleep

Gate & Junction Leakage
- Gate leakage is an extremely strong function of $t_{ox}$ and $V_{gs}$
  - Negligible for older processes
  - Approaches sub-threshold leakage at 65 nm
  - An order of magnitude less for pMOS than nMOS
- Control gate leakage in the process using $t_{ox}$ > 10 Å
  - High-k gate dielectrics help
  - Some processes provide multiple $t_{ox}$, e.g. thicker oxide for 3.3 V I/O transistors
- Junction leakage is usually negligible
  - becoming a little more significant in nanometer processes
- Control gate and junction leakage in circuits by limiting $V_{DD}$

Power Gating
- Turn OFF power to blocks when they are idle to save leakage
  - Use a virtual $V_{DD}$ ($V_{DDV}$)
  - Gate outputs to prevent invalid logic levels reaching the next block
- Voltage drop across the sleep transistor degrades performance during normal operation
  - Size the transistor wide enough to minimize the impact
- Switching a wide sleep transistor costs dynamic power
  - Only justified when the circuit sleeps long enough

Voltage & Frequency Control
- Run each block at the lowest possible voltage and frequency that meets performance requirements
- Multiple voltage domains
  - Provide separate supplies to different blocks
  - Level converters required when crossing from low to high $V_{DD}$ domains
- Dynamic voltage scaling
  - Adjust $V_{DD}$ and f according to workload