Power 6.884 Spring 2005 3/7/05 L11 Power 1

Lab 2 Results

[Figure: Pareto-optimal points]

Standard Projects

Two basic design projects:
- Processor variants (based on lab1&2 testrigs)
- Non-blocking caches and memory system

Possible project ideas on web site.

Must hand in proposal before quiz on March 18th, including:
- Team members (2 or 3 per team)
- Description of project, including the architecture exploration you will attempt

Non-Standard Projects

Must hand in proposal early, by class on March 14th, describing:
- Team members (2 or 3)
- The chip you want to design
- The existing reference code you will use to build a test rig, and the test strategy you will use
- The architectural exploration you will attempt

Power Trends

[Figure: processor power (watts, log scale from 0.1 to 1000) versus year (1970 to 2020), with points for the 8080, 8086, 386, Pentium® and Pentium® 4 processors and a projected "1000W CPU?". Figure by MIT OCW. Adapted from Intel. Used with permission.]

- CMOS originally used for very low-power circuitry such as wristwatches
- Now some CPUs have power dissipation >100W

Power Concerns

- Power dissipation is limiting factor in many systems
  - battery weight and life for portable devices
  - packaging and cooling costs for tethered systems
  - case temperature for laptop/wearable computers
  - fan noise not acceptable in some settings
- Internet data center, ~8,000 servers, ~2 MW
  - 25% of running cost is in electricity supply for supplying power and running air-conditioning to remove heat
- Environmental concerns
  - ~2005, 1 billion PCs, 100W each => 100 GW
  - 100 GW = 40 Hoover Dams

On-Chip Power Distribution

[Figure: three power distribution styles, showing supply pads, vias, and alternating VDD (V) and GND (G) rails]

- Routed power distribution on two stacked layers of metal (one for VDD, one for GND). OK for low-cost, low-power designs with few layers of metal.
- Power grid. Interconnected vertical and horizontal power bars. Common on most high-performance designs. Often well over half of total metal on upper thicker layers used for VDD/GND.
- Dedicated VDD/GND planes. Very expensive. Only used on Alpha 21264. Simplified circuit analysis. Dropped on subsequent Alphas.

Power Dissipation in CMOS

[Figure: circuit diagram with load capacitance $C_L$, annotated with short-circuit current, diode leakage current, gate leakage current, capacitor charging current, and subthreshold leakage current]

Primary components:
- Capacitor charging, energy is $\frac{1}{2} C V^2$ per transition
  - the dominant source of power dissipation today
- Short-circuit current, PMOS & NMOS both on during transition
  - kept to <10% of capacitor charging current by making edges fast
- Subthreshold leakage, transistors don't turn off completely
  - approaching 10-40% of active power in <180nm technologies
- Diode leakage from parasitic source and drain diodes
  - usually negligible
- Gate leakage from electrons tunneling across gate oxide
  - was negligible, increasing due to very thin gate oxides

Energy to Charge Capacitor

[Figure: supply $V_{DD}$ delivering current $I_{supply}$ to charge load capacitance $C_L$ at node $V_{out}$]

$$E_{0 \to 1} = \int_0^T P(t)\,dt = V_{DD} \int_0^T I_{supply}(t)\,dt = V_{DD} \int_0^{V_{DD}} C_L\,dV_{out} = C_L V_{DD}^2$$

- During 0->1 transition, energy $C_L V_{DD}^2$ removed from power supply
- After transition, $\frac{1}{2} C_L V_{DD}^2$ stored in capacitor, the other $\frac{1}{2} C_L V_{DD}^2$ was dissipated as heat in pullup resistance
- The $\frac{1}{2} C_L V_{DD}^2$ energy stored in capacitor is dissipated in the pulldown resistance on next 1->0 transition
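The energy split above can be checked numerically. The following is a minimal sketch, assuming the pullup behaves as a fixed linear resistor so that the charging current is a simple RC exponential. The component values (10 fF, 1.8 V, 1 kΩ and 10 kΩ) and the function name `charge_energy` are hypothetical, chosen only for illustration.

```python
import math

def charge_energy(c_load, v_dd, r_pullup, steps=20_000, n_tau=20.0):
    """Integrate supply energy and resistor heat for a 0->1 transition.

    Assumes an ideal RC charge: I_supply(t) = (V_DD / R) * exp(-t / RC).
    Returns (energy drawn from supply, heat in pullup, energy stored in C).
    """
    tau = r_pullup * c_load
    dt = n_tau * tau / steps
    e_supply = 0.0
    e_heat = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                      # midpoint rule
        i_supply = (v_dd / r_pullup) * math.exp(-t / tau)
        e_supply += v_dd * i_supply * dt        # P(t) = V_DD * I_supply(t)
        e_heat += i_supply ** 2 * r_pullup * dt
    return e_supply, e_heat, e_supply - e_heat

C_L, V_DD = 10e-15, 1.8                         # hypothetical load and supply
for r in (1e3, 10e3):                           # result should not depend on R
    e_supply, e_heat, e_stored = charge_energy(C_L, V_DD, r)
    print(f"R={r:.0f} ohm: supply={e_supply:.3e} J, "
          f"heat={e_heat:.3e} J, stored={e_stored:.3e} J")
```

Under this model the supply delivers $C_L V_{DD}^2$ regardless of the pullup resistance, and half of it ends up as heat.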

Power Formula

$$\text{Power} = \text{activity} \times \text{frequency} \times \left(\tfrac{1}{2} C V_{DD}^2 + V_{DD} I_{SC}\right) + V_{DD} I_{Subthreshold} + V_{DD} I_{Diode} + V_{DD} I_{Gate}$$

- Activity is average number of transitions per clock cycle (clock has two)

Switching Power

$$\text{Power} \propto \text{activity} \times \tfrac{1}{2} C V^2 \times \text{frequency}$$

- Reduce activity
- Reduce switched capacitance C
- Reduce supply voltage V
- Reduce frequency
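A small sketch of how each factor in the switching power expression moves the result. The baseline numbers (activity 0.2, 1 nF of switched capacitance, 1.5 V, 500 MHz) and the helper names are hypothetical. The `task_energy` helper assumes the task needs a fixed number of clock cycles.

```python
import math

def switching_power(activity, c_switched, v_dd, freq):
    """Dynamic power in watts: activity * 1/2 * C * V^2 * f."""
    return activity * 0.5 * c_switched * v_dd ** 2 * freq

def task_energy(cycles, activity, c_switched, v_dd, freq):
    """Switching energy in joules for a task needing a fixed cycle count."""
    run_time = cycles / freq
    return switching_power(activity, c_switched, v_dd, freq) * run_time

# Hypothetical baseline design point.
base = dict(activity=0.2, c_switched=1e-9, v_dd=1.5, freq=500e6)

p_base = switching_power(**base)
p_half_v = switching_power(**{**base, "v_dd": 0.75})
p_half_f = switching_power(**{**base, "freq": 250e6})
e_base = task_energy(1e9, **base)
e_half_f = task_energy(1e9, **{**base, "freq": 250e6})

print(f"baseline power      : {p_base:.4f} W")
print(f"half voltage        : {p_half_v / p_base:.2f} x power")
print(f"half frequency      : {p_half_f / p_base:.2f} x power")
print(f"half frequency task : {e_half_f / e_base:.2f} x energy")
```

Halving the voltage cuts power to a quarter, while halving the frequency only halves power and leaves the energy for a fixed amount of work unchanged.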

Reducing Activity with Clock Gating

Clock gating
- don't clock flip-flop if not needed
- avoids transitioning downstream logic
- enable adds to control logic complexity
- Pentium-4 has hundreds of gated clock domains

[Figure: Global Clock and Enable feed a latch (transparent on clock low); the Latched Enable gates the Global Clock to produce the Gated Local Clock that drives the D/Q flip-flop. Timing diagram shows Clock, Enable, Latched Enable, and Gated Clock.]
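The latch-based gate in the figure can be modeled at the level of sampled waveforms. This is a behavioral sketch only: it assumes the gated clock is the AND of the clock and the latched enable (a common arrangement, not stated on the slide), and the sample waveforms are made up.

```python
def gated_clock(clock, enable):
    """Behavioral model of a latch-based clock gate.

    The enable latch is transparent while the clock is low and holds its
    value while the clock is high, so enable changes during the high phase
    cannot glitch the gated clock. Assumes gated = clock AND latched enable.
    """
    latched = 0
    gated = []
    for clk, en in zip(clock, enable):
        if clk == 0:            # latch transparent on clock low
            latched = en
        gated.append(clk & latched)
    return gated

def rising_edges(wave):
    """Count 0->1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)

# Hypothetical waveforms, one sample per clock phase.
clock  = [0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 0, 1, 0, 0, 1, 0]   # toggles while clock is high at samples 3 and 7

gated = gated_clock(clock, enable)
print("gated clock      :", gated)
print("clock edges      :", rising_edges(clock))
print("gated clock edges:", rising_edges(gated))
```

The flip-flop and its downstream logic only see the clock edges for cycles in which the enable was set during the preceding low phase.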

Reducing Activity with Data Gating

- Avoid data toggling in unused unit by gating off inputs

[Figure: inputs A and B feed a Shifter and an Adder, with a mux controlled by Shift/Add Select choosing the result. The shifter is infrequently used. A second version gates the shifter inputs.]

- Could use transparent latch instead of AND gate to reduce number of transitions, but would be bigger and slower.

Other Ways to Reduce Activity

Bus Encodings
- choose encodings that minimize transitions on average (e.g., Gray code for address bus)
- compression schemes (move fewer bits)

Freeze "Don't Cares"
- If a signal is a don't care, then freeze last dynamic value (using a latch) rather than always forcing to a fixed 1 or 0.
- E.g., 1, X, 1, 0, X, 0 ===> 1, X=1, 1, 0, X=0, 0

Remove Glitches
- balance logic paths to avoid glitches during settling
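Two of these ideas are easy to quantify by counting bit flips. The sketch below compares binary and Gray encodings on a hypothetical sequential 4-bit address stream, then replays the slide's don't-care example with `None` standing in for X. All function names are illustrative.

```python
def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bit_transitions(words):
    """Total number of bit flips between consecutive words on a bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

def resolve_dont_cares(bits, forced=None):
    """Replace None (don't care) with the last value, or with `forced` if given."""
    resolved, last = [], 0
    for b in bits:
        if b is None:
            b = last if forced is None else forced
        resolved.append(b)
        last = b
    return resolved

# Hypothetical address stream: 16 sequential addresses.
addresses = list(range(16))
binary_flips = bit_transitions(addresses)
gray_flips = bit_transitions([to_gray(a) for a in addresses])
print("binary flips:", binary_flips, " gray flips:", gray_flips)

# The slide's example: 1, X, 1, 0, X, 0
signal = [1, None, 1, 0, None, 0]
frozen = resolve_dont_cares(signal)
forced_low = resolve_dont_cares(signal, forced=0)
print("frozen    :", frozen, "->", bit_transitions(frozen), "transitions")
print("forced to 0:", forced_low, "->", bit_transitions(forced_low), "transitions")
```

Gray code flips exactly one bit per sequential step, and freezing the don't cares removes the transitions that forcing a fixed value would introduce.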

Reducing Switched Capacitance

Reduce switched capacitance C
- Careful transistor sizing (small transistors off critical path)
- Tighter layout (good floorplanning)
- Segmented structures (avoid switching long nets)

[Figure: shared bus driven by A or B when sending values to C]
[Figure: insert switch to isolate bus segment when B sending to C]

Reducing Frequency

- Doesn't save energy, just reduces rate at which it is consumed (lower power, but must run longer)
- Get some saving in battery life from reduction in rate of discharge

Reducing Supply Voltage

- Quadratic savings in energy per transition ($\frac{1}{2} C V_{DD}^2$)
- Circuit speed is reduced
- Must lower clock frequency to maintain correctness

$$T_d = \frac{C V_{DD}}{k (V_{DD} - V_{th})^{\alpha}}, \qquad \alpha = 1 \text{ to } 2$$

- Delay rises sharply as supply voltage approaches threshold voltages

Courtesy of Mark Horowitz and Stanford University. Used with permission.
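To see how sharply delay grows near threshold, the delay expression T_d = C V_DD / (k (V_DD - V_th)^alpha) can be evaluated over a range of supply voltages. This is only a sketch: the threshold voltage (0.4 V), the exponent (1.5), and the lumped constant C/k = 1 are hypothetical values, so only the relative trend is meaningful.

```python
def gate_delay(v_dd, v_th=0.4, alpha=1.5, c_over_k=1.0):
    """Delay model T_d = (C/k) * V_DD / (V_DD - V_th)**alpha.

    v_th, alpha and c_over_k are illustrative assumptions, not slide values.
    """
    if v_dd <= v_th:
        raise ValueError("model only applies above threshold")
    return c_over_k * v_dd / (v_dd - v_th) ** alpha

reference = gate_delay(1.0)
for v in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    print(f"V_DD = {v:.1f} V  relative delay = {gate_delay(v) / reference:5.2f}")
```

With these assumed parameters, halving the supply from 1.0 V to 0.5 V slows the gate by well over 5x, which is why the clock must be slowed along with the voltage.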

Voltage Scaling for Reduced Energy

- Scaling supply voltage by a factor of 0.5 scales energy per transition by ~0.25
- Performance is reduced: need to use slower clock
- Can regain performance with parallel architecture
- Alternatively, can trade surplus performance for lower energy by reducing supply voltage until "just enough" performance
  - Dynamic Voltage Scaling

Parallel Architectures Reduce Energy at Constant Throughput

- 8-bit adder/comparator
  - 40MHz at 5V, area = 530 kµ²
  - Base power Pref
- Two parallel interleaved adder/compare units
  - 20MHz at 2.9V, area = 1,800 kµ² (3.4x)
  - Power = 0.36 Pref
- One pipelined adder/compare unit
  - 40MHz at 2.9V, area = 690 kµ² (1.3x)
  - Power = 0.39 Pref
- Pipelined and parallel
  - 20MHz at 2.0V, area = 1,961 kµ² (3.7x)
  - Power = 0.2 Pref

Chandrakasan et al., "Low-Power CMOS Digital Design", IEEE JSSC 27(4), April 1992
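The power figures above can be decomposed with the switching power relation P ∝ C V² f. The sketch below back-computes the switched-capacitance overhead implied by each design point, assuming switching power dominates. The voltages, frequencies, and relative powers are from the slide; the capacitance ratios are derived values, not quoted ones.

```python
V_REF, F_REF = 5.0, 40.0   # baseline: 5 V, 40 MHz

# name: (supply voltage in V, clock in MHz, power relative to Pref)
designs = {
    "parallel":               (2.9, 20.0, 0.36),
    "pipelined":              (2.9, 40.0, 0.39),
    "pipelined and parallel": (2.0, 20.0, 0.20),
}

def implied_capacitance_ratio(v_dd, freq, rel_power):
    """Solve P/Pref = (C/Cref) * (V/Vref)^2 * (f/fref) for C/Cref."""
    return rel_power / ((v_dd / V_REF) ** 2 * (freq / F_REF))

ratios = {name: implied_capacitance_ratio(*spec) for name, spec in designs.items()}
for name, ratio in ratios.items():
    print(f"{name:24s} implied switched capacitance = {ratio:.2f} x baseline")
```

Every variant switches more capacitance than the baseline, yet the quadratic voltage term more than pays for it.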

Just Enough Performance

[Figure: frequency versus time from t=0 to t=deadline, comparing "run fast then stop" with "run slower and just meet deadline"]

- Save energy by reducing frequency and voltage to minimum necessary

Voltage Scaling on Transmeta Crusoe TM5400

Frequency (MHz)   Relative Performance (%)   Voltage (V)   Relative Energy (%)   Relative Power (%)
700               100.0                      1.65          100.0                 100.0
600                85.7                      1.60           94.0                  80.6
500                71.4                      1.50           82.6                  59.0
400                57.1                      1.40           72.0                  41.4
300                42.9                      1.25           57.4                  24.6
200                28.6                      1.10           44.4                  12.7
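The energy and power columns follow closely from the frequency and voltage columns. As a sketch, assume energy per operation scales with V² and power with V² times frequency, both relative to the 700 MHz, 1.65 V point. The table values are from the slide and the scaling model is the assumption being tested.

```python
# (frequency MHz, voltage V, table relative energy %, table relative power %)
table = [
    (700, 1.65, 100.0, 100.0),
    (600, 1.60,  94.0,  80.6),
    (500, 1.50,  82.6,  59.0),
    (400, 1.40,  72.0,  41.4),
    (300, 1.25,  57.4,  24.6),
    (200, 1.10,  44.4,  12.7),
]
F_MAX, V_MAX = 700, 1.65

def relative_energy(v_dd):
    """Energy per operation relative to the top operating point, in percent."""
    return 100.0 * (v_dd / V_MAX) ** 2

def relative_power(freq, v_dd):
    """Power relative to the top operating point, in percent."""
    return relative_energy(v_dd) * freq / F_MAX

for freq, v_dd, e_tab, p_tab in table:
    print(f"{freq} MHz @ {v_dd:.2f} V: "
          f"energy {relative_energy(v_dd):5.1f}% (table {e_tab}), "
          f"power {relative_power(freq, v_dd):5.1f}% (table {p_tab})")
```

The computed values agree with the table to within a fraction of a percentage point.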

Leakage Power

- Under ideal scaling, want to reduce threshold voltage as fast as supply voltage
- But subthreshold leakage is an exponential function of threshold voltage and temperature

$$I_{subthreshold} = k\, e^{-\frac{q V_T}{a\, k_B T}}$$

[Figure: subthreshold current (A/µm, log scale from 1E-12 to 1E-06) versus threshold voltage $V_T$ (0.0 to 0.8 V) at 0 °C, 55 °C, and 110 °C. Figure by MIT OCW.]
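The exponential dependence can be made concrete by evaluating ratios of the subthreshold current expression, so that the prefactor k cancels. In this sketch the fitting constant a = 1.5 is a hypothetical value, and k is treated as independent of temperature, which is a simplification.

```python
import math

Q = 1.602176634e-19      # electron charge, C
K_B = 1.380649e-23       # Boltzmann constant, J/K

def leakage_factor(v_t, temp_c, a=1.5):
    """exp(-q V_T / (a k_B T)): subthreshold current up to the prefactor k.

    a = 1.5 is an assumed value for the fitting constant in the formula.
    """
    temp_k = temp_c + 273.15
    return math.exp(-Q * v_t / (a * K_B * temp_k))

# Effect of lowering V_T by 100 mV at a fixed temperature.
vt_ratio = leakage_factor(0.3, 27.0) / leakage_factor(0.4, 27.0)
# Effect of heating from 0 C to 110 C at a fixed V_T.
temp_ratio = leakage_factor(0.4, 110.0) / leakage_factor(0.4, 0.0)

print(f"V_T 0.4 V -> 0.3 V at 27 C : leakage x {vt_ratio:.1f}")
print(f"0 C -> 110 C at V_T = 0.4 V: leakage x {temp_ratio:.1f}")
```

Under these assumptions a 100 mV drop in threshold voltage raises leakage by roughly an order of magnitude.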

Rise in Leakage Power

[Figure: power (watts, 0 to 250) by technology node (0.25µm, 0.18µm, 0.13µm, 0.1µm, 0.07µm), showing active power and active leakage power, with a secondary percentage axis (0% to 120%). Figure by MIT OCW.]

Design-Time Leakage Reduction

Use slow, low-leakage transistors off critical path
- leakage proportional to device width, so use smallest devices off critical path
- leakage drops greatly with stacked devices (acts as drain voltage divider), so use more highly stacked gates off critical path
- leakage drops with increasing channel length, so slightly increase length off critical path
- dual $V_T$: process engineers can provide two thresholds (at extra cost), use high $V_T$ off critical path (modern cell libraries often have multiple $V_T$)

Critical Path Leakage

- Critical paths dominate leakage after applying design-time leakage reduction techniques
- Example: PowerPC 750
  - 5% of transistor width is low Vt, but these account for >50% of total leakage
- Possible approach, run-time leakage reduction
  - switch off critical path transistors when not needed

Run-Time Leakage Reduction

- Body Biasing
  - Vt increase by reverse-biased body effect
  - Large transition time and wakeup latency due to well cap and resistance
- Power Gating
  - Sleep transistor between supply and virtual supply lines
  - Increased delay due to sleep transistor
- Sleep Vector
  - Input vector which minimizes leakage
  - Increased delay due to mux and active energy due to spurious toggles after applying sleep vector

[Figure: body biasing shown on a transistor with Gate, Drain, Source, and Body terminals and Vbody > Vdd; power gating shown with a sleep signal controlling a transistor between Vdd and Virtual Vdd above the logic cells; sleep vector shown as inputs 0, 0]

Power Reduction for Cell-Based Designs

- Minimize activity
  - Use clock gating to avoid toggling flip-flops
  - Partition designs so minimal number of components activated to perform each operation
  - Floorplan units to reduce length of most active wires
- Use lowest voltage and slowest frequency necessary to reach target performance
  - Use pipelined architectures to allow fewer gates to reach target performance (reduces leakage)
  - After pipelining, use parallelism to further reduce needed frequency and voltage if possible
- Always use energy-delay plots to understand power tradeoffs

Energy versus Delay

[Figure: energy versus delay plot with design points A, B, C, D and a constant energy-delay product curve]

- Can try to compress this 2D information into single number
  - Energy*Delay product
  - Energy*Delay² gives more weight to speed, mostly insensitive to supply voltage
- Many techniques can exchange energy for delay
- Single number (ED, ED²) often misleading for real designs
  - usually want minimum energy for given delay or minimum delay for given power budget
  - can't scale all techniques across range of interest
- To fully compare alternatives, should plot E-D curve for each solution
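The claim that Energy*Delay² is mostly insensitive to supply voltage can be illustrated with a simple model: energy proportional to V², and delay proportional to V / (V - V_th)^alpha as in the delay expression from the supply voltage slide. The threshold (0.3 V), the exponent (1.5), and the voltage sweep are hypothetical, and both metrics are in arbitrary units.

```python
def energy(v_dd):
    """Energy per operation, arbitrary units: E ~ V^2."""
    return v_dd ** 2

def delay(v_dd, v_th=0.3, alpha=1.5):
    """Delay, arbitrary units: D ~ V / (V - V_th)^alpha (assumed parameters)."""
    return v_dd / (v_dd - v_th) ** alpha

def spread(values):
    """Ratio of largest to smallest value across the sweep."""
    return max(values) / min(values)

voltages = [1.0, 1.25, 1.5, 1.75, 2.0]       # hypothetical sweep
ed = [energy(v) * delay(v) for v in voltages]
ed2 = [energy(v) * delay(v) ** 2 for v in voltages]

for v, a, b in zip(voltages, ed, ed2):
    print(f"V = {v:.2f}  ED = {a:.3f}  ED^2 = {b:.3f}")
print(f"ED varies by {spread(ed):.2f}x, ED^2 by {spread(ed2):.2f}x")
```

Across this sweep ED more than doubles while ED² stays within about 15%, so ED² says more about the design than about the chosen supply voltage.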

Energy versus Delay

[Figure: energy versus delay (1/performance) curves for Architecture A and Architecture B, with regions labeled "A better" and "B better"]

- Should always compare architectures at the same performance level or at the same energy
- Can always trade performance for energy using voltage/frequency scaling
- Other techniques can trade performance for energy consumption (e.g., less pipelining, fewer parallel execution units, smaller caches, etc.)

Temperature Hot Spots

- Not just total power, but power density is a problem for modern high-performance chips
- Some parts of the chip get much hotter than others
  - Transistors get slower when hotter
  - Leakage gets exponentially worse (can get thermal runaway with positive feedback between temperature and leakage power)
  - Chip reliability suffers
- Few good solutions as yet
  - Better floorplanning to spread hot units across chip
  - Activity migration, to move computation from hot units to cold units
  - More expensive packaging (liquid cooling)

Itanium Temperature Plot

Image removed due to copyright restrictions. Please see: Krishnamurthy, R., A. Alvandpour, S. Mathew, M. Anders, V. De, and S. Borkar. "High-Performance, Low-Power, and Leakage-Tolerance Challenges for Sub-70nm Microprocessor Circuits." (Session Invited Paper). IEEE European Solid State Circuits Conference, Sept. 25, 2002. Paper no. C17.01.