Seong-Ook Jung VLSI SYSTEM LAB, YONSEI University


Low-Power VLSI
Seong-Ook Jung (sjung@yonsei.ac.kr), 2011. 5. 6.
VLSI SYSTEM LAB, YONSEI University, School of Electrical & Electronic Engineering

Contents
1. Introduction
2. Power classification
3. Power performance relationship
4. Low power design
   1. Architecture and algorithm level
   2. Block and logic level
   3. Circuit level
   4. Device level
5. OMAP processor
6. Summary

Introduction

Technology Scaling
- Technology scaling follows Moore's law: the number of transistors that can be placed on an integrated circuit has doubled approximately every 18 months.
[1] Microprocessor Hall of Fame, Intel, 2004

Development Trend
- Scaling (More Moore): more devices are integrated on a chip.
  - New scaling road map: not only geometrical scaling for 2D devices, but also equivalent scaling for 3D devices.
  - Beyond bulk CMOS: FinFET, SOI.
- Functional diversification (More than Moore): several functions are merged on a chip.
[2] ITRS (International Technology Roadmap for Semiconductors) 2009

SoC Performance
- SoC performance is increasing exponentially, thanks to both device technology and design methodology.
[2] ITRS 2009

SoC Power Consumption Problem
- SoC power consumption is also increasing severely: after 15 years, 10x the power is required.
[2] ITRS 2009

SoC Power Density Problem
- Power density, the power consumption per die area (W/cm^2), is increasing exponentially.
- On this trend, power densities would reach those of nuclear power plants or rocket nozzles within a few years.
[1] Microprocessor Hall of Fame, Intel, 2004

Process Variation Problem
- Process variation is a result of scaling. It consists of global variation and local variation.
- Global variation
  - Comes from fabrication, lot, and wafer processes.
  - Appears as different process corners (NMOS-PMOS: SS/SF/TT/FS/FF).
- Local variation
  - Truly random variation between devices with identical layout.
[3] Synopsys, 2005
[4] http://cnx.org

Process Variation Problem
- Performance varies because of process variation:
  - Frequency difference: 30%
  - Leakage current difference: 20x
- Process variation should be considered in SoC design.
[5] A. Devgan, Berkeley

Effect of the Process Variation
- Low voltage / low power limitation
  - $I_D \propto (W/L)\,(V_{DD} - V_{TH})^{\alpha}$
  - $V_{TH}$ variation causes $I_D$ variation, which causes performance variation.
  - More design margin is needed because of process variation, so $V_{DD}$ must stay higher.
- Yield limitation
  - Because of process variation, the failure probability increases and the yield decreases.

Low-Power SoC
- Low power VLSI design
- Low process variation (high yield) design
[2] ITRS Roadmap 2009

Power Classification

Power Classification
- Power consumption of CMOS circuits:
  - $P_{total} = P_{dynamic} + P_{static}$
  - $P_{dynamic} = P_{sw} + P_{sc}$

Switching Power
- $P_{sw}$ is due to the charging and discharging (output transitions) of the capacitors driven by the circuit in response to input transitions.
- $I = C_L \, dV/dt = C_L \, \Delta V \, f$
- $P_{sw} = I \, V_{DD} = C_L \, \Delta V \, V_{DD} \, f$
- In a digital circuit $\Delta V = V_{DD}$, so $P_{sw} = C_L V_{DD}^2 f$.
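A minimal sketch of the switching power formula above. The function name and the example values (1 pF load, 1 GHz, 1.0 V and 0.5 V supplies) are illustrative assumptions, not numbers from the slides.

```python
def switching_power(c_load, v_dd, freq, delta_v=None):
    """Return P_sw = C_L * dV * V_DD * f in watts.

    delta_v defaults to V_DD (full rail-to-rail swing), which gives
    P_sw = C_L * V_DD^2 * f.
    """
    if delta_v is None:
        delta_v = v_dd
    return c_load * delta_v * v_dd * freq


# Illustrative values only: 1 pF switched at 1 GHz.
p_full = switching_power(1e-12, 1.0, 1e9)   # 1.0 V supply
p_half = switching_power(1e-12, 0.5, 1e9)   # 0.5 V supply
print(p_full, p_half / p_full)
```

Because of the quadratic dependence on the supply, halving $V_{DD}$ at a fixed frequency cuts the switching power to one quarter.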

Short Circuit Power
- $P_{sc}$ is caused by the simultaneous conduction of the PMOS and NMOS during input and output transitions.
- $P_{sc} = (\beta/12)\,(V_{DD} - 2V_{TH})^3\,(t_3 - t_1)$

Static Power: $P_{sub}$, $P_{gate}$ & $P_{junc}$
- $P_{sub}$: sub-$V_{TH}$ leakage ($V_{GS} < V_{TH}$)
  - $P_{sub} \propto \exp[(V_{GS} - V_{TH})/(m v_T)]\,V_{DD}$
- $P_{gate}$
  - Ideal MOSFET: $I_{gate} = 0$.
  - In a short-channel MOSFET, $I_{gate}$ exists because of the thin $T_{OX}$.
  - $P_{gate} \propto WL\,(V_{GS}/T_{OX})^2\,V_{DD}$
- $P_{junc}$: reverse PN junction leakage
  - $P_{junc} \propto \exp[V_D/v_T - 1]\,V_{DD}$
[6] K. M. Cao, BSIM4 Gate Leakage Model Including Source-Drain Partition, IEDM, 2000

Power Performance Relationship

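$V_{DD}$ Reduction
- Power consumption equations
  - $P_{sw} = C_L V_{DD}^2 f$
  - $P_{sc} = (\beta/12)\,(V_{DD} - 2V_{TH})^3\,(t_3 - t_1)$
  - $P_{sub} \propto \exp[(V_{GS} - V_{TH})/(m v_T)]\,V_{DD}$
  - $P_{gate} \propto WL\,(V_{GS}/T_{OX})^2\,V_{DD}$
  - $P_{junc} \propto \exp[V_D/v_T - 1]\,V_{DD}$
- Case 1: lowering $V_{DD}$ reduces every power component.
- However, $\text{Delay} \propto C_L V_{DD}/I_D \propto C_L V_{DD}/(V_{DD} - V_{TH})^{\alpha}$.
- If $V_{DD}$ decreases, the delay increases: performance loss.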
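The supply-voltage tradeoff can be sketched numerically with the delay proportionality above. The values $V_{TH}$ = 0.3 V and $\alpha$ = 1.3 and the two supply points are assumptions chosen only for illustration, and both functions return relative (unitless) quantities.

```python
def relative_delay(v_dd, v_th, alpha):
    """Delay up to a constant factor: V_DD / (V_DD - V_TH)^alpha."""
    if v_dd <= v_th:
        raise ValueError("model only applies for V_DD above V_TH")
    return v_dd / (v_dd - v_th) ** alpha


def relative_switching_power(v_dd):
    """Switching power up to a constant factor (C_L and f fixed): V_DD^2."""
    return v_dd ** 2


# Assumed example: scale the supply from 1.0 V to 0.7 V.
V_TH, ALPHA = 0.3, 1.3
power_ratio = relative_switching_power(0.7) / relative_switching_power(1.0)
delay_ratio = relative_delay(0.7, V_TH, ALPHA) / relative_delay(1.0, V_TH, ALPHA)
print(f"power x{power_ratio:.2f}, delay x{delay_ratio:.2f}")
```

With these assumed numbers the switching power roughly halves while the delay grows, which is the performance loss the slide refers to.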

$V_{DD}$ Scaling Limitation
- Low $V_{DD}$ limitation with process variation:
  - $V_{DD,min} = V_{T0} + K\,\sigma(V_T)$
  - $\sigma(V_T)$: 1-sigma of the $V_T$ variation
  - $\sigma(V_T) \propto T_{ox}\,N_A^{0.25}\,(LW)^{-0.5}$
- $\sigma(V_T)$ increases significantly with technology scaling (as $LW$ shrinks).
- $V_{DD}$ scaling therefore runs into a limit. Process-variation-tolerant circuit design techniques are required.
[7] K. Itoh, Adaptive Circuits for the 0.5-V Nanoscale CMOS Era, ISSCC, 2009
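A short sketch of how the two relations on this slide interact. All numeric values ($V_{T0}$ = 0.3 V, K = 5, a starting sigma of 30 mV) are hypothetical and only show the direction of the effect.

```python
def sigma_vt_scale(t_ox_ratio, n_a_ratio, area_ratio):
    """Factor by which sigma(V_T) changes, from
    sigma(V_T) ~ T_ox * N_A^0.25 * (L*W)^-0.5.

    Each argument is new/old for that parameter; area_ratio is for L*W.
    """
    return t_ox_ratio * n_a_ratio ** 0.25 * area_ratio ** -0.5


def vdd_min(v_t0, k, sigma_vt):
    """V_DD,min = V_T0 + K * sigma(V_T)."""
    return v_t0 + k * sigma_vt


# Hypothetical example: L and W both halve (area x0.25), T_ox and N_A unchanged.
scale = sigma_vt_scale(1.0, 1.0, 0.25)
before = vdd_min(0.3, 5, 0.030)
after = vdd_min(0.3, 5, 0.030 * scale)
print(scale, before, after)
```

Under these assumptions, halving both L and W doubles $\sigma(V_T)$ and raises the minimum usable supply, even though the nominal threshold is unchanged.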

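High $V_{TH}$
- Power consumption equations
  - $P_{sw} = C_L V_{DD}^2 f$
  - $P_{sc} = (\beta/12)\,(V_{DD} - 2V_{TH})^3\,(t_3 - t_1)$
  - $P_{sub} \propto \exp[(V_{GS} - V_{TH})/(m v_T)]\,V_{DD}$
  - $P_{gate} \propto WL\,(V_{GS}/T_{OX})^2\,V_{DD}$
  - $P_{junc} \propto \exp[V_D/v_T - 1]\,V_{DD}$
- Case 2: raising $V_{TH}$ reduces $P_{sc}$ and, especially, $P_{sub}$.
- However, $\text{Delay} \propto C_L V_{DD}/I_D \propto C_L V_{DD}/(V_{DD} - V_{TH})^{\alpha}$.
- If $V_{TH}$ increases, the delay increases: performance loss.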
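The exponential dependence of $P_{sub}$ on $V_{TH}$ can be made concrete with a small sketch. The slope factor m = 1.5 and thermal voltage $v_T$ = 26 mV are assumed typical room-temperature values, not figures from the slides.

```python
import math


def subthreshold_leakage_ratio(delta_v_th, m=1.5, v_t=0.026):
    """Leakage after raising V_TH by delta_v_th volts, relative to before.

    Follows P_sub ~ exp((V_GS - V_TH) / (m * v_T)) with V_GS and V_DD fixed,
    so the ratio is exp(-delta_v_th / (m * v_T)).
    m and v_t are assumed values.
    """
    return math.exp(-delta_v_th / (m * v_t))


for dv in (0.05, 0.10, 0.20):
    print(f"V_TH +{dv * 1000:.0f} mV -> leakage x{subthreshold_leakage_ratio(dv):.4f}")
```

Under these assumptions, a 100 mV increase in $V_{TH}$ cuts the subthreshold leakage by more than a factor of ten, which is why the slide singles out $P_{sub}$.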

Low Frequency
- Power consumption equations
  - $P_{sw} = C_L V_{DD}^2 f$
  - $P_{sc} = (\beta/12)\,(V_{DD} - 2V_{TH})^3\,(t_3 - t_1)$
  - $P_{sub} \propto \exp[(V_{GS} - V_{TH})/(m v_T)]\,V_{DD}$
  - $P_{gate} \propto WL\,(V_{GS}/T_{OX})^2\,V_{DD}$
  - $P_{junc} \propto \exp[V_D/v_T - 1]\,V_{DD}$
- Case 3: lowering $f$ reduces $P_{sw}$.
- However, throughput $\propto f$: performance loss.

Tradeoff
- There is a tradeoff between low power and high performance.
- Low power design: power reduction without performance degradation.

Low power design

Low Power Design Methodology
To make a low power SoC:
- Architecture and algorithm levels
  - Parallelism, pipeline
- Block and logic levels
  - $V_{DD}$ / frequency scheduling by monitoring workload (AVFS)
  - Temperature management to reduce leakage current
- Circuit level
  - Circuit type (dynamic, static, ...)
  - Circuit technique (dual $V_{DD}$, dual $V_{TH}$, MTCMOS, ...)
- Device level
  - Control of process parameters: halo doping, retrograde well
  - Low-leakage new devices: SOI, FinFET

Architecture and Algorithm Levels

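Parallelism
[Figures: a simple adder-comparator datapath (reference); parallel implementation]
- Reference: $P_{ref} = C_{ref} V_{ref}^2 f_{ref}$
- Parallel: $P_{par} = C_{par} V_{par}^2 f_{par}$
- $\dfrac{P_{par}}{P_{ref}} = \dfrac{C_{par}}{C_{ref}} \cdot \dfrac{f_{par}}{f_{ref}} \cdot \dfrac{V_{par}^2}{V_{ref}^2} = (N + \varepsilon) \cdot \dfrac{1}{N} \cdot \dfrac{V_{par}^2}{V_{ref}^2}$
- $N$: degree of parallelism
- $\varepsilon$: a slight increase in capacitance due to the extra routing
[8] A. P. Chandrakasan, Minimizing power consumption in digital CMOS circuits, Proc. of IEEE, 1995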
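A minimal sketch of the parallelism power ratio. Here `n` is the degree of parallelism, `eps` stands for the slight capacitance increase due to extra routing, and `v_ratio` is the reduced supply divided by the reference supply. The example values (two-way parallelism, eps = 0.15, supply scaled to 0.58 of the reference) are assumptions used only for illustration.

```python
def parallel_power_ratio(n, eps, v_ratio):
    """P_par / P_ref = (N + eps) * (1 / N) * (V_par / V_ref)^2.

    Capacitance grows by a factor (N + eps), each unit runs at f_ref / N,
    and the relaxed timing is what allows the supply to be lowered.
    """
    return (n + eps) / n * v_ratio ** 2


# Assumed example values.
print(parallel_power_ratio(n=2, eps=0.15, v_ratio=0.58))
# Without any voltage reduction, parallelism alone does not save power:
print(parallel_power_ratio(n=2, eps=0.15, v_ratio=1.0))
```

The second call shows that the saving comes entirely from the lower supply voltage; with the supply unchanged the ratio is above 1 because of the routing overhead.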

Pipeline
[Figures: a simple adder-comparator datapath (reference); pipeline implementation]
- Reference: $P_{ref} = C_{ref} V_{ref}^2 f_{ref}$
- Pipeline: $P_{pipe} = C_{pipe} V_{pipe}^2 f_{pipe}$
- $\dfrac{P_{pipe}}{P_{ref}} = \dfrac{C_{pipe}}{C_{ref}} \cdot \dfrac{f_{pipe}}{f_{ref}} \cdot \dfrac{V_{pipe}^2}{V_{ref}^2} = (1 + \varepsilon) \cdot \dfrac{V_{pipe}^2}{V_{ref}^2}$
- $N$: number of pipeline stages
- $\varepsilon$: a slight increase in capacitance due to the extra latches
[8] A. P. Chandrakasan, Minimizing power consumption in digital CMOS circuits, Proc. of IEEE, 1995
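The pipeline case can be sketched the same way. Here `eps` stands for the slight capacitance increase due to the extra latches and `v_ratio` is the reduced supply divided by the reference supply; the clock frequency is unchanged. The example values are assumptions, not data from the slides.

```python
def pipeline_power_ratio(eps, v_ratio):
    """P_pipe / P_ref = (1 + eps) * (V_pipe / V_ref)^2 at unchanged frequency.

    Splitting the logic into stages shortens the critical path per cycle,
    and that slack is spent on a lower supply voltage.
    """
    return (1 + eps) * v_ratio ** 2


# Assumed example: 15% extra capacitance, supply scaled to 0.58 of reference.
print(pipeline_power_ratio(eps=0.15, v_ratio=0.58))
```

Unlike parallelism, the hardware is not replicated, so the capacitance factor is only $(1 + \varepsilon)$.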

Circuit Level

Circuit Level Low Power Techniques
- Multiple channel length
- Stacked transistor
- Dual $V_{DD}$
- Dual $V_{TH}$
- MTCMOS (Multi-Threshold voltage CMOS)
- DVS (Dynamic Voltage Scaling): open loop / closed loop

Critical Path
- Critical path: the worst-case delay path. It determines the SoC's maximum performance.
- The number of critical paths is much smaller than the number of non-critical paths.
- A fast non-critical path is simply wasteful.
- By increasing the delay of non-critical paths, power can be reduced, because of the tradeoff between power and performance.

Multiple Channel Length
- Threshold voltage roll-off: a longer L gives a higher Vt.
- Low leakage, at the cost of lower performance.
- Used in non-critical paths.

Stacked Transistor
- $V_M$ level: $V_M > 0$ due to leakage current.
  - Negative $V_{GS\_MN1}$
  - Positive $V_{SB\_MN1}$: increase in $V_{TH}$ by the body effect
- $P_{sub} \propto e^{(V_{GS} - V_{TH})/(m v_T)}\,V_{DD}$, so both effects give a large reduction in $I_{sub}$.
- Primary input vector control is used to exploit the stack effect in standby mode.

Dual $V_{DD}$
- Basic idea
  - $V_{DDL}$: logic gates off the critical path
  - $V_{DDH}$: logic gates on the critical path
- Reduces power without degrading performance.
[Figure: shaded gates use VDDL; non-shaded gates use VDDH]

Dual $V_{DD}$: Design Issue & Target
- Issue
  - Static current flows in a $V_{DDH}$ gate if it is driven directly by a $V_{DDL}$ gate ($V_{SG} > 0$ causes static current).
  - A level converter is needed, which adds area and power overhead.
- Design target
  - For a given circuit, choose the gates that run on $V_{DDL}$ so as to minimize power consumption while maintaining performance, taking the level converters into account.

Dual $V_{TH}$ Voltages
- HVt
  - Assigned to transistors in non-critical paths.
  - Leakage saving in both standby and active modes.
- LVt
  - Assigned to transistors in critical paths.
  - Performance is maintained.

MTCMOS: Basic
- MTCMOS: Multiple Threshold voltage CMOS
- Low power & low energy
  - $E_{TOT} = E_{STD} + E_{ACT} = P_{static} \cdot t_{STD} + P_{dynamic} \cdot t_{ACT}$
  - Portable device: $t_{STD} \gg t_{ACT}$
- Basic circuit scheme
  - Two different Vt: HVt (0.5~0.6 V) and LVt (0.2~0.3 V)
  - Two operating modes: active and standby
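A minimal sketch of the energy budget above for a device that spends most of its time in standby. The power and time values are hypothetical; they only show why reducing the standby leakage matters when the standby time dominates.

```python
def total_energy(p_static, t_standby, p_dynamic, t_active):
    """E_TOT = P_static * t_STD + P_dynamic * t_ACT (joules)."""
    return p_static * t_standby + p_dynamic * t_active


# Hypothetical duty cycle: 1 s active, 99 s standby, 100 mW while active.
T_ACT, T_STD, P_DYN = 1.0, 99.0, 100e-3

e_no_gating = total_energy(1e-3, T_STD, P_DYN, T_ACT)   # 1 mW standby leakage
e_gated = total_energy(1e-5, T_STD, P_DYN, T_ACT)       # 10 uW standby leakage
print(e_no_gating, e_gated)
```

With these assumed numbers the standby term is about as large as the active term until the leakage is cut, after which the total is set almost entirely by the active energy.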

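MTCMOS: Scheme
- Active mode
  - SL = 1 / $\overline{SL}$ = 0
  - $V_{DDV} \approx V_{DD}$, $V_{GNDV} \approx V_{GND}$
  - The LVt logic sets the operating frequency.
- Standby mode
  - SL = 0 / $\overline{SL}$ = 1
  - $V_{DDV}$ and $V_{GNDV}$ are floating.
  - The HVt switches set the leakage.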

MTCMOS: Constraint
- Performance constraint depends on:
  - Normalized foot/head switch size: $W_H/W_L$
  - Normalized capacitance on VDDV/VGNDV: $C_V/C_O$
- Area penalty
  - Relatively small, because the head/foot switches are shared by all logic gates on a chip (global foot switch).

DVFS: Basic Concept
- $P_{dynamic} = C V_{DD}^2 f$: $V_{DD}$ and frequency are scaled simultaneously.
- $V_{DD}$ scaling
  - The most effective way to lower $P_{dynamic}$, because $P_{dynamic} \propto V_{DD}^2$.
- Frequency scaling
  - Operating frequency = throughput.
  - Not every task requires maximum throughput.
  - By controlling the frequency, the SoC improves its energy efficiency.
[10] T. Burd, A Dynamic Voltage Scaled Microprocessor System, JSSC, 2000
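The idea that not every task needs maximum throughput can be sketched as picking the slowest operating point that still meets a deadline. The operating-point table, the effective capacitance, and the function names are all hypothetical; a real DVFS controller works from characterized voltage-frequency pairs.

```python
# Hypothetical (frequency in Hz, V_DD in volts) operating points.
OPERATING_POINTS = [(200e6, 0.8), (400e6, 1.0), (600e6, 1.2)]


def pick_operating_point(cycles, deadline_s, points=OPERATING_POINTS):
    """Return the lowest-frequency point that finishes `cycles` in time.

    Falls back to the fastest point if no point meets the deadline.
    """
    for freq, v_dd in sorted(points):
        if cycles / freq <= deadline_s:
            return freq, v_dd
    return max(points)


def dynamic_energy(cycles, v_dd, c_eff=1e-9):
    """Energy of the task: P_dynamic * time = C * V_DD^2 * f * (cycles / f)."""
    return c_eff * v_dd ** 2 * cycles


freq, v_dd = pick_operating_point(cycles=1e6, deadline_s=0.01)
saving = dynamic_energy(1e6, v_dd) / dynamic_energy(1e6, 1.2)
print(freq, v_dd, saving)
```

Note that the frequency cancels out of the task energy: lowering the frequency alone only stretches the task, and the energy saving comes from the lower $V_{DD}$ that the slower clock permits.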

DVFS: Open Loop vs. Closed Loop
- Open loop system
  - Cannot adapt to PVT variations.
  - Needs more design margin.
  - Example: Enhanced SpeedStep technology of Intel
- Closed loop system
  - Can adapt to PVT variations.
  - Needs less design margin.
  - Examples: Intelligent Energy Management technology of ARM; SmartReflex2 of the TI OMAP processor
[11] Enhanced SpeedStep technology, Intel

DVFS (SONY, PDA)
[Figure: block diagram of a closed loop system]
[12] M. Nakai, Dynamic Voltage and Frequency Management for a Low-Power Embedded Microprocessor, JSSC, 2005

Delay Synthesizer Structure
- Composed of not only a simple transistor delay factor, but also wire delay and rise/fall delay.
- Gate delay component: one with nominal gate length and another with long gate length.
- RC delay component: wires from each of the four metal layers, with a total length of 14 mm.

Delay Synthesizer Effect

Operation (DVC+DFC)
- Operation procedure
  - Low to high: the main logic clock frequency is changed after the DVC confirms that the voltage has increased enough.
  - High to low: both the DVC reference clock and the system clock are changed simultaneously.
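The ordering rule in this procedure can be sketched as a small sequencing function. The step strings and the function name are illustrative only and do not correspond to any register interface in the referenced design.

```python
def transition_steps(f_now_hz, f_target_hz):
    """Return the ordered steps for a frequency change.

    Speeding up: the voltage must be high enough before the clock is
    raised, otherwise the logic could miss timing.
    Slowing down: the lower frequency is safe at the current voltage,
    so the DVC reference clock and the system clock change together.
    """
    if f_target_hz > f_now_hz:
        return [
            "request higher voltage from DVC",
            "wait until DVC confirms the voltage has risen enough",
            "raise main logic clock frequency",
        ]
    if f_target_hz < f_now_hz:
        return ["lower DVC reference clock and system clock together"]
    return []


for step in transition_steps(100e6, 200e6):
    print(step)
```

The asymmetry exists because running a fast clock at a low voltage is unsafe, while running a slow clock at a high voltage only wastes some power for a short time.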

Device Level

Device Level Low Power Technique: FinFET
- FinFET: vertical structure. The planar MOSFET width corresponds to the FinFET height.
- $\sigma(V_T) \propto T_{ox}\,N_A^{0.25}\,(LW)^{-0.5}$
- As scaling goes on, the variation of the planar MOSFET gets worse, so $V_{DD}$ scaling becomes impossible.
- However, the FinFET's $\sigma(V_T)$ does not degrade:
  - The FinFET width does not occupy active area.
  - As scaling goes on, $L \cdot W$ of the FinFET can be maintained.
  - $V_{DD}$ scaling remains possible, which enables low power.
[7] K. Itoh, Adaptive Circuits for the 0.5-V Nanoscale CMOS Era, ISSCC, 2009

OMAP Processor

OMAP Processor
- OMAP processor
  - Dual core platform
  - Multimedia hardware accelerators for video and graphics
  - Frame buffers
  - Various dedicated and general purpose interfaces
- Power saving modes
  - Idle (clock stopped)
  - Retention for low leakage
  - Fast re-start and power-off mode
- Power gating technique
ISSCC 2005, pp. 138-139

Power Domains
- 5 power domains
  - Processor core 1
  - Processor core 2
  - Hardware accelerator (graphics)
  - Always on
  - Rest of the chip (including the interconnects and various peripherals)
ISSCC 2005, pp. 138-139

Power Gating
- Power gating
  - A global mesh built with the highest metal layer distributes power and ground across the chip.
  - The local mesh is broken up to reflect the power domain partitioning.
  - Power switches connect the global mesh to the local mesh according to the operating mode and the switch control.
  - If a power domain is on, its power switches connect its local plane to the global plane, i.e., the constant power supply. Otherwise that plane drifts to a potential near ground.
- Power switch
  - Embedded in power domains, either by placing power switches at a regular pitch in a staggered manner or by placing power switches around hard IPs.
  - Header switch: 90 µm PMOS with 200 µA current driving capability in the worst case.
  - Multiple fingers and redundant vias.
ISSCC 2005, pp. 138-139

Embedded Power Domains
- Other power management cells
  - Retention flip-flops
  - Constantly powered buffers to transport critical signals through a power domain that may be off
  - Isolation cells to prevent the propagation of a non-state
ISSCC 2005, pp. 138-139

Power Switching Control
- Problem: current surges and dynamic IR drop
- Two-pass turn-on mechanism
  - A weak PMOS that sinks a low current for power restore is turned on first.
  - A strong PMOS that delivers the current for normal operation is turned on next.
ISSCC 2005, pp. 138-139
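Why the two-pass sequence limits the current surge can be illustrated with a first-order RC model: the power domain is treated as a single capacitor charged from the supply through the switch on-resistances. This model and all component values are assumptions for illustration, not data from the referenced design.

```python
import math


def peak_current_single(v_dd, r_strong):
    """Peak inrush if only the strong switch closes onto a discharged domain."""
    return v_dd / r_strong


def peak_current_two_pass(v_dd, r_weak, r_strong, c_domain, t_weak):
    """Peak inrush if the weak switch closes first and the strong switch
    joins it (in parallel) after t_weak seconds."""
    i_at_start = v_dd / r_weak
    # Local supply voltage reached through the weak switch alone.
    v_local = v_dd * (1.0 - math.exp(-t_weak / (r_weak * c_domain)))
    r_both = r_weak * r_strong / (r_weak + r_strong)
    i_at_handover = (v_dd - v_local) / r_both
    return max(i_at_start, i_at_handover)


# Assumed values: 1.0 V supply, 100 ohm weak and 1 ohm strong switch,
# 10 nF domain capacitance, strong switch enabled after 3 time constants.
single = peak_current_single(1.0, 1.0)
two_pass = peak_current_two_pass(1.0, 100.0, 1.0, 10e-9, 3e-6)
print(single, two_pass)
```

In this model the weak switch pre-charges the local mesh slowly, so only a small voltage difference remains when the strong switch closes and the peak current is far lower than with a single-step turn-on.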

Current Surge and Power Restore
ISSCC 2005, pp. 138-139

Leakage Current Reduction
- In off mode, the leakage current comes from the power switches and the power management cells.
- 4 power switches per Kgate
- ~40x leakage reduction
ISSCC 2005, pp. 138-139

SRAM Retention
- Footer and header diodes
  - In active mode, the diodes are bypassed.
  - During retention mode, one diode is enabled and the field across the array is reduced.
- Reverse body bias
- Leakage saving (2x)
ISSCC 2005, pp. 138-139

Dual Gate Length
- Dual gate length
  - Standby mode: 30% leakage reduction.
  - Active mode: active leakage current saving, which is very useful if many blocks are idle in active mode.
- Vdd scaling during the slow active mode
  - 300 mV scaling: 2x leakage reduction.
ISSCC 2005, pp. 138-139

Summary

Summary
- Green SoC design: low power & process variation tolerant SoC design
- $P = P_{sw} + P_{sc} + P_{sub} + P_{gate} + P_{junc}$, where $P_{dynamic} = P_{sw} + P_{sc}$ and $P_{static} = P_{sub} + P_{gate} + P_{junc}$
- Power and performance: trade-off
- Low power design
  - Architecture and algorithm level: parallelism, pipeline
  - Block and logic level: workload monitoring, $V_{DD}$ / frequency scheduling
  - Circuit level
    - Long channel: reduce $I_{leak}$ by using $V_{TH}$ roll-off (higher $V_{TH}$)
    - Stacked MOSFET: reduce $I_{leak}$ by using the body effect (higher $V_{TH}$) & negative $V_{GS}$
    - Dual $V_{DD}$: use low $V_{DD}$ on non-critical paths
    - Dual $V_{TH}$: use high $V_{TH}$ on non-critical paths
    - MTCMOS: use high-$V_{TH}$ sleep transistors (low leakage in standby mode) & low-$V_{TH}$ logic (high performance in active mode)
    - DVFS: reduce dynamic power by controlling both $V_{DD}$ & frequency
  - Device level: FinFET