Leakage Power Minimization in Deep-Submicron CMOS circuits

Politecnico di Torino, Dip. di Automatica e Informatica
1019 Torino, Italy
enrico.macii@polito.it

Outline
- Introduction.
- Design for low leakage: basics; existing approaches.
- Sleep Transistor Insertion: principle; automated STI; methodology; preliminary results; extensions.
- Conclusions.

Electronic Technology Today: Convergence
- CMOS technology dominates in modern ICs.
[Figure: timeline from the 1960s to the 2000s showing products (watch chip, calculator, SRAM, DRAM, microprocessor, FLASH, server/mainframe) and the technologies used to build them (PMOS, bipolar, ECL, and others).]

Power Dissipation in CMOS Circuits
- Power dissipation of a gate: $P = P_{SW} + P_{SC} + P_{Leak}$
  - $P_{SW}$ = switching (or dynamic) power.
  - $P_{SC}$ = short-circuit power.
  - $P_{Leak}$ = leakage (or stand-by) power.
- In older technologies (0.um and above), $P_{Leak}$ was marginal w.r.t. switching power, so switching power minimization was the primary objective.
- In deep sub-micron processes, $P_{Leak}$ becomes critical.

Leakage vs. Dynamic Power in Current Circuits
- Leakage power becomes comparable to dynamic power as technology scales.
- Example: ASICs [source: STMicroelectronics].
[Figure: power density (Watts/cm²) versus technology node, leakage power compared with dynamic power.]
- Example: Microprocessors [source: Intel].
  - Itanium : 180nm, 1.V, 1.0GHz, 1MTx (core+cache)
  - Itanium : 130nm, 1.V, 1.GHz, 10MTx (core+cache)
[Figure: share of leakage power, I/O power and dynamic power (0% to 100%) for the two Itanium processors.]

Power Dissipation Due to Leakage
- CMOS inverter:
[Figure: inverter schematic showing V_DD, V_IN, V_OUT, the PMOS device, the load C_L, and the leakage currents I_sub and I_gate.]

Power Dissipation Due to Leakage (Cont.)
- Leakage power of a gate: $P_{Leak} = V_{DD} \cdot I_L$
  - $V_{DD}$ = supply voltage.
  - $I_L$ = leakage current.
- Leakage current $I_L$ consists of two major contributions: $I_L = I_{sub} + I_{gate}$
  - $I_{sub}$ = sub-threshold current caused by low threshold voltage.
  - $I_{gate}$ = gate current caused by reduced thickness of gate oxide.
- $I_{sub}$ dominates, but grows by X per generation.
- $I_{gate}$ is less relevant, but grows much faster (00X per generation).

Low-Leakage Design
- Leakage power minimization is a design problem (and not just a technology/process problem).
- For memory macros: optimization based on ad-hoc solutions (cell optimization).
- For cell-based logic:
  - Optimization requires design automation.
  - Integration with existing tools (both at logic and physical level) is mandatory.
- Different solutions have been proposed for both sub-threshold and gate leakage.

Existing Approaches to Low-Leakage Design
- Sub-threshold leakage:
  - Variable-threshold (VT).
  - Dual-threshold (DT).
  - Multi-threshold (MT).
  - Sleep transistor insertion (STI).
  - Multi-voltage (MV).
  - Body biasing (reverse: RBB; forward: FBB).
  - State assignment.
- Gate leakage:
  - Boosted gate MOS (BGMOS).
  - P-type Domino.
  - Pin reordering and state assignment.

DT
- Low-threshold cells are 1-0% faster than high-threshold cells, but have 10x higher leakage.
- Libraries containing high-V_Th and low-V_Th gates do exist.
- Use low-V_Th gates for critical paths, high-V_Th cells for the rest.
- Approach:
  - Synthesize and map the design onto all high-V_Th cells (minimum leakage implementation).
  - Replace high-V_Th cells on the critical path with low-V_Th cells to meet timing constraints.
- The leakage power increase required to meet timing constraints may vary from 0% to 00%.

MT
- Use multi-threshold cells with capability of operating at:
  - Low-V_Th, when in active mode;
  - High-V_Th, when in stand-by mode.
- Leakage power control is obtained thanks to two effects:
  - Transistor stacking.
  - Low sub-threshold leakage current of high-V_Th transistors.

MT (Cont.)
- Principle of MT: insertion of high-V_Th transistors in series to the pull-up and pull-down networks in order to reduce the sub-threshold leakage current while maintaining high-speed operation in active mode.
[Figure: logic gates with Sleep-controlled transistors between virtual GND and GND.]
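The dual-threshold (DT) approach from the earlier slide (map everything onto high-V_Th cells, then replace cells on the critical path with low-V_Th cells until timing is met) can be sketched as a greedy loop. This is only an illustration under stated assumptions: the `Cell` record, the delay and leakage numbers, and the rule of swapping the critical-path cell with the largest delay gain are all hypothetical and do not describe any particular synthesis tool.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    fanin: list        # names of driving cells (empty if fed by primary inputs)
    delay_high: int    # delay of the high-V_Th variant (hypothetical units)
    delay_low: int     # delay of the low-V_Th variant
    leak_high: int     # leakage of the high-V_Th variant (hypothetical units)
    leak_low: int      # leakage of the low-V_Th variant
    low_vth: bool = False

    @property
    def delay(self):
        return self.delay_low if self.low_vth else self.delay_high

    @property
    def leakage(self):
        return self.leak_low if self.low_vth else self.leak_high


def critical_path(cells):
    """Return (worst arrival time, cell names on that path).

    Assumes `cells` is a dict whose insertion order is a topological order.
    """
    arrival, pred = {}, {}
    for name, cell in cells.items():
        latest = max(cell.fanin, key=arrival.__getitem__, default=None)
        arrival[name] = (arrival[latest] if latest is not None else 0) + cell.delay
        pred[name] = latest
    node = max(arrival, key=arrival.get)
    worst, path = arrival[node], []
    while node is not None:
        path.append(node)
        node = pred[node]
    return worst, path[::-1]


def assign_dual_vth(cells, t_max):
    """Start from all high-V_Th cells; swap critical-path cells until timing is met."""
    for cell in cells.values():
        cell.low_vth = False                      # minimum-leakage starting point
    while True:
        worst, path = critical_path(cells)
        if worst <= t_max:
            return True                           # timing constraint met
        candidates = [cells[n] for n in path if not cells[n].low_vth]
        if not candidates:
            return False                          # cannot meet timing by swapping
        best = max(candidates, key=lambda c: c.delay_high - c.delay_low)
        best.low_vth = True


def total_leakage(cells):
    return sum(cell.leakage for cell in cells.values())


def demo_cells():
    # Hypothetical four-cell netlist: a drives b and c, which both drive d.
    return {
        "a": Cell("a", [], 100, 80, 1, 10),
        "b": Cell("b", ["a"], 100, 80, 1, 10),
        "c": Cell("c", ["a"], 50, 40, 1, 10),
        "d": Cell("d", ["b", "c"], 100, 80, 1, 10),
    }


cells = demo_cells()
met = assign_dual_vth(cells, t_max=270)
low_vth_cells = sorted(n for n, c in cells.items() if c.low_vth)
```

In this toy netlist only the cells on the longest path are swapped, so the leakage penalty is paid only where the timing constraint requires it; cell `c`, which is off the critical path, stays high-V_Th.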

MT (Cont.)
- Limitations of MT:
  - Impact on area: each cell includes two extra transistors for low-leakage stand-by operation, a significant cost in terms of area. The PMOS transistor is normally much larger than the NMOS (e.g., form factor ~ 0).
  - Need of huge buffering circuitry.
  - Process modifications for supporting the implementation of high-V_Th transistors.
  - Impact on performance: slow-down of power-gated logic cells when the circuit is active; re-activation delay for re-enabling a set of powered-down cells.

STI
- Modify the MT approach by:
  - Using the same sleep transistors to control blocks of higher complexity.
  - Avoiding the PMOS sleep transistor.
[Figure: a block of N cells connected to a virtual GND rail, with a Sleep-controlled transistor between virtual GND and GND.]

STI (Cont.)
- Further modification: use low-V_Th sleep transistors.
- Consequences:
  - All devices are fabricated using the same process.
  - Sub-threshold leakage power is reduced by the transistor stacking effect only (a smaller, but still significant reduction).
- Example:
[Figure: low-V_TH gated logic connected to a virtual ground (Vgnd) through a leakage-control cell driven by the SLEEP signal.]

Automated STI
- Issues:
  - Granularity of STI insertion.
    - Large blocks: size of sleep transistors and driving strengths of sleep signals.
    - Small blocks: number of sleep transistors and size of control logic.
  - Design of sleep transistor cells.
    - Different sizes and driving strengths.
    - Must be compliant with the cells in the library.
    - Area and delay control.
  - Selection of gates to which STI should be applied: requires layout information.
  - Generation of sleep signals: area, timing and power overhead.

- Post-layout STI for combinational circuits.
- STI is performed on a row-by-row basis.
- Sleep transistors are added at the boundaries of each row and are connected to a common virtual ground.
- Assumptions:
  - All cells in the circuit can potentially be controlled by sleep transistors.
  - Only one control signal is used to drive all the sleep transistors, and it is available from some external module (e.g., a microprocessor).
- Design and characterization of a library of sleep transistor cells.
- Flow:
[Figure: flow diagram with the blocks Placed Row, Area Constraint, Calculate Cluster of Gates, Delay Constraint, Size and Insert Sleep Transistor, Update Layout, and Sleep Transistor Library.]
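The per-row step of this flow can be sketched as follows. It is a minimal sketch, assuming that the area constraint is used to pick the largest sleep transistor that fits in the row, and that the delay constraint is expressed as a current budget plus a re-activation delay for that transistor. The record types, field names and numbers are hypothetical, and using a single `arrival_time` value both to order the cells and to check re-activation is a simplification.

```python
from dataclasses import dataclass


@dataclass
class SleepTransistor:
    name: str
    area: float           # cell area (hypothetical units)
    max_current: float    # current it can sustain within the tolerated slow-down
    wakeup_delay: float   # re-activation delay of the sleep transistor


@dataclass
class Cell:
    name: str
    current: float        # current drawn in active mode (hypothetical units)
    arrival_time: float   # arrival time of the longest timing path at the cell


def gate_row(cells, free_area, area_overhead, library):
    """Choose a sleep transistor for one placed row and the cells it will gate."""
    area_budget = free_area + area_overhead
    fitting = [s for s in library if s.area <= area_budget]
    if not fitting:
        return None, []                       # nothing fits: row is left ungated
    sleep = max(fitting, key=lambda s: s.area)

    remaining = sleep.max_current
    cluster = []
    # Walk from the cell with the longest timing path back towards the inputs.
    for cell in sorted(cells, key=lambda c: c.arrival_time, reverse=True):
        if cell.current > remaining:
            break                             # current budget exhausted
        if cell.arrival_time < sleep.wakeup_delay:
            break                             # re-activation time would be violated
        cluster.append(cell.name)
        remaining -= cell.current
    return sleep, cluster


# Hypothetical sleep transistor library and placed row.
library = [
    SleepTransistor("S", area=4, max_current=10, wakeup_delay=1.0),
    SleepTransistor("M", area=8, max_current=20, wakeup_delay=2.0),
    SleepTransistor("L", area=16, max_current=40, wakeup_delay=3.0),
]
row = [
    Cell("g1", current=8, arrival_time=9),
    Cell("g2", current=7, arrival_time=7),
    Cell("g3", current=6, arrival_time=5),
    Cell("g4", current=2, arrival_time=1),
]
sleep, cluster = gate_row(row, free_area=6, area_overhead=3, library=library)
```

With these numbers the area budget of 9 admits the medium transistor, and the cluster stops growing once the next cell would exceed its current budget. After each row is processed, the layout would be updated with the chosen sleep transistor cell.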

Controlling area penalty.
- Use part of the area of empty regions between cells according to the tolerated congestion overhead (compaction).
- Resize the floorplan according to the tolerated area overhead.
- For each row, consider the largest sleep transistor that can be inserted (free area + allowed overhead).
[Figure: a placed row showing the free area, and the free area plus the area overhead.]

Controlling delay penalty.
- The maximum sustainable current of each sleep transistor is computed according to the tolerated slow-down in active mode.
- The cell selection process performs a gate-by-gate exploration of each row, starting from the cell with the longest timing path and going back towards the primary inputs.
- The re-activation time penalty is traded (or nullified) by preventing the power gating of some of the cells whose arrival times are shorter than the re-activation delay of the sleep transistors.
- For each row, the process stops when the current budget is exhausted or the re-activation time constraint is violated.

Experimental set-up:
- Six benchmarks (from 1900 to 600 standard cells).
- Circuits synthesized onto 0.1um technology library from STMicroelectronics.
- Sleep transistor cells chosen so as to guarantee a total performance degradation below % in active mode.
- Tolerated area overhead set to %.

Results:
- Leakage power reductions around 80%.
- Total power savings, accounting for cell dynamic and internal power, are around 19%.

Benchmark   Original              Optimized             Δ
Block1      0.11   0.9    0.0     0.0    0.     0.      78.9   -9.0    1.0
Block2      0.19   0.     0.1     0.0    0.     0.8     80.0   -10.1   1.0
Block3      0.16   0.1    0.7     0.0    0.     0.7     7.6    -8.8    1.
Block4      0.6    0.60   0.86    0.0    0.6    0.68    8.7    -.0     18.6
Block5      0.1    0.9    0.1     0.0    0.     0.      78.9   -9.7    1.
Block6      0.6    0.88   1.      0.09   0.98   1.07    8.     -1.     0.1
Avg.                                                    79.7   -9.6    18.9

Results (cont.):
- Area overhead around .% and circuit delay increase of %.

Benchmark   Gates   Sleep Cells   Area_Orig [µm²]   Area_Opt [µm²]   Δ
Block1      18      1             691               6679             .9
Block2      1916    1             610               66710            .
Block3      1                     60                660              .
Block4      67      1             61                66               1.7
Block5      0       6             6918              6819             .
Block6      61      0             7098              7170             .0

Extensions to the post-layout STI approach:
- Handle sequential circuits:
  - Problem of state retention in sleep mode.
  - Need to design low-leakage flip-flops (based on the concept of "Balloon Circuit").
- Automatic extraction of logic conditions for sleep:
  - Can reuse idle conditions from clock gating.
  - Exploit ODC-driven clock gating approach.
  - Minimum overhead (shared logic with gated clock circuitry).

Preliminary results on the design of a low-leakage storage element (flip-flop with Balloon circuit):
- Leakage current in stand-by mode:
  - Regular flip-flop: 116nA.
  - Low-leakage flip-flop: 10nA.
- Marginal delay increase.

Conclusions
- Leakage accounts for around -10% of power budget at 180nm; this grows to 0-% at 130nm and to -0% at 90nm.
- Leakage power minimization must be addressed from the design standpoint, not just at the technology/process level.
- Several low-leakage design approaches have been introduced recently.
- STI is promising, although it requires significant methodology and tool infrastructure support.
- Preliminary results are currently under evaluation.