Low-Power CMOS VLSI Design

Lan-Da Van (范倫達), Ph.D.
Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C.
Fall 2017
ldvan@cs.nctu.edu.tw
http://www.cs.nctu.tw/~ldvan/

Outline
- Introduction
- Low-Power Process-Level Design (not covered here)
- Low-Power Logic/Circuit-Level Design
- Low-Power Algorithm/Architecture-Level Design
- Low-Power System-Level Design
- Conclusion
- References
VLSI-DSP-6-2

Low Power Design: An Ongoing and Important Discipline
- Historical figures of merit for VLSI design:
  - Performance (circuit speed and system quality)
  - Chip area (circuit cost)
- Power dissipation is now also an important metric in VLSI design.
- There is no single major source of power savings across all design levels; a new way of thinking is required.
- Companies lack a basic power-conscious culture, and designers need to be educated in this respect.
- Overall goal: reduce power dissipation while maintaining an adequate throughput rate.

Motivation: Microprocessor

Low Power: Competitive Reasons
- Battery-powered systems
  - Extend battery life
  - Reduce weight and size
- High-performance systems
  - Cost
    - Package (chip carrier, heat sink, card slots, ...)
    - Power systems (supplies, distribution, regulators, ...)
    - Fans (noise, power, reliability, area, ...)
    - Operating cost to customer
    - Re-start issue
  - Reliability
    - Failure rate increases by 4X for T = 110 °C vs. 70 °C
  - Size and weight

The Power Crisis: Portability
- PDA, cellular phone, notebook computer, etc.
- Expected battery lifetime increase over the next 5 years: 30-40%

A Multimedia Terminal: The InfoPad
- Present-day battery technology (year 1990): 20 lbs for 10 hrs

VLSI Signal Processing System Design Space
- Design metrics: cost, performance, test, power, area
- Design levels: system, algorithm, architecture, logic, circuit, process

Low Power System Design Space
- System: power budgeting, S/H partitioning, power management, core selection
- Algorithm: algorithmic reduction, data transformation, CSE, low-complexity operation
- Architecture: parallelism, pipelining, re-timing, unfolding, signal ordering, glitch minimization, data representation, resource allocation, multi-clock
- Logic/Circuit: logic style, arithmetic, glitch/noise minimization, re-sizing, adaptive voltage scaling, multi-Vdd, multi-Vth, multi-clock, layout, power-driven P&R
- Process: low-power device, alternative technology, multi-Vth

Where Does Power Go in CMOS?
Sources of power dissipation: $P = P_{dynamic} + P_{short\text{-}circuit} + P_{leakage} + P_{static}$
Definitions:
- Dynamic/switching power: $P = \alpha C V^2 f$
  - Charging and discharging of parasitic capacitors
  - $\alpha$: switching activity factor
- Short-circuit power: $P = I_{sc} V$
  - Direct path between the supply rails during switching
- Leakage power: $P = I_{leakage} V$
  - Reverse-bias diode leakage
  - Sub-threshold conduction
- Static power: $P = I_{static} V$
  - Each input node is connected to a fixed, stable voltage
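As a rough illustration of the decomposition above, the following sketch adds up the four components for a single operating point. The function name `power_breakdown` and every numeric value are assumptions made up for illustration; none of them come from the slides.

```python
def power_breakdown(alpha, c, v, f, i_sc, i_leakage, i_static):
    """Return each CMOS power component (in watts) and their total.

    alpha: switching activity factor, c: switched capacitance (F),
    v: supply voltage (V), f: clock frequency (Hz),
    i_sc / i_leakage / i_static: average currents (A).
    """
    parts = {
        "dynamic": alpha * c * v**2 * f,  # charging/discharging capacitors
        "short_circuit": i_sc * v,        # direct rail-to-rail path while switching
        "leakage": i_leakage * v,         # diode leakage + sub-threshold conduction
        "static": i_static * v,           # steady-state current draw
    }
    parts["total"] = sum(parts.values())
    return parts


# Hypothetical operating point (illustrative values only).
example = power_breakdown(alpha=0.25, c=1e-9, v=2.0, f=100e6,
                          i_sc=5e-3, i_leakage=1e-4, i_static=0.0)
for name, watts in example.items():
    print(f"{name:>13}: {watts * 1e3:.3f} mW")
```

With numbers like these the dynamic term dominates, which is why the following slides concentrate on switching activity, capacitance, supply voltage, and frequency.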

Dynamic Power Consumption (1/2)
Power = Energy/transition × transition rate
$P = C_L V_{dd}^2 f_{0\to1} = C_L V_{dd}^2 \, Pb_{0\to1} \, f = C_{EFF} V_{dd}^2 f$
$C_{EFF}$ = effective capacitance $= C_L \cdot Pb_{0\to1}$
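A minimal sketch of the relation above, assuming a single node with a hypothetical 50 fF load, a 0->1 transition probability of 3/16, and a 100 MHz clock (all values chosen for illustration). It evaluates the effective capacitance and shows the quadratic dependence on the supply voltage.

```python
def effective_capacitance(c_load, p_0_to_1):
    """C_EFF = C_L * Pb(0->1)."""
    return c_load * p_0_to_1


def dynamic_power(c_load, vdd, f, p_0_to_1):
    """P = C_EFF * Vdd^2 * f."""
    return effective_capacitance(c_load, p_0_to_1) * vdd**2 * f


# Hypothetical node: same load and activity, two different supply voltages.
p_at_5v = dynamic_power(50e-15, 5.0, 100e6, 3 / 16)
p_at_2v5 = dynamic_power(50e-15, 2.5, 100e6, 3 / 16)

print(f"P at 5.0 V: {p_at_5v * 1e6:.3f} uW")
print(f"P at 2.5 V: {p_at_2v5 * 1e6:.3f} uW")
print(f"ratio     : {p_at_2v5 / p_at_5v:.2f}")  # halving Vdd quarters the power
```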

Dynamic Power Consumption (2/2)
- For low-power design, reduce $Pb_{0\to1}$, $C_L$, $V_{dd}$, and $f$:
  - Reduce the transition probability $P_{0\to1}$
  - Minimize the geometry and remove redundancy
  - Reduce the power supply level
  - Use the lowest clock frequency
- Power dissipation is a data-dependent function of switching activity, i.e., it is pattern dependent.

Choice of Logic Style
- The power-delay product improves as the voltage decreases.
- The best logic style minimizes the power-delay product (i.e., energy) for a given delay constraint.

Type of Logic Function: NOR
Example: static-style 2-input NOR gate
Truth table of 2-input NOR gate:
  A  B  Out
  0  0  1
  0  1  0
  1  0  0
  1  1  0
Assume: $P(A=1) = 1/2$, $P(B=1) = 1/2$
Then: $P(Out=1) = 1/4$
$P(0 \to 1) = P(Out=0) \cdot P(Out=1) = 3/4 \cdot 1/4 = 3/16$
$\alpha_{0\to1} = 3/16$

2-Input NOR Gate Transition Probability
$P_1 = (1-P_A)(1-P_B)$
$P_{0\to1} = P_0 P_1 = \bigl(1-(1-P_A)(1-P_B)\bigr)(1-P_A)(1-P_B)$
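The NOR expression can be checked numerically. The sketch below (the function name is chosen here for illustration) evaluates it with exact fractions, assuming independent inputs as the slide does.

```python
from fractions import Fraction


def nor2_transition_probability(p_a, p_b):
    """P(0->1) of a static 2-input NOR gate with independent inputs."""
    p1 = (1 - p_a) * (1 - p_b)  # output is 1 only when both inputs are 0
    p0 = 1 - p1
    return p0 * p1


half = Fraction(1, 2)
print(nor2_transition_probability(half, half))  # 3/16

# Sweep one input probability to see how the activity depends on the input statistics.
for tenths in range(0, 11, 2):
    p_a = Fraction(tenths, 10)
    print(p_a, nor2_transition_probability(p_a, half))
```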

Type of Logic Function: XOR
Example: static-style 2-input XOR gate
Truth table of 2-input XOR gate:
  A  B  Out
  0  0  0
  0  1  1
  1  0  1
  1  1  0
Assume: $P(A=1) = 1/2$, $P(B=1) = 1/2$
Then: $P(Out=1) = 1/2$
$P(0 \to 1) = P(Out=0) \cdot P(Out=1) = 1/2 \cdot 1/2 = 1/4$
$\alpha_{0\to1} = 1/4$

2-Input XOR Gate Transition Probability
$P_1 = P_A(1-P_B) + P_B(1-P_A) = P_A + P_B - 2 P_A P_B$
$P_{0\to1} = P_0 P_1 = \bigl(1-(P_A + P_B - 2 P_A P_B)\bigr)(P_A + P_B - 2 P_A P_B)$
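The same kind of result can be obtained for any small gate by enumerating its truth table. The sketch below is one way to do that, assuming independent inputs; the helper names `p_out_one` and `p_transition` are illustrative and not from the slides.

```python
from fractions import Fraction
from itertools import product


def p_out_one(gate, input_probs):
    """P(Out=1): sum the probability of every input pattern that drives the output to 1."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(input_probs)):
        if gate(*bits):
            weight = Fraction(1)
            for bit, p in zip(bits, input_probs):
                weight *= p if bit else 1 - p
            total += weight
    return total


def p_transition(gate, input_probs):
    """P(0->1) = P(Out=0) * P(Out=1) for a static gate."""
    p1 = p_out_one(gate, input_probs)
    return (1 - p1) * p1


def xor2(a, b):
    return a ^ b


def nor2(a, b):
    return int(not (a or b))


half = Fraction(1, 2)
print("XOR:", p_transition(xor2, [half, half]))
print("NOR:", p_transition(nor2, [half, half]))
```

For equiprobable inputs the NOR gate has the lower transition probability (3/16 against 1/4), which is the comparison the next slide asks about.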

Which One Is Your Choice?
- XOR or NOR: which one is better for low-power design?

Glitching Activity in CMOS Network
- Figure labels: (x, c) = (0, 0) and (x, c) = (1, 0)
- $\alpha_{0\to1}$ can be greater than 1 due to glitching!

Glitching in a Carry Ripple Adder

Chain vs Tree Datapath (1/2)
Chain: A, B -> O1; O1, C -> O2; O2, D -> F
Tree: A, B -> O1; C, D -> O2; O1, O2 -> F
                         O1      O2      F
P_1 (Chain)              1/4     1/8     1/16
P_0 = 1 - P_1 (Chain)    3/4     7/8     15/16
P_0->1 (Chain)           3/16    7/64    15/256
P_1 (Tree)               1/4     1/4     1/16
P_0 = 1 - P_1 (Tree)     3/4     3/4     15/16
P_0->1 (Tree)            3/16    3/16    15/256

Chain vs Tree Datapath (2/2)
                                          O1    O2      F
P_0->1 (Chain) / P_0->1 (Tree)            1     0.58    1       (ideal, without delay)
alpha_0->1 (Chain) / alpha_0->1 (Tree)    1     0.83    1.47    (practical, with delay)
Which one is better for low-power design?
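The ideal (zero-delay) rows of the chain/tree comparison can be reproduced with a few lines of arithmetic. This sketch assumes 2-input AND gates and independent inputs with P(1) = 1/2, which is consistent with the probabilities in the table; the gate type is an assumption, since the figure itself is not available here. The "practical, with delay" row depends on glitching and cannot be derived this way.

```python
from fractions import Fraction


def and2(p_x, p_y):
    """P(output=1) of a 2-input AND gate with independent inputs."""
    return p_x * p_y


def p01(p1):
    """P(0->1) = P(0) * P(1)."""
    return (1 - p1) * p1


a = b = c = d = Fraction(1, 2)

# Chain: ((A.B).C).D
chain_o1 = and2(a, b)
chain_o2 = and2(chain_o1, c)
chain_f = and2(chain_o2, d)

# Tree: (A.B).(C.D)
tree_o1 = and2(a, b)
tree_o2 = and2(c, d)
tree_f = and2(tree_o1, tree_o2)

chain = [chain_o1, chain_o2, chain_f]
tree = [tree_o1, tree_o2, tree_f]
ratios = [p01(x) / p01(y) for x, y in zip(chain, tree)]

for node, x, y, r in zip(("O1", "O2", "F"), chain, tree, ratios):
    print(f"{node}: P01(chain)={p01(x)}  P01(tree)={p01(y)}  ratio={float(r):.2f}")
```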

Glitching at the Datapath Level
[Figure: irregular vs. regular datapath structures; annotation: "Two glitches!"]

How to Minimize Glitching?
- Equalize the lengths of the timing paths through design.

Data Representation (1/2)
[Figure: plots over bit position, with the sign bit marked]

Data Representation (2/2): Binary vs. Gray Encoding
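To make the binary-versus-Gray comparison concrete, the sketch below counts bit flips when a 4-bit value steps through a full counting sequence in each encoding. The counting workload and the helper names are assumptions for illustration; for random data the two encodings behave differently.

```python
def to_gray(n):
    """Standard reflected binary Gray code of n."""
    return n ^ (n >> 1)


def count_transitions(sequence):
    """Total number of bit flips between consecutive words, including the wrap-around."""
    total = 0
    for current, nxt in zip(sequence, sequence[1:] + sequence[:1]):
        total += bin(current ^ nxt).count("1")
    return total


bits = 4
binary_seq = list(range(2 ** bits))
gray_seq = [to_gray(n) for n in binary_seq]

binary_flips = count_transitions(binary_seq)
gray_flips = count_transitions(gray_seq)
print("binary:", binary_flips, "bit flips per full count")
print("gray  :", gray_flips, "bit flips per full count")
```

Gray encoding flips exactly one bit per step, so a sequentially addressed bus toggles roughly half as often as with plain binary.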

Signal Reordering Operation
- Ex. 1: $Y = AB + AC = A(B + C)$
- Ex. 2: $Y = 3X = X + (X \ll 1)$
[Figure: dataflow graphs of the original and transformed forms of both examples]
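Both examples trade a costly operation for a cheaper one without changing the result. The following sketch simply checks the two identities on integers; the function names are illustrative, and the operation counts in the comments are the only "cost model" assumed.

```python
def y_direct(a, b, c):
    return a * b + a * c      # two multiplications, one addition


def y_factored(a, b, c):
    return a * (b + c)        # one multiplication, one addition


def triple_by_multiply(x):
    return 3 * x              # constant multiplication


def triple_by_shift_add(x):
    return x + (x << 1)       # one shift and one addition, no multiplier


values = range(-8, 9)
same_ex1 = all(y_direct(a, b, c) == y_factored(a, b, c)
               for a in values for b in values for c in values)
same_ex2 = all(triple_by_multiply(x) == triple_by_shift_add(x) for x in values)
print(same_ex1, same_ex2)
```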

Resource Sharing Can Increase Activity (1/2)
- Separate bus structure: number of bus transitions per cycle $= 2(1 + 1/2 + 1/4 + \cdots) = 4$, where 2 is the number of separate buses, 1 is the transition probability of the LSB, 1/2 is the transition probability of the second LSB, and so on.
- Bus sharing
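The figure of 4 transitions per cycle is the sum of a geometric series. Written out, under the slide's assumption that bit $k$ (counting from the LSB, $k = 0$) toggles with probability $2^{-k}$ on each of the two buses:

```latex
\[
N_{\text{separate}}
  = 2 \sum_{k=0}^{\infty} \left(\tfrac{1}{2}\right)^{k}
  = 2 \cdot \frac{1}{1 - \tfrac{1}{2}}
  = 4 .
\]
% For a finite n-bit bus the sum is truncated:
\[
N_{\text{separate}}(n)
  = 2 \sum_{k=0}^{n-1} \left(\tfrac{1}{2}\right)^{k}
  = 2 \left( 2 - 2^{\,1-n} \right),
\]
% which approaches 4 as n grows.
```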

Resource Sharing Can Increase Activity (2/2)
[Figure: plot over bit position]

Lowering $V_{dd}$ Increases Delay

Reducing $V_{dd}$

Architecture Trade-offs: Reference Datapath

Parallel Datapath

Pipelined Datapath

Summary: A Low-Power Data Path
Architecture type                  Voltage   Area   Power
Reference datapath (no pip/par)    5 V       1      1
Pipelined datapath                 2.9 V     1.3    0.37
Parallel datapath                  2.9 V     3.4    0.34
Pipeline-parallel datapath         2.0 V     3.7    0.18
- Aim to operate at the lowest possible speed (using low supply voltages).
- Use architecture optimization to compensate for the slower operation.
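Most of the savings in the table come from the quadratic dependence on supply voltage. The sketch below separates that $V_{dd}^2$ factor from whatever remains. The "residual" is simply the table's power divided by the voltage factor; it lumps together the capacitance and frequency changes, which the slide does not list separately, so treat it as a back-of-the-envelope reading rather than the author's derivation.

```python
REFERENCE_VDD = 5.0

# (supply voltage in volts, normalized power) copied from the summary table.
table = {
    "pipelined": (2.9, 0.37),
    "parallel": (2.9, 0.34),
    "pipeline-parallel": (2.0, 0.18),
}


def voltage_factor(vdd, vdd_ref=REFERENCE_VDD):
    """Power scaling due to supply voltage alone: (Vdd / Vdd_ref)^2."""
    return (vdd / vdd_ref) ** 2


def residual_factor(vdd, power):
    """Part of the normalized power not explained by the Vdd^2 term."""
    return power / voltage_factor(vdd)


for name, (vdd, power) in table.items():
    print(f"{name:>18}: Vdd^2 factor = {voltage_factor(vdd):.3f}, "
          f"residual = {residual_factor(vdd, power):.2f}")
```

The residuals stay close to 1, which suggests the extra hardware costs little power compared with what the lower supply voltage saves.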

Computational Complexity of DCT Algorithms

Low-Power Cache and Register Configuration
- Application profiling
  - Trade-off between performance, power, and size
- Rule of thumb
  - Access and store the most frequently used instructions
  - Avoid accessing larger caches/registers
  - Partition caches and registers
- Be aware of partitioning: partition, partition!
- Memory hierarchy (figure): CPU, Reg, Reg, L1 Cache, L2 Cache, Memory

Outline
- Introduction
- Low-Power Process-Level Design (not covered here)
- Low-Power Logic/Circuit-Level Design
- Low-Power Algorithm/Architecture-Level Design
- Low-Power System-Level Design
  - Low Power System Perspective
  - Low Power Applications
- Conclusion
- References

Power Down Techniques

Software versus Hardware
Software
- Advantages: free (but not always), high flexibility, ease of compatibility
- Disadvantages: high power consumption, slow execution, inefficient, larger staff
Hardware
- Advantages: high speed, low power, high efficiency, less staff
- Disadvantages: high die cost, low flexibility, low compatibility

Energy-Efficient Software Coding
- The potential for power reduction via software modification is relatively unexploited.
- Code size and algorithmic efficiency can significantly affect energy dissipation.
- Pipelining at the software level: VLIW coding style
References:
- V. Tiwari et al., "Power analysis of embedded software: a first step towards software power minimization," IEEE Trans. on VLSI, vol. 2, no. 4, Dec. 1994.
- J. Snyder et al., "Low-power software for low-power people," 1994 IEEE Symp. on Low Power Electronics.

Power-Hungry Clock Network
- H-tree design: deficiencies based on the Elmore delay model.
- PLL: every designer (digital or analog) should have knowledge of PLLs.
  - Multiple frequencies in chips/systems by PLL
  - Low main frequency, but with concerns: jitter and noise, gain and bandwidth, pull-in and lock time, stability
- Asynchronous => use gated clocks, sleep mode

Power Analysis in the Design Flow

Applications I: Wireless Computing/Communication

Applications II: A Portable Multimedia Terminal

Applications III: System on Chip (SOC)
- Entire system function
- Logic + memory
- More than two types of devices
- Allows more freedom in architecture
- Hardware and software partitioning

Conclusions
- Designing for the trade-off between low power and high speed is an essential requirement for many applications.
- Low power affects cost, size, weight, performance, and reliability.
- Reduce $P_{0\to1}$, $C_L$, $V_{dd}$, and $f$ for low-power design at every level.

References
[1] A. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proceedings of the IEEE, vol. 83, no. 4, pp. 498-523, Apr. 1995.
[2] A. Chandrakasan, "Architectures for Ultra Low-Power Design," in tutorial B3 of ASP-DAC, 1995.
[3] A. Chandrakasan, "Low-Voltage/Low-Power Digital Design," in tutorial of Workshop on Low-Power Low-Voltage and RF IC for Wireless Communication System, 1996, Taiwan.
[4] T. Sakurai, "Low Power Circuit Design Methodology," in tutorial B2 of ASP-DAC, 1995.
[5] Chapter 17 of Textbook.

Self-Test Exercises
- STE1: Derive the switching-activity expression of a 2-input AND gate and simulate the histogram of the transition probability ($P_{0\to1}$) vs. $P_A$ and $P_B$.
- STE2: Derive the switching-activity expression of a 3-input NAND gate.