ECE260B CSE241A Winter 2005: Design Styles, Multi-Vdd/Vth Designs. Website: vlsicad.ucsd.edu/courses/ece260bw05


ECE 260B / CSE 241A, Winter 2005
Design Styles; Multi-Vdd/Vth Designs
Website: vlsicad.ucsd.edu/courses/ece260bw05

The Design Problem
- A growing gap between design complexity and design productivity
- Source: sematech97

Design Methodology
- Design process traverses iteratively between three abstractions: behavior, structure, and geometry
- More and more automation for each of these steps

Behavioral Description of Accumulator

    entity accumulator is
      port (
        DI  : in integer;
        DO  : inout integer := 0;
        CLK : in bit
      );
    end accumulator;

    architecture behavior of accumulator is
    begin
      process (CLK)
        variable X : integer := 0;  -- intermediate variable
      begin
        if CLK = '1' then
          X := DO + DI;  -- variable assignment uses :=
          DO <= X;       -- signal assignment uses <=
        end if;
      end process;
    end behavior;

- Design described as a set of input-output relations, regardless of chosen implementation
- Data described at a higher abstraction level ("integer")

Structural Description of Accumulator

    entity accumulator is
      port (  -- definition of input and output terminals
        DI  : in bit_vector(15 downto 0);  -- a vector 16 bits wide
        DO  : inout bit_vector(15 downto 0);
        CLK : in bit
      );
    end accumulator;

    architecture structure of accumulator is
      component reg  -- definition of register ports
        port (
          DI  : in bit_vector(15 downto 0);
          DO  : out bit_vector(15 downto 0);
          CLK : in bit
        );
      end component;
      component add  -- definition of adder ports
        port (
          IN0  : in bit_vector(15 downto 0);
          IN1  : in bit_vector(15 downto 0);
          OUT0 : out bit_vector(15 downto 0)
        );
      end component;
      -- definition of accumulator structure
      signal X : bit_vector(15 downto 0);
    begin
      add1 : add port map (DI, DO, X);  -- defines port connectivity
      reg1 : reg port map (X, DO, CLK);
    end structure;

- Design defined as composition of register and full-adder cells ("netlist")
- Data represented as {0,1,Z}
- Time discretized and progresses with unit steps
- Description language: VHDL
- Other options: schematics, Verilog

Implementation Methodologies
Digital circuit implementation approaches:
- Custom
- Semi-custom
  - Cell-based
    - Standard cells, compiled cells
    - Macro cells
  - Array-based
    - Pre-diffused (gate arrays)
    - Pre-wired (FPGA)

Full Custom
- Hand drawn geometry
- All layers customized
- Digital and analog
- Simulation at transistor level
- High density
- High performance
- Long design time
[Figure: Magic Layout Editor (UC Berkeley)]

Symbolic Layout
- Dimensionless layout entities
- Only topology is important
- Final layout generated by compaction program
[Figure: stick diagram of inverter, with VDD, GND, In and Out]

Standard Cells
- Organized in rows
- Cells made as full custom by vendor (not user)
- All layers customized
- Digital with possible special analog cells
- Simulation at gate level (digital)
- Medium-high density
- Medium-high performance
- Reasonable design time
- Routing channel requirements are reduced by presence of more interconnect layers
[Figure labels: logic cell, feedthrough cell, rows of cells, routing channel, functional module (RAM, multiplier, ...)]

Standard Cell Example
[Figure: standard cell layout example, from Brodersen92]

Standard Cell - Example
- 3-input NAND cell (from Mississippi State Library), characterized for fanout of 4 and for three different technologies

Automatic Cell Generation
[Figure: random-logic layout generated by CLEO cell compiler (Digital)]

Module Generators: Compiled Datapath
[Figure labels: buffer, adder, reg1, reg0, mux, bus0, bus1, bus2, routing area, feed-through, bit-slice]
- Advantages: one-dimensional placement/routing problem

Macrocell-Based Design
- Predefined macro blocks (µP, RAM, etc.)
- Macro blocks made as full custom by vendor (IP blocks)
- All layers customized
- Digital and some analog
- Simulation at behavior or gate level
- High density
- High performance
- Short design time
- Use standard on-chip busses
- System on a chip (SOC)
[Figure labels: macrocell, interconnect bus, routing channel]

Macrocell Design Methodology
- Floorplan: defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks
[Figure: video-encoder chip, from Brodersen92; labels: SRAM, routing channel, data paths, standard cells]

Gate Array
- Predefined transistors connected via metal
- Two types: channel based, sea of gates
- Only metal layers customized
- Fixed array sizes
- Digital cells in library
- Simulation at gate level (digital)
- Medium density
- Medium performance
- Reasonable design time
[Figure labels: rows of uncommitted cells, routing channel]

Gate Array Primitive Cells
[Figure: uncommitted cell and committed cell (4-input NOR); labels: polysilicon, metal, possible contact, VDD, GND, In1, In2, In3, In4, Out]

Sea-of-gate Primitive Cells
[Figures: primitive cell using oxide-isolation; primitive cell using gate-isolation; labels: PMOS, NMOS, oxide-isolation]

Sea-of-gates
[Figure: LSI Logic LEA300K (0.6 µm CMOS); labels: random logic, memory subsystem]

Prewired Arrays
- Programmable logic blocks
- Programmable connections between logic blocks
- No layers customized (standard devices)
- Digital only
- Low-medium performance
- Low-medium density
- Programmable: SRAM, EPROM, Flash, Anti-fuse, etc.
- Easy and quick design changes
- Cheap design tools
- Low development cost
- High device cost
- NOT a real ASIC
[Figure courtesy Altera Corp.]

Programmable Logic Devices
[Figures: PLA, PROM, PAL]

EPLD Block Diagram
[Figure courtesy Altera Corp.; labels: primary inputs, macrocell]

Field-Programmable Gate Arrays - Fuse-based
- Standard-cell like floorplan
[Figure labels: I/O buffers, program/test/diagnostics, vertical routes, rows of logic modules, routing channels]

Interconnect
- Programming interconnect using anti-fuses
[Figure labels: programmed interconnection, input/output pin, cell, antifuse, horizontal tracks, vertical tracks]

Field-Programmable Gate Arrays - RAM-based
[Figure labels: CLB, switching matrix, horizontal routing channel, vertical routing channel, interconnect point]

RAM-based FPGA - Basic Cell (CLB)
[Figure courtesy of Xilinx: CLB block diagram. Combinational logic: two function blocks (outputs F and G), each computing any function of up to 4 variables. Storage elements: two flip-flops (outputs Q1 and Q2) with clock and clock enable (CE)]
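The phrase "any function of up to 4 variables" can be illustrated with a lookup table. This is a minimal sketch, assuming the function blocks behave as 16-entry memories addressed by the four inputs; the helper names `make_lut4` and `lut4_read` and the two example functions are hypothetical and not taken from the slides.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table that the LUT memory bits would hold."""
    return [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
            for i in range(16)]


def lut4_read(table, a, b, c, d):
    """Evaluate the function by using the four inputs as a memory address."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


# Two hypothetical example functions, both stored in the same 16-bit structure.
parity_table = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
majority_table = make_lut4(lambda a, b, c, d: int(a + b + c + d >= 3))

print(lut4_read(parity_table, 1, 0, 1, 1))
print(lut4_read(majority_table, 1, 1, 0, 1))
```

Reprogramming the block only changes the 16 stored bits, which is why the same hardware can serve any 4-variable function.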

RAM-based FPGA
[Figure: Xilinx XC4025]

High Performance Devices
- Mixture of full custom, standard cells and macros
- Full custom for special blocks: adder (data path), etc.
- Macros for standard blocks: RAM, ROM, etc.
- Standard cells for non-critical digital blocks

Global Signaling and Layout
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Global Signaling
- Current global signaling paradigm: insert large static CMOS repeaters to reduce wire RC delay
- Impending problems:
  - Too many repeaters
    - 180nm processors: 22K repeaters (Itanium), 70K (Power4)
    - Projected: 1-1.5M repeaters at 45-65nm technologies
  - Too much power
    - Many large repeaters = significant static and dynamic power
  - Too much noise
    - Repeater clustering complicates power distribution
    - Inductive coupling across wide bus structures
Source: D. Sylvester, DAC-2001

Cell Layout Optimization
- Advanced layout techniques must allow:
  - Continuous individual device sizing
  - Variable p/n ratios
  - Tapered FET stacking sizes
  - Arbitrary Vth assignments within gates
- First cut: Cadabra
  - 15-22% power reduction using the first two approaches under a fixed footprint constraint
  - Optimize specific instances of standard gates
  - Ref: Hurat, Cadabra
[Figure labels: GDSII import, compact fixed width]
Source: D. Sylvester, DAC-2001

Multi-Vdd
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Multi-Vdd Status
- Idea: incorporate two Vdd's to reduce dynamic power
- Limited to a few recent Japanese multimedia processors
  - Example: 0.3 µm, 75MHz, 3.3V media processor (Toshiba)
  - Total power savings of 47% in logic, 69% in clock
- Dynamic voltage scaling of mobile processors
  - Transmeta Crusoe, Intel Speedstep, etc.
  - Not considered in this talk
- Very powerful technique currently applied only in low-performance designs
- Mentality: today's high performance parts aren't limited by power
Source: D. Sylvester, DAC-2001

Lower Power Via Rich Replacement
- Media processors and other low speed designs have many non-critical paths
  - 60-70% of paths have delay under half the clock period
- After replacement, most paths become near critical
- What about high-speed microprocessors?
[Plot: % of total paths vs. path delay (normalized to clock period)]
Source: D. Sylvester, DAC-2001

Similar Story For High-Performance
- IBM 480 MHz PowerPC shows over 50% of paths have delay less than half the clock period
- Implies that high-performance designs can benefit from multi-Vdd
- Ref: Akrout, JSSC98
Source: D. Sylvester, DAC-2001

Resizing Is Not The Right Answer
- Post-synthesis optimizations resize gates to recover power on non-critical paths
- Looks similar to pre- and post-replacement figures in media processor
- This is the wrong approach for nanometer design!
[Figures: before post-synthesis resizing; after post-synthesis resizing. Ref: Sirichotiyakul, DAC99]
Source: D. Sylvester, DAC-2001

Multi-Vdd Instead of Sizing
- Power ~ C * Vdd^2 * f, where f is fixed
- Key: reducing gate width impacts power sub-linearly
  - Interconnect capacitance is not affected
- Reducing supply voltage cuts power quadratically
  - All capacitive loads have lower voltage swing
- How can we minimize delay penalty at low Vdd?
Source: D. Sylvester, DAC-2001
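The sizing-versus-supply argument can be checked numerically. This is a minimal sketch, assuming a hypothetical load split (40% gate capacitance, 60% interconnect) and an assumed low supply of 0.7 x Vdd; none of these numbers come from the slides, and only the relation P ~ C * Vdd^2 * f does.

```python
def dynamic_power(c_gate, c_wire, vdd, freq):
    """Dynamic power P ~ C * Vdd^2 * f, with C split into gate and wire parts."""
    return (c_gate + c_wire) * vdd ** 2 * freq


# Hypothetical baseline in arbitrary units: 40% gate cap, 60% wire cap.
C_GATE, C_WIRE, VDD, FREQ = 0.4, 0.6, 1.2, 1.0
base = dynamic_power(C_GATE, C_WIRE, VDD, FREQ)

# Option 1: halve the gate widths. Only the gate capacitance shrinks.
resized = dynamic_power(0.5 * C_GATE, C_WIRE, VDD, FREQ)

# Option 2: keep the sizes, move to the assumed lower supply.
low_vdd = dynamic_power(C_GATE, C_WIRE, 0.7 * VDD, FREQ)

saving_resize = 1 - resized / base
saving_low_vdd = 1 - low_vdd / base
print(f"saving from halving gate widths: {saving_resize:.0%}")
print(f"saving from 0.7 x Vdd:           {saving_low_vdd:.0%}")
```

Under these assumptions, halving every gate width removes only the gate share of the switched capacitance, while the lower supply scales the whole load by 0.7^2.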

Challenges For Multi-Vdd
- Area overhead
  - Toshiba reported 7% rise in area due to placement restrictions, level converters, additional power grid routing
- EDA tool support for the above issues (placement, dual power routing)
- Noise analysis
  - Additional shielding required between Vdd,low and Vdd,high signals?
  - Including clock network
Source: D. Sylvester, DAC-2001

Static Power
- Global signaling and layout optimization
- Multi-Vdd
- Static power
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Static Power
- Why do we care about static power in non-portable devices?
  - Standby power is wasted -- leaves fewer Watts for computation
  - Worsens reliability by raising die temperatures
- Leakage current is a function of Vth and subthreshold swing (Ss) (x10 at operating vs. room temp!)
  - Ioff ~ 10 * 10^(-Vth/Ss) µA/µm
  - Ss expected to remain at 80-85 mV/dec (room temp)
  - Device technology may cut this by ~20%
- Vth reductions are mandated by scaling Vdd
  - Vth has been around Vdd/5
Source: D. Sylvester, DAC-2001
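The sensitivity of leakage to Vth can be tabulated directly. This is a minimal sketch, assuming the slide's leakage expression reads Ioff = 10 µA/µm * 10^(-Vth/Ss); the function name and the three supply values are illustrative choices, while Ss = 85 mV/dec and Vth ~ Vdd/5 are the figures quoted on the slide.

```python
def ioff_ua_per_um(vth_v, ss_mv_per_dec):
    """Off current in uA/um, assuming Ioff = 10 uA/um * 10^(-Vth/Ss)."""
    return 10.0 * 10 ** (-(vth_v * 1000.0) / ss_mv_per_dec)


SS = 85.0  # mV/decade, room-temperature value quoted on the slide

for vdd in (1.2, 0.9, 0.6):  # illustrative supply voltages in volts
    vth = vdd / 5  # rule of thumb from the slide: Vth ~ Vdd/5
    ioff_na = ioff_ua_per_um(vth, SS) * 1000.0
    print(f"Vdd={vdd:.1f} V  Vth={vth:.2f} V  Ioff={ioff_na:.0f} nA/um")
```

In this model every Ss millivolts removed from Vth multiplies the off current by ten, which is why Vdd scaling with Vth ~ Vdd/5 drives leakage up so quickly.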

Current Status
- No sub-1V technologies demonstrate good on/off current performance (yet; expect improvements in production)
- Oxide scaling is running out of steam; overall ~3x Ioff per node

Reference   ITRS node (nm)   Tox (Å, electrical unless noted)   Vdd (V)   Ion (µA/µm)   Ioff (nA/µm)
Intel,00    50-70            18                                 0.85      514           100
Samsung,00  100              21                                 1.2       860           10
NEC,00      70               25                                 1.2       697           10
TI,99       100              27                                 1.2       800           10
Intel,99    70               32                                 1.2       650           3
NEC,00      100              13 (physical)                      1.0       723           16
ITRS 2000   100              12-15 (physical)                   1.2       750           13
ITRS 2000   70               8-12 (physical)                    0.9       750           40
ITRS 2000   50               6-8 (physical)                     0.6       750           80
ITRS 2001   45               11 (uses high-k)                   0.6       1250          3000

Slide annotation: "Working numbers"
Source: D. Sylvester, DAC-2001

Leakage Suppression Approaches
- Dual-Vth (most common)
  - Low-Vth on critical paths, high-Vth off critical paths
  - Only cost is additional masks
- MTCMOS
  - Series inserted high-Vth device cuts leakage current when off (sleep mode)
  - Delay and area penalties, control device sizing is critical
- Other techniques
  - Substrate biasing to control Vth
  - Dual-Vth domino
    - Use low-Vth devices only in evaluate paths
[Figure labels: Vdd, pull up, Vout, pull down, parasitic node, Vcontrol, high-Vth device]
Source: D. Sylvester, DAC-2001

Can Gate-length Biasing Help Leakage Reduction?
- Reduce leakage?
[Plot: variation of leakage and delay (each normalized to 1) vs. gate-length (130 nm to 140 nm) for an NMOS device in an industrial 130nm technology]
- Reduce leakage variability?
[Plot: leakage vs. gate-length, showing leakage variability with and without biasing]

Gate-length Biasing
- First proposed by Sirisantana et al.
  - Comparative study of effect of doping, tox and gate-length
  - Large bias used, significant slowdown
- Small bias
  - Little reduction in leakage beyond 10% bias while delay degrades linearly
  - Preserves pin compatibility
  - Technique applicable as post-RET step
- Salient features
  - Design cycle not interfered with
  - Zero cost (no additional masks)

Granularity
- Technology-level: all devices in all cells have one biased gate-length
- Cell-level: all devices in a cell have one biased gate-length
- Device-level: all devices have independent biased gate-length
  - Simplification: in each cell, NMOS devices have one gate-length and PMOS devices have another

Device-Level Leakage Reduction
- Leakage saving with a delay penalty of up to 10% (simplified device-level biasing)
[Bar chart: leakage saving (axis 0 to 40) for cells INVX4, NANDX4, BUFX4, ANDX6; series: low Vt, nominal Vt, high Vt]

Circuit Level
- Bias gate-length for non-critical cells
- Library extended with each cell having a biased version
- Benefits analyzed in conjunction with multi-VT assignment and in isolation
- Cases: SVT-SGL, DVT-SGL, SVT-DGL, DVT-DGL

Results: Leakage Reduction
[Bar chart: normalized leakage (0 to 1) for SVT-SGL, SVT-DGL, DVT-SGL, DVT-DGL on testcases c5315, c6288, c7552, alu128]
- With less than 2.5% delay penalty
- Design Compiler used for VT assignment and gate-length biasing
- Better results expected with Duet (academic sizer from Michigan)

Results: Leakage Variability
[Plot: leakage distribution for the testcase alu128; traces shown: unbiased circuit, technology-level biasing, uniform biasing]
[Bar chart: percentage reduction in leakage spread (0% to 60%) for c5315, c6288, c7552, alu128]

Futures
- Construction of effective biasing-based leakage optimization heuristics
- Gate-length selection at true device-level granularity
- Evaluation of gate-length biasing at future technology nodes

Multi-Vth + Vdd + Sizing
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Multi-Everything
- Need an approach that selects between speed, static power, and dynamic power
- Should be scalable to nanometer design
  - Rules out dual-Vth domino or other dynamic logic families (low supplies kill performance advantages)
- Techniques mentioned so far
  - Flexible, optimized cell layouts
  - Multi-Vdd
  - Dual-Vth
- Put them all together
Source: D. Sylvester, DAC-2001

Multi-Vdd Can Leverage Vth's
- Existing designs using multi-Vdd do not alter Vth in low-Vdd cells
  - Highly sub-optimal, delay is fully penalized
  - Limits cell replacement, which limits power savings
- Much better solution: reduce Vth in low-Vdd cells to carefully balance delay, static power, and dynamic power
- Enforce technology scaling within a chip: whenever we reduce Vdd, we also reduce Vth to maintain speed
Source: D. Sylvester, DAC-2001

Multi-Vdd + Vth Negates Delay Penalty
- Delay ~ C * Vdd / Ion
- Scenarios
  - Constant Vth (current paradigm)
  - Scale Vth to maintain constant static power
  - Scale Vth to reduce static power linearly with Vdd
- Delay penalty is substantially offset
  - Ion is very sensitive to Vth at Vdd < 1V
- Pstatic reduces with Vdd due to linear term and smaller Ioff (Ion and DIBL)
Source: D. Sylvester, DAC-2001
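The delay relation can be explored with a simple drive-current model. This is a minimal sketch, assuming Ion is proportional to (Vdd - Vth)^alpha with alpha = 1.3 (an alpha-power-law style assumption, not stated on the slides) and hypothetical operating points; only Delay ~ C * Vdd / Ion is taken from the slide.

```python
ALPHA = 1.3  # assumed velocity-saturation exponent, not from the slides


def relative_delay(vdd, vth, c_load=1.0):
    """Delay ~ C * Vdd / Ion, with the assumed model Ion ~ (Vdd - Vth)^ALPHA."""
    ion = (vdd - vth) ** ALPHA
    return c_load * vdd / ion


# Hypothetical operating points in volts.
VDD_HIGH, VDD_LOW = 1.2, 0.8
VTH_NOM, VTH_LOW = 0.30, 0.20  # VTH_LOW scales Vth by the same ratio as Vdd

d_ref = relative_delay(VDD_HIGH, VTH_NOM)
penalty_const_vth = relative_delay(VDD_LOW, VTH_NOM) / d_ref
penalty_scaled_vth = relative_delay(VDD_LOW, VTH_LOW) / d_ref

print(f"delay at Vdd,low, constant Vth: {penalty_const_vth:.2f}x")
print(f"delay at Vdd,low, scaled Vth:   {penalty_scaled_vth:.2f}x")
```

Under this model the low-Vdd cell is still slower than the reference, but lowering Vth along with Vdd recovers most of the gate overdrive and so most of the delay.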

Now Add Sizing
- Multi-Vdd + multi-Vth + sizing/cell layout optimization attacks power from many angles (multi-dimensional)
- Depending on criticality and switching activities, non-critical gates can be:
  - Assigned Vdd,low
  - Assigned Vdd,low + lower Vth
  - Assigned Vth,high
  - Downsized (at the individual transistor level if advantageous)
  - Assigned Vdd,low and upsized
    - For gates that cannot tolerate Vdd,low delay, this can be power efficient
  - And others
Source: D. Sylvester, DAC-2001

Summary
- Power density must saturate to maintain affordable packaging options
  - 50 W/cm^2 means 200-250W for future large MPUs
  - Dynamic thermal management saves 25% on packaging power budget
- Multi-Vdd will leverage multiple Vth's to offset delay penalty at low Vdd
  - More widespread re-assignment to Vdd,low
  - Use Vdd first instead of re-sizing to take advantage of large path slacks
  - Anticipated power savings of 50-80%
- Static power also addressed through multi-Vth + Vdd + sizing
  - Vth difficult to control in ultra-short channels
  - Intra-cell Vth assignment + MTCMOS/variants + sleep modes
Source: D. Sylvester, DAC-2001

Next Week: Project Meetings