Microprocessor Design in the Nanoscale Era


Stefan Rusu, Senior Principal Engineer, Intel Corporation, IEEE Fellow
stefan.rusu@intel.com
July 2012. © 2012 Intel Corporation

Agenda
- Microprocessor Design Trends
- Process Technology Directions
- Active Power Management
- Leakage Reduction Techniques
- Packaging and Thermal Modeling
- Future Directions and Summary

Microprocessor Evolution
              4004 Processor   Westmere-EX Processor
Year          1971             2011
Transistors   2300             2.6 B
Process       10 µm            32 nm
Die area      12 mm²           513 mm²
(Die photos not at scale)

Scaling Trends
[Figure: feature size in microns (log scale, 10 down to 0.01) versus year (1970 to 2020), shrinking 0.7x every 2 years, with the 65nm, 45nm, 32nm and 22nm nodes marked]
Transistor dimensions scale to improve performance, reduce power and reduce cost per transistor.
Source: M. Bohr
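The 0.7x-per-generation rule can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the 65 nm starting point are assumptions chosen for the example. It shows that three 0.7x steps from 65 nm land close to the 22 nm node, and that a 0.7x linear shrink roughly halves area.

```python
def scaled_feature_nm(start_nm: float, generations: int, factor: float = 0.7) -> float:
    """Feature size after a number of generations of linear scaling."""
    return start_nm * factor ** generations

# Linear dimensions shrink 0.7x, so area shrinks by about 0.7 * 0.7 = 0.49x.
linear_factor = 0.7
area_factor = linear_factor ** 2

# Hypothetical walk from 65 nm through three generations (two years each).
for gen in range(4):
    print(f"generation {gen}: {scaled_feature_nm(65, gen):.1f} nm")
print(f"area factor per generation: {area_factor:.2f}")
```

The computed values (about 45.5, 31.9 and 22.3 nm) sit close to the named 45nm, 32nm and 22nm nodes, which are marketing labels rather than exact dimensions.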

Client Processor Trend: Integrated Graphics
- Ivy Bridge: 22nm client processor with monolithic integrated graphics
- Up to 4 dual-threaded cores and 8MB L3 cache
- Dual channel DDR3 memory controller at 1600MT/s
- Integrated PCIe interface (16 Gen3 + 4 Gen2 + 4 DMI lanes); first client CPU to support PCIe Gen3
- Three independent displays
- 1.4B transistors in 160mm² die
Source: S. Damaraju, ISSCC 2012

Client Processor Trend: Integrated WiFi RF
[Figure: transceiver block diagram from the antenna (50Ω TLine, filter, balun) through the package to the chip, with TX and RX switches, matching network, power amplifier (PA) and receive Gm stage]
- Sensitive RF circuits integrated with 32nm ATOM and PCH
- Integration of traditional III-V RF components: 21dBm power amp, 34dBm T/R switch, 3.5dB NF LNA
Source: H. Lakdawala, ISSCC 2012

Server Processor Trends: More Cores
[Figure: number of cores (0 to 12) for Xeon EX processors Tulsa, Tigerton, Dunnington, Nehalem-EX and Westmere-EX across the 65nm, 45nm and 32nm generations]
Server core count increases every generation, while keeping within a flat power budget.

Server Processor Trends: More Cache
On-die L3 cache of Xeon EX processors:
Process   180nm   130nm   90nm   65nm   45nm   32nm
L3 [MB]   1       4       8      16     24     30
Cache size increases with every process generation.

Server Processors Power Trends
[Figure: power in W (log scale, 0.001 to 1000) versus year (1990 to 2010), showing total power, active power and leakage]

Voltage Scaling Has Slowed Down
[Figure: supply voltage in V (log scale, 0.1 to 10) versus year ('91 to '13), with ~0.7X scaling in the earlier period and ~0.95X scaling in the later period]

Agenda: Process Technology Directions

30 Years of MOSFET Scaling
                       Dennard 1974   Intel 2005
Gate Length            1.0 µm         35 nm
Gate Oxide Thickness   35 nm          1.2 nm
Operating Voltage      4.0 V          1.2 V
Classical scaling ended in the early 2000s due to gate oxide leakage limits.
Source: M. Bohr, ISSCC 2009

90 nm Strained Silicon Transistors
- NMOS: SiN cap layer (high stress film), tensile channel strain
- PMOS: SiGe source-drain, compressive channel strain
Strained silicon provided increased drive currents, making up for lack of gate oxide scaling.
Source: M. Bohr, ISSCC 2009

45 nm High-k + Metal Gate Transistors
- 65 nm transistor: SiO2 dielectric, polysilicon gate electrode
- 45 nm HK+MG transistor: hafnium-based dielectric, metal gate electrode
High-k + metal gate transistors break through the gate oxide scaling barrier.
Source: M. Bohr, ISSCC 2009

HK/MG Gate Leakage Reduction
[Figure: gate leakage comparison annotated with 1000x and 25x reductions]
HK+MG significantly reduces gate leakage.
Source: K. Mistry, IEDM 2007

6T SRAM Bit Cell Leakage Reduction
[Figure: normalized cell leakage at 1.0V, 25C, split into I_GATE, I_OFF and I_JUNCT, for 65nm versus 45nm]
SRAM bit cell leakage reduced ~10x.
Source: M. Bohr, ISSCC 2009

Traditional Planar Transistor
[Figure labels: gate, high-k dielectric, source, drain, oxide, silicon substrate]
Traditional 2-D planar transistors form a conducting channel in the silicon region under the gate electrode when in the on state.
Source: M. Bohr, 2011

22 nm Tri-Gate Transistor
[Figure labels: gate, drain, source, oxide, silicon substrate]
3-D Tri-Gate transistors form conducting channels on three sides of a vertical fin structure, providing fully depleted operation.

Transistor Scaling Trends
[Figure: 32 nm planar transistors compared with 22 nm Tri-Gate transistors, with gates and fins labeled]
Source: M. Bohr, 2011

Transistor Gate Delay
[Figure: normalized transistor gate delay versus operating voltage (0.5 V to 1.1 V) for 45nm planar and 32nm planar transistors]
32nm planar transistors are 22% faster than 45nm planar.
Source: D. Perlmutter, ISSCC 2012

Transistor Gate Delay
[Figure: normalized transistor gate delay versus operating voltage (0.5 V to 1.1 V) for 45nm planar, 32nm planar and projected 22nm planar transistors]
22nm planar transistors would have been only 14% faster.

Transistor Gate Delay
[Figure: normalized transistor gate delay versus operating voltage (0.5 V to 1.1 V) for 32nm planar and 22nm Tri-Gate transistors, annotated 37% faster and 18% faster]
22nm Tri-Gate transistors provide improved performance at high voltage and an unprecedented 37% speedup at low voltage.

Intel Transistor Leadership
Year   Node    Milestone
2003   90 nm   Invented SiGe strained silicon
2005   65 nm   2nd gen. SiGe strained silicon
2007   45 nm   Invented gate-last high-k metal gate
2009   32 nm   2nd gen. gate-last high-k metal gate
2011   22 nm   First to implement Tri-Gate

Lithography Challenges
[Figure: feature size and lithography wavelength in nm (log scale) versus year of initial production ('89 to '15); wavelengths of 248nm, 193nm and 13nm (EUVL), with the gap between feature size and wavelength marked]
193 nm enhancements enable the 22 nm generation.

Extreme Ultraviolet Lithography
EUV lithography uses extremely short wavelength light
- Visible light: 400 to 700 nm
- DUV lithography: 193 and 248 nm
- EUV lithography: 13 nm
[Photos: EUV Micro Exposure Tool; World's First EUV Mask]

Layout Restrictions
- 65 nm layout style: bi-directional features, varied gate dimensions, varied pitches
- 32 nm layout style: uni-directional features, uniform gate dimension, gridded layout
Source: M. Bohr, ISSCC 2009

450mm in the Era of Complex Scaling
Must coordinate demand drivers, technical requirements and resources:
- End-user demand drivers
- Integrated IC maker coordination
- Equipment & materials development
- University and government support
[Figure: projected 2000 wafer, circa 1975 (Gordon Moore, ISSCC 03)]

Process Variations
- Die-to-die variations
- Within-die variations: systematic, random
[Figure panels: resist thickness, lens aberrations, random placement of dopant atoms]

Voltage and Temperature Variations
Voltage
- Chip activity change
- Current delivery RLC
- Dynamic: ns to 10-100µs
- Within-die variation
Temperature
- Activity & ambient change
- Dynamic: 100-1000µs
- Within-die variation
[Figure: on-die temperature map (°C)]

Impact on Design Methodology
[Figures: number of paths versus path delay, and frequency / probability versus leakage power, each shown as deterministic and probabilistic against a delay target; annotations: 10X variation, ~50% total power]
Due to variations in Vdd, Vt and temperature.
Major paradigm shift from deterministic design to probabilistic / statistical design.

SRAM Cell Size Scaling
[Figure: cell area (µm², log scale) versus process technology (180 nm to 22 nm), shrinking 0.5x every 2 years]
Process   Cell area
45 nm     0.346 µm²
32 nm     0.171 µm²
22 nm     0.092 µm²
Memory density continues to double every 2 years.
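As a quick check of the "0.5x every 2 years" trend, the per-generation shrink can be computed from the three cell areas quoted on the slide. This is only a sketch; the variable names are chosen for illustration.

```python
# SRAM cell areas quoted on the slide, in square microns, keyed by node in nm.
cell_area_um2 = {45: 0.346, 32: 0.171, 22: 0.092}

# Walk from the oldest node to the newest and compute each area ratio.
nodes = sorted(cell_area_um2, reverse=True)
ratios = {
    f"{old}->{new}": cell_area_um2[new] / cell_area_um2[old]
    for old, new in zip(nodes, nodes[1:])
}

for step, ratio in ratios.items():
    print(f"{step} nm: cell area scales by {ratio:.3f}x")
```

Both steps come out near 0.5x (about 0.49x and 0.54x), in line with the doubling of memory density per generation.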

Interconnect Trends
[Figure: number of metal layers (0 to 10) versus technology generation (500 nm to 22 nm), with the transition from Al to Cu interconnect marked]

22nm Interconnects
[Figures: M1 to M8 cross-section; cross-section of integrated MIM capacitor]
- M1-M6 use ultra-low-k ILD and self-aligned vias, providing 13-18% capacitance reduction
- Integrated MIM capacitor enables capacitance density of >20 fF/µm²
Source: C. Auth, VLSI Symposium 2012

On-chip Interconnect Trend
[Figure: relative delay (log scale, 0.1 to 100) versus feature size (250 nm to 32 nm) for gate delay (FO4), local interconnect (M1,2), global interconnect with repeaters and global interconnect without repeaters]
- Local interconnects scale with gate delay
- Global interconnects do not keep up with scaling
Source: ITRS, 2001

Agenda: Active Power Management

Voltage and Frequency Scaling
[Figure: frequency versus V_DD, from V_T through V_LOW to V_HI (annotated +/-10%), showing max, target and required frequency, bounded by the data retention limit, performance limit and reliability limit, with a sub-threshold logic region and paths 1, 2 and 3 marked]
1 - Fixed V_DD, frequency scaling: linear power reduction
2 - Fixed frequency, V_DD scaling: square power reduction
3 - Voltage and frequency scaling: cubic power reduction
Source: J. Rosal, ISSCC 2006
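The linear, square and cubic reductions follow from the standard CMOS dynamic power relation P = a * C * V^2 * f. The sketch below is a minimal illustration under stated assumptions: the function name, the normalized units and the 0.8 scale factor are hypothetical, and case 3 assumes frequency is scaled by the same factor as voltage.

```python
def dynamic_power(v: float, f: float, c_eff: float = 1.0, activity: float = 1.0) -> float:
    """Dynamic switching power: activity * C_eff * V^2 * f (normalized units)."""
    return activity * c_eff * v ** 2 * f

s = 0.8                          # hypothetical scale factor applied to V and/or f
p_base = dynamic_power(1.0, 1.0)

p_freq_only = dynamic_power(1.0, s) / p_base   # case 1: linear in s
p_volt_only = dynamic_power(s, 1.0) / p_base   # case 2: quadratic in s
p_both = dynamic_power(s, s) / p_base          # case 3: cubic in s

print(f"1 - frequency only: {p_freq_only:.3f} (s   = {s:.3f})")
print(f"2 - voltage only:   {p_volt_only:.3f} (s^2 = {s**2:.3f})")
print(f"3 - both:           {p_both:.3f} (s^3 = {s**3:.3f})")
```

This ignores leakage, which does not follow the same relation.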

Memory and RF Vmin Reduction
[Figure: bit-cell write path with word line WL (0 --> 1), bit lines BL and BL#, write data, and the shared array supply node CVCC]
- Write assist circuit temporarily drops the array supply node (CVCC, shared across several cells) to make it easier to write into the bit-cell
- Both cache and register files (RF) use this technique to improve write Vmin in the 22nm Ivy Bridge processor
- 22nm transistor and circuit improvements enable Vmin reduction of >100mV for cache and 60mV for RF
Source: S. Damaraju, ISSCC 2012

NTV Pentium Processor
[Die photo: 1.8 mm x 1.1 mm, with IA-32 core logic, L1$-I, L1$-D, ROM, scan, level shifters + clk spine]
[Figure: energy efficiency versus voltage, from subthreshold through NTV to the normal operating range, with ~5x demonstrated at NTV]
Operating point    Voltage   Frequency   Power    Efficiency
Ultra-low power    280 mV    3 MHz       2 mW     1500 Mips/W
Energy efficient   0.45 V    60 MHz      10 mW    5830 Mips/W
High performance   1.2 V     915 MHz     737 mW   1240 Mips/W
Source: S. Jain, ISSCC 2012
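The "~5x" figure can be reproduced from the three operating points on the slide. The sketch below is illustrative (the tuple layout and names are assumptions); it computes the efficiency gain of the 0.45 V point over the 1.2 V point and the throughput implied by each efficiency and power pair.

```python
# (label, voltage in V, frequency in MHz, power in mW, efficiency in Mips/W)
operating_points = [
    ("ultra-low power", 0.280, 3, 2, 1500),
    ("energy efficient", 0.45, 60, 10, 5830),
    ("high performance", 1.2, 915, 737, 1240),
]

# Implied throughput in Mips = efficiency (Mips/W) * power (W).
implied_mips = [eff * (p_mw / 1000.0) for _, _, _, p_mw, eff in operating_points]

# Efficiency gain of the near-threshold point over the full-voltage point.
ntv_gain = operating_points[1][4] / operating_points[2][4]

for (label, v, f_mhz, _, _), mips in zip(operating_points, implied_mips):
    print(f"{label}: {v} V, {f_mhz} MHz, about {mips:.1f} Mips")
print(f"NTV efficiency gain over 1.2 V: {ntv_gain:.1f}x")
```

The gain comes out at about 4.7x, and the implied throughput at each point is close to one instruction per clock cycle.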

Clock Gating
[Figure: a register whose input is recirculated through a mux selected by En, compared with a register whose clock is gated by En]
- Save power by gating the clock when data activity is low
- Requires detailed logic validation

Core Power Management
[Table: core states C0 HFM, C0 LFM, C1/C2, C4 and C6 compared by core voltage, core clock, PLL, L1 caches, L2 caches, wakeup time and power; entries include core clock OFF, PLL OFF, caches flushed, partial flush or off, power active, and wakeup times of <1us, <30us and <100us]
Modulating the processor core voltage and frequency enables lower power states.
Source: Gerosa, A-SSCC 2008

Multiple Voltage Domains
[Figure: die floorplan showing core supply, uncore supply, fuse, QPI and SMI regions]
- I/O domain: 1.1V fixed
- Un-core domain: 0.9-1.1V fixed
- Core domain: 0.85-1.1V variable
Multiple voltage domains minimize power consumption across the core and uncore areas.
Source: Rusu, ISSCC 2009

Multiple Clock Domains
[Figure: die floorplan showing BCLK, filter PLL, IO PLLs, IO DLLs, un-core PLL and core PLLs, with QPI and SMI interfaces]
- Three primary clock domains: core, un-core, I/O
- Total of 16 PLLs and 8 DLLs
Source: Rusu, ISSCC 2009

Agenda: Leakage Reduction Techniques

Subthreshold Leakage Trend
[Figure: I_off in A/um (log scale, 1.E-14 to 1.E-04) versus physical gate length (10 to 1000 nm); series: Intel 15nm, 20nm and 30nm transistors, research data in literature, production data in literature]

Leakage Reduction Techniques
- Body bias (Vbp positive, Vbn negative)
- Stack effect (equal loading)
- Sleep transistor (gating a logic block)
[Figure annotations: 2-3X reduction, 2-1000X reduction]

Leakage is a Strong Function of Voltage
[Figure: normalized subthreshold leakage and gate leakage versus voltage (0 to 1.5 V) for a 130nm process]
Sub-threshold and gate leakage reduce with lower supply voltage.

Cache Sleep and Shut-off Modes
[Figure: sub-array virtual VSS voltage in active, sleep and shut-off modes, controlled by block select, sleep bias and shut-off signals; annotated with 1.1V, 250mV, 520mV and 0V levels and two "2x lower leakage" steps]
Source: S. Rusu, US Pat. 7,657,767

Leakage Shut-off Infrared Images
[Infrared images: 16MB part with 16MB in sleep mode; 8MB part with 8MB sleep and 8MB shut-off; 4MB part with 4MB sleep and 12MB shut-off]
Leakage reduction: 3W (8MB), 5W (4MB)

Cache Dynamic Shut-off
[Figure: data and tag arrays with ways 15 down to 0, each with a controller]
- Normal operation: in the full-load state, all 16 ways are enabled (green)
- Cache-by-demand operation: under idle or low-load states, cache ways are dynamically flushed out and put in shut-off mode (red)

Cache Leakage Management
Three PMOS sleep transistor groups for sub-array leakage reduction.
Source: Y. Wang, ISSCC 2009

Cache Leakage Reduction Benefit
Leakage management circuit reduces sub-array leakage by 58%.

Leakage Mitigation: Long-Le Transistors
- All transistors can be either nominal or long-Le (nominal + 10%)
- Most library cells are available in both flavors
- Long-Le transistors are ~10% slower, but have 3x lower leakage
- All paths with timing slack use long-Le transistors
- Initial design uses only long channel devices
Source: S. Rusu, ISSCC 2006
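One way to picture this flow is a pass that starts every path on long-Le devices and falls back to nominal devices only where the ~10% slowdown would break timing. The sketch below is a hypothetical, path-level simplification and not the actual design flow: the `Path` type, path names, delays, leakage values and cycle time are all made up for illustration, and only the 10% and 3x figures come from the slide.

```python
from dataclasses import dataclass

LONG_LE_SLOWDOWN = 1.10        # long-Le is ~10% slower (from the slide)
LONG_LE_LEAKAGE = 1.0 / 3.0    # long-Le has 3x lower leakage (from the slide)

@dataclass
class Path:
    name: str
    nominal_delay_ps: float    # delay when built from nominal-Le cells
    nominal_leakage_uw: float  # leakage when built from nominal-Le cells

def assign_flavors(paths, cycle_ps):
    """Start with long-Le everywhere; use nominal only where timing fails."""
    plan = {}
    for p in paths:
        meets_timing = p.nominal_delay_ps * LONG_LE_SLOWDOWN <= cycle_ps
        plan[p.name] = "long-Le" if meets_timing else "nominal"
    return plan

def total_leakage(paths, plan):
    """Sum leakage, applying the 3x reduction to long-Le paths."""
    return sum(
        p.nominal_leakage_uw * (LONG_LE_LEAKAGE if plan[p.name] == "long-Le" else 1.0)
        for p in paths
    )

# Hypothetical paths against a hypothetical 1000 ps cycle.
paths = [Path("decode", 800.0, 30.0), Path("alu_bypass", 950.0, 30.0)]
plan = assign_flavors(paths, cycle_ps=1000.0)
leakage = total_leakage(paths, plan)
print(plan, f"total leakage: {leakage:.1f} uW")
```

In this example the path with slack stays on long-Le devices while the critical path reverts to nominal devices, cutting total leakage from 60 to 40 units.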

Long-Le Transistors Usage Map
[Figure: Nehalem-EX die map of long-channel device usage in percent, binned 50-60, 60-70, 70-80, 80-90 and 90-100]
Massive long-channel usage in uncore reduces leakage.

Power & Leakage Breakdown
Nehalem-EX 45nm example
- Power breakdown: Vcore 54.6%, Vuncore 33.4%, Vio 11.2%, Vpll 0.8%
- Leakage breakdown: active 84%, leakage 16%
Reduction techniques
- Active: clock gating, run uncore at 0.9V
- Leakage: long channel device usage (58% cores, 85% uncore)
Source: S. Rusu, ISSCC 2009

Core and Cache Recovery Example
[Figure: die floorplan with QPI0 to QPI3, system interface and SMI; Core0, Core1, Core2, Core5, Core6 and Core7 active and four blocks marked disabled]
Defective core and cache slices can be disabled in horizontal pairs.
Source: S. Rusu, ISSCC 2009

Minimize Leakage in Disabled Blocks
- Disabled cores are power gated: virtual VCC goes from 0.85V (active) to 0V (shut-off), a 40x leakage reduction
- Disabled cache slices have all major arrays in shut-off: virtual VCC goes from 0.9V (active) through sleep to 0V (shut-off), with leakage reduction of 35% in sleep and 83% in shut-off
Source: S. Rusu, ISSCC 2009

Core/Cache Recovery Infrared Image
All cores and cache slices are enabled.
Source: S. Rusu, ISSCC 2009

Core/Cache Recovery Infrared Image
- Shut-off 2 cores (top row) and 2 cache slices (bottom row)
- Disabled blocks are clock and power gated
Source: S. Rusu, ISSCC 2009

Agenda: Packaging and Thermal Modeling

Microprocessor Package Evolution
1971: 4004 Processor
- 16-pin ceramic package
- Wire bond attach
- 750 kHz I/O
2012: Xeon E5 Processor
- 2011-contact organic package
- Flip-chip attach
- 8.0 GHz I/O

Power Density Models
[Figures: power map with heat flux (W/cm2) from 0 to 250; on-die temperature (C) from 40 to 110]
With increasing power density and large on-die caches, detailed, non-uniform power models are required.

Thermal Modeling
[Figures: simulated power density; infrared emission microscope measurement]
Source: D. Genossar and N. Shamir, "Intel Pentium M Processor Power Estimation, Budgeting, Optimization and Validation," Intel Technology Journal, 5/2003

Thermal Sensors
Multiple temperature sensors
- One in each core hot spot
- One in the die center
Temperature information is available through PECI bus for system fan management

Power Management Unit
[Figure: eight cores (Core 0 to Core 7), each with a power gate and sensors, connected to a central power management unit that controls the power gates and the external voltage regulator]
PMU controls processor voltage and frequency based on compute loading and thermal data.

Agenda: Future Directions and Summary

Future Directions
- 2D mesh network with multiple voltage / frequency islands
- Communication across islands achieved through FIFOs
Source: Ogras (CMU), DAC 2007

Fine Grain Power Management
25-core processor example:
- Cores with critical tasks: Freq = f, at Vdd; TPT = 1, Power = 1
- Non-critical cores: Freq = f/2, at 0.7*Vdd; TPT = 0.5, Power = 0.25
- Temporarily shut down: TPT = 0, Power = 0
- Permanently disabled: TPT = 0, Power = 0
[Figure: 5x5 core grid with each core marked f, f/2 or 0; voltage levels VDD (Hi-Act, Pwr = 1), 0.7*VDD (Lo-Act, Pwr = 1/4) and 0V (Shut-off, Pwr = 0)]
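The per-core power numbers follow from power scaling as V^2 * f: a core at 0.7*Vdd and f/2 draws 0.7^2 * 0.5 = 0.245, which the slide rounds to 0.25. The sketch below totals power and throughput for a 25-core chip. The mix of cores per state is a hypothetical assumption, since the actual grid assignment is not recoverable from the transcript, and throughput is assumed proportional to frequency.

```python
def relative_power(v_rel: float, f_rel: float) -> float:
    """Dynamic power relative to a core at full Vdd and frequency f (V^2 * f)."""
    return v_rel ** 2 * f_rel

# (relative voltage, relative frequency) for each core state on the slide.
STATES = {
    "critical": (1.0, 1.0),       # Freq = f at Vdd
    "non_critical": (0.7, 0.5),   # Freq = f/2 at 0.7*Vdd
    "off": (0.0, 0.0),            # temporarily shut down or permanently disabled
}

# Hypothetical mix of 25 cores; the real assignment is not given in the text.
core_mix = {"critical": 8, "non_critical": 9, "off": 8}

total_power = sum(n * relative_power(*STATES[s]) for s, n in core_mix.items())
total_throughput = sum(n * STATES[s][1] for s, n in core_mix.items())

print(f"non-critical core power: {relative_power(0.7, 0.5):.3f}")
print(f"chip power: {total_power:.3f}, chip throughput: {total_throughput:.1f}")
```

With this mix the chip delivers half of the 25-core peak throughput for about 41% of the peak power, which is the motivation for per-core voltage and frequency control.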

Summary
Moore's Law has fueled the worldwide technology revolution for over 40 years and will continue for at least another decade
- 0.7x transistor dimension scaling every two years
- Tri-gate devices provide significant benefits
Continued microprocessor performance improvement depends on our ability to manage active power and leakage
- Clock and power gate un-used or disabled blocks
- Multiple voltage and clock domains
- Dynamic voltage and frequency adjustment
Core and cache recovery enables multiple product options
- Disabled cores and cache slices are clock and power gated