Computer Architecture

1 Computer Architecture Lecture 01 Arkaprava Basu

2 Acknowledgements
Several of the slides in this deck are from Luis Ceze (Washington), Nima Honarmand (Stony Brook), Mark Hill, David Wood, Karu Sankaralingam (Wisconsin), Abhishek Bhattacharjee (Rutgers), and ARM Inc.
Development of this course is partially supported by Western Digital Corporation.
8/5/2018

5 What is computer architecture?
Layers of a computing system, from top to bottom:
- Problem
- Algorithm
- Program/Languages
- Compiler
- Runtime Systems (Runtime, OS, VM)
- Architecture (ISA)
- Microarchitecture
- Transistors
- Electrons
Architecture: exposed to the software; defines the H/W-S/W boundary.
Microarchitecture: not seen by the S/W; H/W techniques to improve performance and power efficiency.

7 Why study computer architecture?
(Figure: the system stack, from Problem down to Electrons.)
- Wide impact
  o Even a 1% improvement impacts the lives of millions
- Sets trends in other fields
  o Example: deep learning

9 Why study computer architecture?
If deep learning is a rocket (science), then
- High-performance computing (HPC) is its engine
- A large volume of data is its fuel
-- Andrew Ng
Made possible by computer architecture!

11 Why study computer architecture?
(Figure: the system stack, from Problem down to Electrons.)
- Wide impact
  o Even a 1% improvement impacts the lives of millions
- Sets trends in other fields
  o Example: deep learning
- Critical for understanding aspects of computing
  o Example: Meltdown, Spectre

13 Why study computer architecture?
- Microarchitectural state became visible to an adversary during a short time window
- Caused by speculative execution, a fundamental technique in computer architecture

14 Why study computer architecture?
- Hardware is the final level of trust to ensure security and privacy
  o OSs are often compromised (e.g., rootkits)
  o In cloud computing, tenants don't want to trust the hypervisor
- Traditionally, security has relied on the layer below, so security has to be built from the hardware up

15 Why study computer architecture?
(Figure labels: DAX, GPU, CPU, TPU, DGX1, Crypto)
- New type of computing: domain-specific accelerators
  o How to design the software stack for accelerators?
  o How to make accelerators both programmable and efficient?

16 Why study computer architecture?
(Figure labels: HBM, DRAM)
- New type of memory technology
  o Redesign the memory hierarchy (both H/W and S/W) for new types of memory

18 Why study computer architecture?
(Figure: the system stack, from Problem down to Electrons.)
- New generation of computing: H/W and S/W co-design is key
- Need to rethink how we design computers
- Opportunity for research contribution

19 In this course, you will...
- Learn the basics of processor design
- Get hands-on experience with how architecture impacts software programs
  o One homework using hardware performance counters
- Learn about the latest trends in computing
  o Current technology trends and constraints
  o How does a modern CPU work?
  o How does a modern S/W stack interact with the CPU?
  o Emerging topics: GPUs, accelerators, security
- Get a taste of research
  o Course project

20 Guest lectures
- Dr. Kanishka Lahiri (AMD)
- Dr. Vivek Seshadri (Microsoft Research)
- Prof. R. Govindarajan (IISc)
- Prof. Vinod Ganapathy (IISc)
- Possibly someone from Intel Labs (not yet confirmed)

22 Grading
Learning is more important!!
Item | Due date | Points | Grading
Research paper reviews (any 5) | TBD | 15 (3*5) | Qualitative
One homework | Mid/late semester | 15 | Qualitative/Quantitative
Two exams | End Sept and early December | 40 (2*20) | Qualitative/Quantitative
Project | Proposal mid/end Sept; final report/presentation in early Dec. | 45 | Qualitative
Adds up to 115!! Opportunity to make up for a bad exam/homework/review.

23 Format of paper reviews
- Maximum 250 words and one page
- Three sections
  o Summary of the paper (write in your own words)
  o Something new that you learned / things you liked
  o Critique: what could be improved or is lacking

24 Policy on academic integrity
- You may:
  o Discuss assignments, design, and techniques.
- You may not:
  o Share code outside your group.
  o Use any code not distributed as part of project handouts.
- What happens if you break academic integrity policies:
  o Unpleasant conversations with me.
  o You get hoisted to the department chair, the deans, etc.
  o An F for the course.
- Really, think about the cost-benefit analysis. It's simply not worth it.

25 Let's get started

26 Today's agenda
- Fundamental technology trends that have shaped computer architecture
- Basic performance and power/energy metrics
- A few fundamental laws/equations

29 Drivers of growth in computing
- In the last four decades, computing capability has increased ~10^6 times
- Two primary sources of growth in computing:
- Computer architecture improvement (~10^3 times) [Focus]
  o Example: branch prediction, superscalar, speculation, out-of-order execution, caches, SIMD
- Improvement in transistor technology (~10^3 times) [Next few slides]
  o Smaller transistors -> faster, less power

30 Transistor technology
(Figure labels: source, gate, drain, channel, insulator, substrate)
- Basic technology element: MOSFET
  o Solid-state component that acts like an electrical switch
  o MOS: metal-oxide-semiconductor (conductor, insulator, semiconductor)
  o FET: field-effect transistor
  o Channel conducts source -> drain only when voltage is applied to the gate
- Channel length: characteristic parameter (shorter -> faster)
  o Aka "feature size" or "technology node"
  o Currently: 14 nanometers (nm); 7 nm soon

31 Moore's law (1965)
- Transistors per square inch: twice as many after ~1.5-2 years
- Some technology-based ramifications: annual improvements in density, speed, power, costs
  o SRAM/logic: density ~30%, speed ~20%
  o DRAM: density ~60%, speed ~4%
  o Disk: density ~60%, speed ~10% (non-transistor)
  o Big improvements in flash memory and network bandwidth too
- Related trends
  o Processor performance: twice as fast after ~18 months
  o Memory capacity: doubles in <2 years

32 Impact of Moore's law
- More transistors -> more cool things to do with the extra transistors
- Smaller transistors -> faster switching -> higher frequency
- More transistors -> cheaper transistors (amortizes design cost)
- Virtuous cycle of improvement

37 Dennard scaling
Doubling the transistors; scale their power down
- Transistor: 2D voltage-controlled switch
- Dimensions, voltage, doping concentrations: x0.7
- Area: x0.5 (0.7 x 0.7 ≈ 0.5)
- Capacitance: x0.7
- Frequency: x1.4
- $Power = Capacitance \times Frequency \times Voltage^2$, so power per transistor: 0.7 x 1.4 x 0.7^2 ≈ x0.5
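
The scaling factors above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the 0.7 factor applies to dimensions and voltage as on the slide, and assuming frequency scales as 1/0.7 (about 1.4, matching the slide). The function name `dennard_generation` and the derived `power_density` entry are illustrative, not from the lecture.

```python
# Dennard scaling for one technology generation (illustrative sketch).
S = 0.7  # scaling factor for dimensions and voltage, as on the slide


def dennard_generation(s: float = S) -> dict:
    """Return the relative change of each quantity after one scaling step."""
    area = s * s            # a 2D device shrinks in both dimensions
    capacitance = s         # capacitance scales with dimensions
    frequency = 1.0 / s     # assumption: 1/0.7, roughly the slide's 1.4
    voltage = s
    # Power = Capacitance x Frequency x Voltage^2
    power = capacitance * frequency * voltage ** 2
    return {
        "area": area,
        "capacitance": capacitance,
        "frequency": frequency,
        "voltage": voltage,
        "power_per_transistor": power,
        # Power per unit area: unchanged if power and area shrink together.
        "power_density": power / area,
    }


factors = dennard_generation()
for name, value in factors.items():
    print(f"{name}: x{value:.2f}")
```

Under these assumptions each transistor uses about half the power and half the area, so twice as many transistors fit in the same chip area at roughly the same total power.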

39 Historical impact of Moore's law and Dennard scaling

42 Benefits of transistor scaling are in peril
- In the last four decades, computing capability has increased ~10^6 times
- Two primary sources of growth in computing:
- Computer architecture improvement (~10^3 times): need to create new architectures!
  o Example: branch prediction, superscalar, speculation, out-of-order execution, caches, SIMD
- Improvement in transistor technology (~10^3 times)
  o Smaller transistors -> faster, less power

43 How good is your processor?

45 Performance metric
- Latency (execution/response time): time to finish one task
- Throughput (bandwidth): number of tasks finished per unit of time
  o Throughput can exploit parallelism, latency can't
  o Sometimes complementary, often contradictory
- Example: move people from A to B, 10 km
  o Car: capacity = 5, speed = 60 km/hour
  o Bus: capacity = 60, speed = 20 km/hour
  o Latency: car = 10 min, bus = 30 min
  o Throughput: car = 15 PPH (people per hour, w/ return trip), bus = 60 PPH
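
The car/bus numbers can be reproduced with a short calculation. This is a sketch under the slide's assumptions (10 km one way, and the vehicle must make the return trip before carrying the next load). The helper names `latency_minutes` and `throughput_pph` are illustrative.

```python
DISTANCE_KM = 10  # one-way distance from A to B


def latency_minutes(speed_kmh, distance_km=DISTANCE_KM):
    """One-way travel time in minutes."""
    return distance_km / speed_kmh * 60


def throughput_pph(capacity, speed_kmh, distance_km=DISTANCE_KM):
    """People delivered per hour, counting the return trip."""
    round_trip_hours = 2 * distance_km / speed_kmh
    return capacity / round_trip_hours


vehicles = {
    "car": {"capacity": 5, "speed_kmh": 60},
    "bus": {"capacity": 60, "speed_kmh": 20},
}

for name, v in vehicles.items():
    lat = latency_minutes(v["speed_kmh"])
    tput = throughput_pph(v["capacity"], v["speed_kmh"])
    print(f"{name}: latency = {lat:.0f} min, throughput = {tput:.0f} PPH")
```

The car wins on latency and the bus wins on throughput, which is the contradiction the slide points out.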

46 Latency vs. throughput
- Processor A is X times faster than processor B (for a program P) if
  o Latency(P, A) = Latency(P, B) / X
  o Throughput(P, A) = Throughput(P, B) * X
- Processor A is X% faster than processor B if
  o Latency(P, A) = Latency(P, B) / (1 + X/100)
  o Throughput(P, A) = Throughput(P, B) * (1 + X/100)
- Car/bus example
  o Latency? Car is 3 times (200%) faster than bus
  o Throughput? Bus is 4 times (300%) faster than car
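
The "X times" and "X%" definitions convert into each other mechanically. The sketch below applies them to the car/bus numbers (10 and 30 minutes, 15 and 60 people per hour); the function names are illustrative.

```python
def times_faster_by_latency(latency_a, latency_b):
    """X such that Latency(A) = Latency(B) / X."""
    return latency_b / latency_a


def times_faster_by_throughput(throughput_a, throughput_b):
    """X such that Throughput(A) = Throughput(B) * X."""
    return throughput_a / throughput_b


def percent_faster(x_times):
    """Convert 'X times faster' into 'X% faster': factor = 1 + X/100."""
    return (x_times - 1) * 100


# Latency: car (10 min) vs. bus (30 min)
car_vs_bus = times_faster_by_latency(10, 30)
# Throughput: bus (60 PPH) vs. car (15 PPH)
bus_vs_car = times_faster_by_throughput(60, 15)

print(f"Car: {car_vs_bus:.0f}x ({percent_faster(car_vs_bus):.0f}%) faster by latency")
print(f"Bus: {bus_vs_car:.0f}x ({percent_faster(bus_vs_car):.0f}%) faster by throughput")
```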

47 Evaluating a processor: latency/throughput of which programs?
- Very difficult question!
- Best case: you always run the same set of programs
  o Just measure the execution time of those programs
  o Too idealistic: software evolves
- Use benchmarks
  o Representative programs chosen to measure performance
  o (Hopefully) predict performance of the actual workload

48 Types of benchmark
- Real programs
  o Example: CAD, text processing, business apps, scientific apps
  o Need to know program inputs and options (not just code)
  o May not know what programs users will run
  o Require a lot of effort to port
- Kernels
  o Small key pieces (inner loops) of scientific programs where the program spends most of its time
  o Example: Livermore loops, LINPACK
- Toy benchmarks
  o E.g., Quicksort, Puzzle
  o Easy to develop, predictable results; may be used to check correctness of a machine but not as a performance benchmark

49 Standardized benchmark suites: SPEC INT 2006
Program | Language | Description
400.perlbench | C | Programming Language
401.bzip2 | C | Compression
403.gcc | C | C Compiler
429.mcf | C | Combinatorial Optimization
445.gobmk | C | Artificial Intelligence: Go
456.hmmer | C | Search Gene Sequence
458.sjeng | C | Artificial Intelligence: Chess
462.libquantum | C | Physics / Quantum Computing
464.h264ref | C | Video Compression
471.omnetpp | C++ | Discrete Event Simulation
473.astar | C++ | Path-finding Algorithms
483.xalancbmk | C++ | XML Processing

56 Iron law of processor performance
$\frac{Time}{Program} = \frac{Instructions}{Program} \times \frac{Cycles}{Instruction} \times \frac{Time}{Cycle}$
- Instructions/Program: total work in the program
  o Function of: algorithms, compilers, ISA, program input
- Cycles/Instruction: CPI (cycles per instruction), or 1/IPC
  o Function of: program instructions, ISA, microarchitecture
- Time/Cycle: 1/f (f: clock frequency)
  o Function of: microarchitecture, fabrication technology

57 Components of the Iron law
- The three components of the Iron law are interdependent
- Processor architects mostly target CPI but must understand the others well
- Architects are the interface between software people (compiler, OS, etc.) and those who build the physical hardware

59 Understanding performance
- Which processor would you buy?
  o Processor A: CPI = 2, clock = 2.8 GHz
  o Processor B: CPI = 1, clock = 1.8 GHz
- Which one is likely to be faster (assuming same ISA/compiler, same set of programs)?
  o You would probably buy A, but B is faster: B completes 1.8 billion instructions per second, A only 1.4 billion
- Classic example
  o 800 MHz Pentium III faster than 1 GHz Pentium 4
  o Same ISA and compiler
- Danger of partial performance metrics!!

60 Example: Iron law of performance
- Program takes 33 billion instructions to run
- CPU processes instructions at 2 cycles per instruction on average
- Clock speed of 3 GHz
- Program running time?
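
Both the example above and the earlier processor A/B comparison follow directly from the Iron law. This is a minimal sketch; the function name `execution_time_s` and the instruction count used for the A/B comparison are assumptions made for illustration (any count gives the same ratio when ISA and compiler are the same).

```python
def execution_time_s(instructions, cpi, clock_hz):
    """Iron law: time = instructions x (cycles/instruction) x (seconds/cycle)."""
    return instructions * cpi / clock_hz


# Example: 33 billion instructions, CPI = 2, 3 GHz clock.
example_time = execution_time_s(33e9, 2, 3e9)
print(f"Running time: {example_time:.0f} s")

# Processor A vs. B on the same program (hypothetical 1 billion instructions).
n = 1e9
time_a = execution_time_s(n, cpi=2, clock_hz=2.8e9)
time_b = execution_time_s(n, cpi=1, clock_hz=1.8e9)
print(f"A: {time_a:.3f} s, B: {time_b:.3f} s, B is {time_a / time_b:.2f}x faster")
```

Clock frequency alone favors A, but the product CPI x (1/f) favors B, which is why a single metric can mislead.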

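61 Calculating cycles-per-instruction
- Different instructions take different amounts of work (cycles)
$\frac{Time}{Program} = \frac{Instructions}{Program} \times \frac{Cycles}{Instruction} \times \frac{Time}{Cycle}$
- The Cycles/Instruction term (CPI) is an average:
$\text{Average CPI} = \frac{\sum_{i=1}^{n} InstFrequency_i \times CPI_i}{\sum_{i=1}^{n} InstFrequency_i}$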

65 Calculating cycles-per-instruction
Instruction frequencies of a given program on a given machine:
Instruction type | Frequency | Avg. CPI
Load | 25% | 2
Store | 15% | 2
Branch | 20% | 2
ALU | 40% | 1
- Frequency depends upon program, compiler, ISA
- Avg. CPI depends upon processor (microarchitecture)
- What is the average CPI (cycles per instruction)?
$\text{Average CPI} = \frac{\sum_{i=1}^{n} InstFrequency_i \times CPI_i}{\sum_{i=1}^{n} InstFrequency_i} = \frac{0.25 \times 2 + 0.15 \times 2 + 0.20 \times 2 + 0.40 \times 1}{1} = 1.6$
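
The weighted average can be computed directly from the table. A small sketch, with the instruction mix taken from the slide and the function name `average_cpi` chosen for illustration:

```python
# Instruction mix: type -> (frequency, average CPI), from the table above.
instruction_mix = {
    "Load": (0.25, 2),
    "Store": (0.15, 2),
    "Branch": (0.20, 2),
    "ALU": (0.40, 1),
}


def average_cpi(mix):
    """Frequency-weighted CPI, normalized by the total frequency."""
    weighted_cycles = sum(freq * cpi for freq, cpi in mix.values())
    total_frequency = sum(freq for freq, _ in mix.values())
    return weighted_cycles / total_frequency


print(f"Average CPI = {average_cpi(instruction_mix):.1f}")
```

Dividing by the total frequency keeps the result correct even when the frequencies are given as raw instruction counts rather than fractions.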

68 Speedup: comparing performance
$\text{Speedup} = \frac{\text{Runtime of program A on processor X}}{\text{Runtime of program A on processor Y}}$
Program | Processor X | Processor Y
Program A | 5 sec | 4 sec
Program B | 3 sec | 6 sec
- What is the speedup of program A?
- What is the speedup of program B?
- What is the average speedup?
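
One way to work through the three questions is sketched below. It uses the speedup definition from the slide (runtime on X divided by runtime on Y) and, as an assumption about the intended answer, averages the two speedups with a geometric mean because they are unit-less ratios. Variable names are illustrative.

```python
import math

# Runtimes in seconds, from the table above.
runtimes_s = {
    "A": {"X": 5.0, "Y": 4.0},
    "B": {"X": 3.0, "Y": 6.0},
}


def speedup(runtime_x, runtime_y):
    """Speedup of processor Y relative to X for one program."""
    return runtime_x / runtime_y


speedups = {prog: speedup(t["X"], t["Y"]) for prog, t in runtimes_s.items()}
geo_mean = math.prod(speedups.values()) ** (1 / len(speedups))

# Swapping the reference processor inverts every ratio; the geometric
# mean inverts with it, so the conclusion does not depend on the baseline.
inverse = {prog: 1 / s for prog, s in speedups.items()}
geo_mean_inverse = math.prod(inverse.values()) ** (1 / len(inverse))

print(speedups)
print(f"Geometric mean speedup (Y over X): {geo_mean:.3f}")
print(f"Geometric mean speedup (X over Y): {geo_mean_inverse:.3f}")
```

Program A runs 1.25x faster on Y, program B runs at half the speed, and the geometric mean (about 0.79) says Y is slower overall for this pair of programs.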

69 Summarizing performance numbers
- Arithmetic mean: for times (quantities proportional to time), e.g., latency
  $\frac{1}{n} \sum_{i=1}^{n} Time_i$
- Harmonic mean: for rates (quantities inversely proportional to time), e.g., throughput
  $\frac{n}{\sum_{i=1}^{n} \frac{1}{Rate_i}}$
- Geometric mean: for ratios (unit-less quantities), e.g., speedups and normalized times; used by SPEC CPU
  $\sqrt[n]{\prod_{i=1}^{n} Ratio_i}$
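
Python's standard `statistics` module provides all three means, which makes it easy to see which one fits which kind of data. The sample values below are hypothetical and chosen only for illustration.

```python
import statistics

# Times (seconds): the arithmetic mean is the average time per task.
times_s = [2.0, 4.0, 6.0]
mean_time = statistics.mean(times_s)

# Rates (e.g., million instructions per second) over equal amounts of
# work: the harmonic mean gives the true overall rate.
rates = [100.0, 300.0]
mean_rate = statistics.harmonic_mean(rates)

# Ratios (speedups): the geometric mean is independent of the baseline.
ratios = [1.25, 0.5]
mean_ratio = statistics.geometric_mean(ratios)

print(f"Arithmetic mean of times: {mean_time:.2f} s")
print(f"Harmonic mean of rates:   {mean_rate:.2f}")
print(f"Geometric mean of ratios: {mean_ratio:.3f}")
```

For the rates, the arithmetic mean would give 200, but running equal work at 100 and at 300 units per second finishes at an overall rate of 150, which is what the harmonic mean reports.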

70 Two common principles of optimizing performance

71 Principles of optimizing performance
- Take advantage of parallelism
  o E.g., multiple processors, disks, memory banks, pipelining, multiple functional units
  o Speculate to create (even more) parallelism
- Principle of locality
  o Reuse of data and instructions

76 Amdahl's Law
$Speedup = \frac{\text{Execution Time without Enhancement}}{\text{Execution Time with Enhancement}} = \frac{ExecutionTime_{old}}{ExecutionTime_{new}}$
- What if the enhancement does not enhance everything?
$Speedup = \frac{\text{Execution Time without using Enhancement at all}}{\text{Execution Time using Enhancement when Possible}}$
$ExecutionTime_{new} = ExecutionTime_{old} \times \left( (1 - Fraction_{Enhanced}) + \frac{Fraction_{Enhanced}}{Speedup_{Enhanced}} \right)$
$\text{Overall Speedup} = \frac{1}{(1 - Fraction_{Enhanced}) + \frac{Fraction_{Enhanced}}{Speedup_{Enhanced}}}$
- Caution: fraction of what?

81 Amdahl's law example
$\text{Overall Speedup} = \frac{1}{(1 - Fraction_{Enhanced}) + \frac{Fraction_{Enhanced}}{Speedup_{Enhanced}}}$
- Option 1: $Speedup_{Enhanced} = 20$, $Fraction_{Enhanced} = 0.1$
  o Speedup = 1 / (0.9 + 0.1/20) ≈ 1.105
- vs. Option 2: $Speedup_{Enhanced} = 1.2$, $Fraction_{Enhanced} = 0.9$
  o Speedup = 1 / (0.1 + 0.9/1.2) ≈ 1.176
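
The comparison can be evaluated with a direct transcription of the overall-speedup formula. A minimal sketch; the function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is enhanced."""
    unenhanced_part = 1.0 - fraction_enhanced
    enhanced_part = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced_part + enhanced_part)


# Option 1: a large speedup applied to a small fraction of the time.
big_but_rare = amdahl_speedup(fraction_enhanced=0.1, speedup_enhanced=20)
# Option 2: a modest speedup applied to most of the time.
small_but_common = amdahl_speedup(fraction_enhanced=0.9, speedup_enhanced=1.2)

print(f"20x on 10% of the time:  {big_but_rare:.3f}")
print(f"1.2x on 90% of the time: {small_but_common:.3f}")
```

The modest improvement to the common case wins, and even an infinite speedup on 10% of the time could not exceed 1 / 0.9, about 1.11.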

82 Rules of thumb in optimizing performance
- Make the common case fast
  o Design for actual performance, not peak performance
  o Amdahl's law
- Locality of reference (90/10 rule)
  o (Often) programs spend 90% of their time in 10% of the code
  o Main principle behind caches (spatial/temporal locality)
- Smaller (simpler) is faster
  o Why?
  o Main principle behind memory hierarchies: give the illusion of a fast, large memory

83 Power/energy is important

85 Energy vs. power
- Energy: capacity to do work or amount of work done
  o Expressed in joules
  o Battery life, electric bill, environmental impact
  o Metric: instructions per joule
- Power: instantaneous rate of energy transfer
  o Expressed in watts: energy / time (watts = joules / second)
  o Power impacts power supply and cooling requirements (cost)
  o Power density (watts/mm^2): important related metric
  o Peak power vs. average power, e.g., a camera: power spikes when you actually take a picture
- In processors, all consumed energy is converted to heat
  o Power consumption = rate of heat generation

88 Why is energy important?
- Impacts battery life for mobile devices
- Impacts operating cost of servers/data centers
  o You have to buy electricity: it costs money to produce and deliver electricity
  o You have to remove generated heat: it costs money to buy and operate cooling systems

89 Why is power important?
- Because power delivery has a peak
- Power is also the heat generation rate
  o Must dissipate the heat
  o Need heat sinks and fans
- What if the fans are not fast enough?
  o Chip powers off (if it's smart enough)
  o Otherwise, it burns (or melts)
- Every processor advertises a Thermal Design Point (TDP)
  o Can't go beyond this; the processor may burn up

97 Components of power dissipation
- Dynamic + static power
- Dynamic power: related to the switching activity of transistors (from 0 -> 1 and 1 -> 0)
(Figure labels: gate, source, drain, applied voltage, threshold voltage, current)

102 Dynamic power dissipation
$Power \approx \frac{1}{2} C V^2 A f$
- C: capacitance. Function of wire length, transistor size
- V: supply voltage. Function of technology and operating frequency
- A: activity factor. Average fraction of all possible transitions (0 -> 1 and 1 -> 0) per cycle
- f: clock frequency. Function of desired performance
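
The formula makes it easy to see why voltage is the most effective knob. The sketch below evaluates it for a set of hypothetical parameter values (1 nF switched capacitance, 1.0 V, activity factor 0.1, 3 GHz); none of these numbers come from the lecture.

```python
def dynamic_power_w(capacitance_f, voltage_v, activity, frequency_hz):
    """Dynamic power: P = 1/2 * C * V^2 * A * f (watts)."""
    return 0.5 * capacitance_f * voltage_v ** 2 * activity * frequency_hz


# Hypothetical baseline design point.
baseline = dynamic_power_w(1e-9, 1.0, 0.1, 3e9)
# Same design at half the supply voltage.
half_voltage = dynamic_power_w(1e-9, 0.5, 0.1, 3e9)
# Same design at half the clock frequency.
half_frequency = dynamic_power_w(1e-9, 1.0, 0.1, 1.5e9)

print(f"Baseline:       {baseline:.3f} W")
print(f"Half voltage:   {half_voltage / baseline:.2f}x of baseline")
print(f"Half frequency: {half_frequency / baseline:.2f}x of baseline")
```

Halving the frequency halves dynamic power, while halving the voltage cuts it to a quarter because of the V^2 term.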

105 Static power dissipation
- Static power: current leaking from a transistor even if doing nothing (steady, constant energy cost)
  o Gate leakage
  o Channel leakage (sub-threshold conductance)
- First-order model: $\text{Static Power} \propto V_{dd}$, $\propto e^{-c_1 V_{th}}$, and $\propto e^{c_2 T}$
  o $c_1$, $c_2$: some positive constants
  o $V_{th}$: threshold voltage
  o $T$: temperature
- About 30-50% of processor power
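
The first-order model can be explored numerically to see how strongly threshold voltage and temperature affect leakage. This is only a sketch: the constants `c1` and `c2` and all operating points are hypothetical, and the function returns a relative (unit-less) value because the model only states proportionality.

```python
import math


def relative_static_power(v_dd, v_th, temperature, c1=10.0, c2=0.02):
    """Relative static power: proportional to V_dd * exp(-c1*V_th) * exp(c2*T).

    c1 and c2 are placeholder positive constants, not measured values.
    """
    return v_dd * math.exp(-c1 * v_th) * math.exp(c2 * temperature)


base = relative_static_power(v_dd=1.0, v_th=0.3, temperature=350)
higher_vth = relative_static_power(v_dd=1.0, v_th=0.4, temperature=350)
hotter = relative_static_power(v_dd=1.0, v_th=0.3, temperature=370)
lower_vdd = relative_static_power(v_dd=0.8, v_th=0.3, temperature=350)

print(f"Raising V_th by 0.1 V: {higher_vth / base:.2f}x")
print(f"Raising T by 20:       {hotter / base:.2f}x")
print(f"Lowering V_dd to 0.8:  {lower_vdd / base:.2f}x")
```

Whatever the actual constants are, the shape is the same: leakage falls exponentially with a higher threshold voltage, rises exponentially with temperature, and scales linearly with supply voltage.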

106 Thermal runaway
- Leakage is an exponential function of temperature
- Higher temperature leads to higher leakage
- Which burns more power
- Which leads to higher temperature, which leads to ...
- Can melt your chip!

107 Ways to reduce energy/power?
- Clock gating/power gating
  o Turn off parts of the processor
- Dynamic voltage and frequency scaling (DVFS)
  o Frequency is related to voltage (higher frequency needs higher voltage)
  o Dynamically reduce frequency/voltage when possible
- CPU sleep states
  o Idle power savings by turning off entire portions of the CPU when possible
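
The benefit of dynamic voltage and frequency scaling can be estimated with the dynamic power formula (power proportional to V^2 x f). The sketch below makes two simplifying assumptions that are not from the lecture: supply voltage scales linearly with frequency, and static power is ignored. The function name `dvfs_scaling` is illustrative.

```python
def dvfs_scaling(frequency_scale):
    """Relative dynamic power, run time, and energy for a fixed task.

    Assumes voltage scales linearly with frequency (an idealization)
    and considers dynamic power only (P proportional to V^2 * f).
    """
    voltage_scale = frequency_scale
    power_scale = voltage_scale ** 2 * frequency_scale
    time_scale = 1.0 / frequency_scale      # same cycle count, slower clock
    energy_scale = power_scale * time_scale  # energy = power x time
    return power_scale, time_scale, energy_scale


power, time, energy = dvfs_scaling(0.8)
print(f"At 80% frequency: power x{power:.3f}, time x{time:.2f}, energy x{energy:.2f}")
```

Under these assumptions, running 20% slower roughly halves dynamic power and saves about a third of the energy for the same work. In practice, static power keeps accumulating while the task runs longer, so the real saving is smaller.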

108 Types of processor
- Server class: Intel Xeon, AMD Epyc
- Desktop class
- Mobile class
- Embedded class
- IoT

CSE502: Computer Architecture Welcome to CSE 502

CSE502: Computer Architecture Welcome to CSE 502 Welcome to CSE 502 Introduction & Review Today s Lecture Course Overview Course Topics Grading Logistics Academic Integrity Policy Homework Quiz Key basic concepts for Computer Architecture Course Overview

More information

CS4617 Computer Architecture

CS4617 Computer Architecture 1/26 CS4617 Computer Architecture Lecture 2 Dr J Vaughan September 10, 2014 2/26 Amdahl s Law Speedup = Execution time for entire task without using enhancement Execution time for entire task using enhancement

More information

CS Computer Architecture Spring Lecture 04: Understanding Performance

CS Computer Architecture Spring Lecture 04: Understanding Performance CS 35101 Computer Architecture Spring 2008 Lecture 04: Understanding Performance Taken from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [Adapted from Computer Organization and Design, Patterson

More information

Performance Metrics, Amdahl s Law

Performance Metrics, Amdahl s Law ecture 26 Computer Science 61C Spring 2017 March 20th, 2017 Performance Metrics, Amdahl s Law 1 New-School Machine Structures (It s a bit more complicated!) Software Hardware Parallel Requests Assigned

More information

CS 6290 Evaluation & Metrics

CS 6290 Evaluation & Metrics CS 6290 Evaluation & Metrics Performance Two common measures Latency (how long to do X) Also called response time and execution time Throughput (how often can it do X) Example of car assembly line Takes

More information

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture Overview 1 Trends in Microprocessor Architecture R05 Robert Mullins Computer architecture Scaling performance and CMOS Where have performance gains come from? Modern superscalar processors The limits of

More information

Measuring and Evaluating Computer System Performance

Measuring and Evaluating Computer System Performance Measuring and Evaluating Computer System Performance Performance Marches On... But what is performance? The bottom line: Performance Car Time to Bay Area Speed Passengers Throughput (pmph) Ferrari 3.1

More information
