EPL 605: Advanced Architecture


1 EPL 605: Advanced Computer Architecture
Presentation of findings from the UniServer Horizon 2020 European project: X-Gene server chips, voltage-noise characterization, high-bandwidth voltage measurements, di/dt viruses, and V_MIN characterization.
UCY CS department - Zacharias Hadjilambrou

2 UniServer Project Overview
A 3.5-year major research project funded by the European Community's Horizon 2020 research program.
Project budget: 4.8 million.
Started in February 2016, to finish in July 2019.
Aims to develop a universal system and software architecture for servers targeting the cloud and edge computing markets.
The key principle behind the UniServer approach is to expose the intrinsic hardware variations by pushing the operating points (voltage, frequency, refresh rate) beyond the pessimistic nominal values.

3 UniServer Partners
Applied Micro provides the hardware: the state-of-the-art X-Gene2 and X-Gene3 64-bit ARM server CPUs.
QUB, UCY, UOA and ARM lead the hardware characterization effort that will expose the pessimism of the nominal operating points.
UTH and IBM lead the development of the fault-tolerant hypervisor and resource managers (e.g. OpenStack).
WSE, MER and SPA provide the applications on which the UniServer software/hardware ecosystem will be evaluated.

4 UniServer Hardware
X-Gene2 server board: 8 cores at 2.4 GHz, 0.98 V, 28 nm; 8 MB LLC; 32 GB DDR3
X-Gene3 server board: 32 cores at 3 GHz, 0.87 V, 16 nm; 32 MB LLC; 128 GB DDR4

5 X-Gene 2/3 chip layout (source: HotChips 2014)
Cores are packed into PMDs (processor modules).
Each PMD has two cores and one 256 KB L2 cache shared between them.
Each core has a private 32 KB L1I and a private 32 KB L1D cache.
The L1 cache is write-through to L2.
The 8 MB L3 is shared among all cores.
Questions: What is the optimal strategy for allocating threads to cores for performance? Is this like SMT?

6 X-Gene2 microarchitecture block diagram
Branch predictor
4-wide superscalar
One pipeline for SIMD and floating-point instructions
One ALU for simple integer operations and one for complex integer operations

7 X-Gene voltage knobs
X-Gene2/3 servers offer three voltage domains: CPU, LLC and DRAM.
Hence the voltage can be optimized separately for the CPU, the LLC and the DRAM.
In addition, the DRAM refresh rate can be optimized.

8 UniServer Motivation - Dennard scaling
[Figure: power supply voltage vs. year, ITRS 2001 projections. Source: ITRS]
Voltage used to scale linearly with transistor dimensions, following Dennard's rule.

9 End of Dennard Scaling
[Figure: power supply voltage vs. year, ITRS 2001 projections vs. ITRS 2015 projections, annotated with a 15-year shift. Source: ITRS]
Voltage scaling has been limited since ~2005.
The ITRS 2001 projections turned out to be significantly off.

10 Limited Energy Efficiency Gains
Based on [HEsm.ISCA11, ITRS13, JKoo.AHC11].

11 Factors Limiting Voltage Scaling
Unpredictable issues at small technology nodes
Leakage
Threshold voltage (Vthreshold)
Voltage margins for process variations and voltage noise

12 Process Variations
[Figure: process variations across chips and intra-chip. CPU product names: X100, X200, X300. Frequency: Freq1 (2.7 GHz) < Freq2 (3 GHz) < Freq3 (3.3 GHz)]

13 UniServer approach
Depart from pessimistic operating points by revealing and exploiting, at runtime, the true capabilities of each CPU, DRAM, core, ...

14 Dealing with hardware variations across chips/DRAMs
How do we set the optimal voltage/refresh rate, at a given frequency, for each chip/DRAM individually?
Characterize each chip with V_MIN testing.
Apply a different voltage to each chip based on its V_MIN test results.

15 How to V_MIN test? Test many benchmarks
[Figure: V_MIN (mV) per benchmark: lu, mg, ep, is, cg, bt, sp, ua, dc]
The highest V_MIN across the benchmarks is the chip's V_MIN.
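A minimal sketch of the multi-benchmark approach, assuming a simple step-down search: the voltage is lowered in fixed steps until the workload first fails, and the last passing voltage is taken as that workload's V_MIN. The slides do not describe the search procedure, so the step size, the floor, the `passes_at` callback and all per-benchmark thresholds below are hypothetical stand-ins for real runs on the board.

```python
from typing import Callable, Dict


def find_vmin(passes_at: Callable[[int], bool], nominal_mv: int,
              step_mv: int = 10, floor_mv: int = 700) -> int:
    """Lower the voltage in fixed steps; return the lowest voltage (mV) that passed."""
    if not passes_at(nominal_mv):
        raise ValueError("workload fails even at nominal voltage")
    lowest_pass = nominal_mv
    v = nominal_mv - step_mv
    while v >= floor_mv and passes_at(v):
        lowest_pass = v
        v -= step_mv
    return lowest_pass


def chip_vmin(per_benchmark_vmin: Dict[str, int]) -> int:
    """The chip's V_MIN is the highest V_MIN observed over all benchmarks."""
    return max(per_benchmark_vmin.values())


# Hypothetical failure thresholds (mV); a real test runs the benchmark
# on the hardware and checks for crashes or wrong results instead.
HYPOTHETICAL_THRESHOLD_MV = {"lu": 905, "mg": 890, "ep": 870, "is": 860, "cg": 895}


def make_runner(threshold_mv: int) -> Callable[[int], bool]:
    """Simulated workload: passes whenever the voltage is at or above the threshold."""
    return lambda v: v >= threshold_mv


measured = {name: find_vmin(make_runner(t), nominal_mv=980)
            for name, t in HYPOTHETICAL_THRESHOLD_MV.items()}
print(measured, "chip V_MIN:", chip_vmin(measured))
```

The cost of this approach grows with the number of benchmarks, since each one needs its own step-down run.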

16 How to V_MIN test? Test one worst-case benchmark/stress test/virus
Craft a virus that creates worst-case conditions and do a V_MIN run for the virus only.
We need CPU, LLC (last-level cache) and DRAM viruses.
DRAM and LLC viruses: attempt to fill the SRAM/DRAM cells with patterns that maximize the probability of bit flips.
CPU viruses: attempt to maximize voltage noise. Note that this is different from maximizing power, as e.g. Prime95 does.

17 Factors Limiting Voltage Scaling
Unpredictable issues at small technology nodes
Leakage
Threshold voltage (Vthreshold)
Voltage margins for process variations and voltage noise

18 Voltage Noise in CPUs
Caused by sudden variations in CPU power consumption.
Voltage-noise sources:
Low-power techniques, e.g. clock gating and power gating
Pipeline flushes (e.g. due to branch misprediction) followed by high-power activity
Activity switching between high and low power at a rate equal to the PDN's first-order resonance frequency

19 Power Delivery Networks 101
[Figure: PDN voltage response vs. time (us), with annotations at ~10 ns, ~2 us and ~5 us]
The PDN reacts to a large current stimulus with a response that has three dominant frequencies (resonance frequencies).
The largest droop (the first-order resonance droop) occurs at ~10 ns.
Repeated large current (I) draws every 10 ns amplify the voltage noise. This is what voltage-noise (di/dt) viruses attempt to do.
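As a worked example of what "repeating the current draw every 10 ns" means for a virus loop, the sketch below converts a resonance period into a frequency and into a loop length in core cycles. It assumes, purely for illustration, that the ~10 ns period applies to a core clocked at 2.4 GHz or 3 GHz and that the loop spends half of its cycles in a high-power phase and half in a low-power phase; the function names are hypothetical.

```python
def resonance_frequency_mhz(period_ns: float) -> float:
    """Frequency in MHz of an oscillation with the given period in ns."""
    return 1e3 / period_ns


def loop_cycles(period_ns: float, core_ghz: float) -> int:
    """Core cycles that fit into one resonance period (GHz * ns = cycles)."""
    return round(period_ns * core_ghz)


def high_low_split(total_cycles: int) -> tuple:
    """Assumed 50/50 split between the high-power and low-power phases."""
    high = total_cycles // 2
    return high, total_cycles - high


for ghz in (2.4, 3.0):
    cycles = loop_cycles(10.0, ghz)
    print(f"{ghz} GHz: {cycles} cycles per period, split {high_low_split(cycles)}")
print("resonance frequency:", resonance_frequency_mhz(10.0), "MHz")
```

Under these assumptions a 10 ns period corresponds to 100 MHz, i.e. a loop of 24 cycles at 2.4 GHz or 30 cycles at 3 GHz.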

20 Why is voltage noise bad?
[Figure: CPU voltage (mV) vs. time (ns), showing the nominal voltage and a voltage droop below it]

21 How to deal with voltage noise?
[Figure: voltage levels: nominal voltage set by the manufacturer, extra margin, needed margin, V_MIN, intrinsic V_MIN]
Excessive margining leads to heat issues and energy inefficiency.
How can the unnecessary margin be shaved? Use di/dt stress tests: they are good stability tests for V_MIN testing because they drop the voltage very low.
The V_MIN of the virus can dictate the nominal voltage.

22 Voltage-noise (di/dt) virus development
[Figure: high-bandwidth voltage measurements via on-chip circuits and on-package measurement points. Source: ARM ISSCC 2015]
Developing di/dt viruses requires high-bandwidth voltage measurements to measure a virus's effectiveness and track progress. Otherwise one has to rely on V_MIN testing, which is very time-consuming.
Genetic algorithms (GA) are used to find the type and order of instructions that maximize voltage noise. Manual virus crafting is possible but time-consuming.

23 Genetic algorithm for di/dt virus generation
1. Generate the initial population (random assembly instruction sequences).
2. Measure the population (run each instruction sequence on the target machine and measure the voltage droop).
3. Select two parents (pick two of the fittest instruction sequences, i.e. those that cause the largest voltage droop).
4. Create two children by crossover (exchange instructions between the parents).
5. Mutate the children (randomly change some instructions).
6. Population size reached? If no, return to step 3; if yes, return to step 2 with the new population.
The whole process repeats until the results are satisfactory or the metric of interest stops improving over generations.
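The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's tool: the instruction pool and its relative power values are made up, `simulated_droop` stands in for running a sequence on the target and measuring the droop, parents are chosen by tournament selection, and the best sequence is copied unchanged into the next generation (elitism). None of those choices come from the slides.

```python
import random

# Hypothetical relative power per instruction class (illustrative values only).
POWER = {"nop": 0.1, "add": 0.4, "mul": 0.6, "load": 0.7, "fmul": 0.8, "simd_fma": 1.0}
POOL = list(POWER)


def simulated_droop(seq):
    """Stand-in fitness: rewards large power swings between consecutive
    instructions. A real run would execute `seq` in a loop on the target
    machine and return the measured voltage droop instead."""
    return sum(abs(POWER[a] - POWER[b]) for a, b in zip(seq, seq[1:]))


def pick_parent(scored, rng, k=3):
    """Tournament selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]


def evolve(pop_size=20, seq_len=16, generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of instruction sequences.
    population = [[rng.choice(POOL) for _ in range(seq_len)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: measure every individual.
        scored = [(simulated_droop(s), s) for s in population]
        best_score, best_seq = max(scored, key=lambda pair: pair[0])
        history.append(best_score)
        next_pop = [list(best_seq)]  # elitism (an assumption, not on the slide)
        while len(next_pop) < pop_size:  # step 6: population size reached?
            # Step 3: select two parents.
            p1, p2 = pick_parent(scored, rng), pick_parent(scored, rng)
            # Step 4: single-point crossover produces two children.
            cut = rng.randrange(1, seq_len)
            for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
                # Step 5: mutate each instruction with a small probability.
                next_pop.append([rng.choice(POOL) if rng.random() < mutation_rate else ins
                                 for ins in child])
        population = next_pop[:pop_size]
    return max(population, key=simulated_droop), history


best, history = evolve()
print("best sequence:", best)
print("fitness, first vs. last generation:", history[0], history[-1])
```

Because of the elitism step the best fitness never decreases from one generation to the next, which makes the stopping rule "the metric no longer improves" easy to check from `history`.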

24 GA di/dt virus vs. SPEC benchmarks on Cortex-A
[Figure: V_MIN (mV) and max voltage droop (mV) for the GA virus and the SPEC benchmarks]
The GA optimization was driven by high-bandwidth voltage measurements from an on-chip oscilloscope available on the Cortex-A72.
The chip has a nominal voltage of 1000 mV and the virus V_MIN is 850 mV, so there is potential to shave 150 mV.
The GA virus has a higher V_MIN and causes a larger voltage droop than the SPEC benchmarks.

25 Novel methodology for high-bandwidth voltage-noise measurements
[Figure: measurement setup with spectrum analyzer, antenna and CPU]
The X-Gene2 validation board doesn't support high-bandwidth voltage measurements, so an alternative approach for generating di/dt viruses had to be found.

26 Voltage-noise oscillations manifest as EM spikes
[Figure: Cortex-A72 cluster voltage (mV) vs. time (ns)]
Voltage oscillations with a period of 15 ns manifest as a spectrum spike at 67 MHz (1/15 ns).
Higher-amplitude EM signals => higher voltage noise.
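The link between a 15 ns oscillation and a 67 MHz spectrum spike can be reproduced with a synthetic signal. The sketch below is an assumption-laden illustration: the voltage trace is an ideal cosine with made-up mean and amplitude, sampled at a hypothetical 1 GS/s, and the spectrum is computed with a naive DFT rather than with a spectrum analyzer.

```python
import cmath
import math


def dominant_frequency_hz(samples, sample_rate_hz):
    """Return the frequency of the strongest non-DC component (naive DFT)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC level
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate_hz / n


# Synthetic trace: 950 mV mean, 30 mV swing, 15 ns period, one sample per ns,
# 300 samples = exactly 20 periods (all values are illustrative).
PERIOD_NS = 15
trace_mv = [950 + 30 * math.cos(2 * math.pi * t / PERIOD_NS) for t in range(300)]

peak_mhz = dominant_frequency_hz(trace_mv, sample_rate_hz=1e9) / 1e6
print(f"spectrum peak at {peak_mhz:.1f} MHz")
```

The peak lands at 1/15 ns, about 66.7 MHz, which the slide rounds to 67 MHz.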

27 GA driven by EM
[Figure: EM peak amplitude (nW) and max voltage droop (mV) vs. GA optimization iterations]
As the EM amplitude increases, the voltage droop increases.
A GA driven by EM amplitude converges to a di/dt virus.

28 EM virus on X-Gene2: V_MIN vs. NAS benchmarks
[Figure: V_MIN (mV) of the EM virus and the NAS benchmarks]
The EM virus has the highest V_MIN.

29 Exposing hardware heterogeneity with the EM virus
[Figure: stress-test voltage (mV) per chip for CHIP#1 (TTT), CHIP#2 (TSS) and CHIP#3 (TFF); levels shown: nominal, zero margin, mV margin (value lost), 20 mV margin]
We can lower the voltage on chips #1 and #2 to save power.
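A minimal sketch of how per-chip stress-test results could be turned into per-chip voltage settings, assuming the operating voltage is the virus V_MIN plus a fixed safety margin, capped at nominal. The 980 mV nominal is the X-Gene2 value from the hardware slide and the 20 mV margin is one of the levels in the figure; the per-chip V_MIN values are hypothetical and chosen only so that chips #1 and #2 have headroom while chip #3 does not.

```python
NOMINAL_MV = 980  # X-Gene2 nominal voltage (0.98 V)


def operating_voltage_mv(virus_vmin_mv: int, margin_mv: int,
                         nominal_mv: int = NOMINAL_MV) -> int:
    """Virus V_MIN plus a safety margin, never above the nominal voltage."""
    return min(nominal_mv, virus_vmin_mv + margin_mv)


# Hypothetical per-chip virus V_MIN values in mV (not taken from the slides).
HYPOTHETICAL_VMIN_MV = {"CHIP#1": 900, "CHIP#2": 920, "CHIP#3": 970}

settings = {chip: operating_voltage_mv(vmin, margin_mv=20)
            for chip, vmin in HYPOTHETICAL_VMIN_MV.items()}
savings = {chip: NOMINAL_MV - v for chip, v in settings.items()}

for chip in settings:
    print(f"{chip}: run at {settings[chip]} mV, {savings[chip]} mV below nominal")
```

With these made-up numbers, chips #1 and #2 can run 60 mV and 40 mV below nominal, while chip #3 stays at nominal.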

30 Overall savings by undervolting all components
[Figure: power (W) per component (PMD, LLC, DRAM, total), nominal vs. undervolted]
PMD voltage reduced by 50 mV
LLC voltage reduced by 30 mV
DRAM refresh rate relaxed 35x
20% overall power savings
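For intuition on why a 50 mV reduction matters, the sketch below applies the first-order CMOS model in which dynamic power scales with the square of the supply voltage at a fixed frequency. This is a textbook approximation, not the measurement behind the slide: it ignores leakage and everything outside the PMD domain, and it assumes the 0.98 V X-Gene2 nominal as the starting point.

```python
def dynamic_power_ratio(v_new_mv: float, v_nominal_mv: float) -> float:
    """P_dyn is proportional to V^2 * f; at fixed f the ratio is (V_new / V_nom)^2."""
    return (v_new_mv / v_nominal_mv) ** 2


NOMINAL_MV = 980         # X-Gene2 nominal (0.98 V)
PMD_REDUCTION_MV = 50    # PMD undervolting from the slide

pmd_ratio = dynamic_power_ratio(NOMINAL_MV - PMD_REDUCTION_MV, NOMINAL_MV)
pmd_saving = 1.0 - pmd_ratio
print(f"estimated PMD dynamic power saving: {pmd_saving:.1%}")
```

Under this model the PMD's dynamic power drops by roughly 10%; the 20% overall figure on the slide also includes the LLC voltage reduction and the relaxed DRAM refresh rate.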


More information

System Level Analysis of Fast, Per-Core DVFS using On-Chip Switching Regulators

System Level Analysis of Fast, Per-Core DVFS using On-Chip Switching Regulators System Level Analysis of Fast, Per-Core DVFS using On-Chip Switching s Wonyoung Kim, Meeta S. Gupta, Gu-Yeon Wei and David Brooks School of Engineering and Applied Sciences, Harvard University, 33 Oxford

More information

Using Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems

Using Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems Using Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems Eric Rotenberg Center for Embedded Systems Research (CESR) Department of Electrical & Computer Engineering North

More information

Worst Case RLC Noise with Timing Window Constraints

Worst Case RLC Noise with Timing Window Constraints Worst Case RLC Noise with Timing Window Constraints Jun Chen Electrical Engineering Department University of California, Los Angeles jchen@ee.ucla.edu Lei He Electrical Engineering Department University

More information

A Switched Decoupling Capacitor Circuit for On-Chip Supply Resonance Damping

A Switched Decoupling Capacitor Circuit for On-Chip Supply Resonance Damping A Switched Decoupling Capacitor Circuit for On-Chip Supply Resonance Damping Jie Gu, Hanyong Eom and Chris H. Kim Department of Electrical and Computer Engineering University of Minnesota, Minneapolis

More information

Novel Low-Overhead Operand Isolation Techniques for Low-Power Datapath Synthesis

Novel Low-Overhead Operand Isolation Techniques for Low-Power Datapath Synthesis Novel Low-Overhead Operand Isolation Techniques for Low-Power Datapath Synthesis N. Banerjee, A. Raychowdhury, S. Bhunia, H. Mahmoodi, and K. Roy School of Electrical and Computer Engineering, Purdue University,

More information

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012 ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012 Lecture 5: Termination, TX Driver, & Multiplexer Circuits Sam Palermo Analog & Mixed-Signal Center Texas A&M University Announcements

More information

Resource Allocation Strategies Based on the Signal-to-Leakage-plus-Noise Ratio in LTE-A CoMP Systems

Resource Allocation Strategies Based on the Signal-to-Leakage-plus-Noise Ratio in LTE-A CoMP Systems Resource Allocation Strategies Based on the Signal-to-Leakage-plus-Noise Ratio in LTE-A CoMP Systems Rana A. Abdelaal Mahmoud H. Ismail Khaled Elsayed Cairo University, Egypt 4G++ Project 1 Agenda Motivation

More information

Simulation and Measurement of an On-Die Power-Gated Power Delivery System

Simulation and Measurement of an On-Die Power-Gated Power Delivery System DesignCon 2010 Simulation and Measurement of an On-Die Power-Gated Power Delivery System Jimmy Huang, Intel [jimmy.huat.since.huang@intel.com, (+604)-2532385] Tan Fern Nee, Intel [fern.nee.tan@intel.com,

More information

Analog and RF circuit techniques in nanometer CMOS

Analog and RF circuit techniques in nanometer CMOS Analog and RF circuit techniques in nanometer CMOS Bram Nauta University of Twente The Netherlands http://icd.ewi.utwente.nl b.nauta@utwente.nl UNIVERSITY OF TWENTE. Outline Introduction Balun-LNA-Mixer

More information

Energy Efficient Circuit Design and the Future of Power Delivery

Energy Efficient Circuit Design and the Future of Power Delivery Energy Efficient Circuit Design and the Future of Power Delivery Greg Taylor EPEPS 2009 Outline Looking back Energy efficiency in CMOS Side effects Suggestions Conclusion 2 Looking Back Microprocessor

More information

Static Power and the Importance of Realistic Junction Temperature Analysis

Static Power and the Importance of Realistic Junction Temperature Analysis White Paper: Virtex-4 Family R WP221 (v1.0) March 23, 2005 Static Power and the Importance of Realistic Junction Temperature Analysis By: Matt Klein Total power consumption of a board or system is important;

More information

Polarization Optimized PMD Source Applications

Polarization Optimized PMD Source Applications PMD mitigation in 40Gb/s systems Polarization Optimized PMD Source Applications As the bit rate of fiber optic communication systems increases from 10 Gbps to 40Gbps, 100 Gbps, and beyond, polarization

More information

Intelligent Systems Group Department of Electronics. An Evolvable, Field-Programmable Full Custom Analogue Transistor Array (FPTA)

Intelligent Systems Group Department of Electronics. An Evolvable, Field-Programmable Full Custom Analogue Transistor Array (FPTA) Department of Electronics n Evolvable, Field-Programmable Full Custom nalogue Transistor rray (FPT) Outline What`s Behind nalog? Evolution Substrate custom made configurable transistor array (FPT) Ways

More information

A Low-Power SRAM Design Using Quiet-Bitline Architecture

A Low-Power SRAM Design Using Quiet-Bitline Architecture A Low-Power SRAM Design Using uiet-bitline Architecture Shin-Pao Cheng Shi-Yu Huang Electrical Engineering Department National Tsing-Hua University, Taiwan Abstract This paper presents a low-power SRAM

More information

Fast Placement Optimization of Power Supply Pads

Fast Placement Optimization of Power Supply Pads Fast Placement Optimization of Power Supply Pads Yu Zhong Martin D. F. Wong Dept. of Electrical and Computer Engineering Dept. of Electrical and Computer Engineering Univ. of Illinois at Urbana-Champaign

More information

DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS

DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS Aman Chaudhary, Md. Imtiyaz Chowdhary, Rajib Kar Department of Electronics and Communication Engg. National Institute of Technology,

More information

The challenges of low power design Karen Yorav

The challenges of low power design Karen Yorav The challenges of low power design Karen Yorav The challenges of low power design What this tutorial is NOT about: Electrical engineering CMOS technology but also not Hand waving nonsense about trends

More information

Appendix. RF Transient Simulator. Page 1

Appendix. RF Transient Simulator. Page 1 Appendix RF Transient Simulator Page 1 RF Transient/Convolution Simulation This simulator can be used to solve problems associated with circuit simulation, when the signal and waveforms involved are modulated

More information

LOW COST PHASED ARRAY ANTENNA TRANSCEIVER FOR WPAN APPLICATIONS

LOW COST PHASED ARRAY ANTENNA TRANSCEIVER FOR WPAN APPLICATIONS LOW COST PHASED ARRAY ANTENNA TRANSCEIVER FOR WPAN APPLICATIONS Introduction WPAN (Wireless Personal Area Network) transceivers are being designed to operate in the 60 GHz frequency band and will mainly

More information

Low Power Design of Successive Approximation Registers

Low Power Design of Successive Approximation Registers Low Power Design of Successive Approximation Registers Rabeeh Majidi ECE Department, Worcester Polytechnic Institute, Worcester MA USA rabeehm@ece.wpi.edu Abstract: This paper presents low power design

More information

Development of a 20 GS/s Sampling Chip in 130nm CMOS Technology

Development of a 20 GS/s Sampling Chip in 130nm CMOS Technology Development of a 20 GS/s Sampling Chip in 130nm CMOS Technology 2009 IEEE Nuclear Science Symposium, Orlando, Florida, October 28 th 2009 Jean-Francois Genat On behalf of Mircea Bogdan 1, Henry J. Frisch

More information

Adaptive Guardband Scheduling to Improve System-Level Efficiency of the POWER7+

Adaptive Guardband Scheduling to Improve System-Level Efficiency of the POWER7+ Adaptive Guardband Scheduling to Improve System-Level Efficiency of the POWER7+ Yazhou Zu 1, Charles R. Lefurgy, Jingwen Leng 1, Matthew Halpern 1, Michael S. Floyd, Vijay Janapa Reddi 1 1 The University

More information

HIGH-SPEED LOW-POWER ON-CHIP GLOBAL SIGNALING DESIGN OVERVIEW. Xi Chen, John Wilson, John Poulton, Rizwan Bashirullah, Tom Gray

HIGH-SPEED LOW-POWER ON-CHIP GLOBAL SIGNALING DESIGN OVERVIEW. Xi Chen, John Wilson, John Poulton, Rizwan Bashirullah, Tom Gray HIGH-SPEED LOW-POWER ON-CHIP GLOBAL SIGNALING DESIGN OVERVIEW Xi Chen, John Wilson, John Poulton, Rizwan Bashirullah, Tom Gray Agenda Problems of On-chip Global Signaling Channel Design Considerations

More information

CS Computer Architecture Spring Lecture 04: Understanding Performance

CS Computer Architecture Spring Lecture 04: Understanding Performance CS 35101 Computer Architecture Spring 2008 Lecture 04: Understanding Performance Taken from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [Adapted from Computer Organization and Design, Patterson

More information

Advanced Digital Design

Advanced Digital Design Advanced Digital Design Introduction & Motivation by A. Steininger and M. Delvai Vienna University of Technology Outline Challenges in Digital Design The Role of Time in the Design The Fundamental Design

More information

Low Transistor Variability The Key to Energy Efficient ICs

Low Transistor Variability The Key to Energy Efficient ICs Low Transistor Variability The Key to Energy Efficient ICs 2 nd Berkeley Symposium on Energy Efficient Electronic Systems 11/3/11 Robert Rogenmoser, PhD 1 BEES_roro_G_111103 Copyright 2011 SuVolta, Inc.

More information

A 3-10GHz Ultra-Wideband Pulser

A 3-10GHz Ultra-Wideband Pulser A 3-10GHz Ultra-Wideband Pulser Jan M. Rabaey Simone Gambini Davide Guermandi Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2006-136 http://www.eecs.berkeley.edu/pubs/techrpts/2006/eecs-2006-136.html

More information

Supporting x86-64 Address Translation for 100s of GPU Lanes. Jason Power, Mark D. Hill, David A. Wood

Supporting x86-64 Address Translation for 100s of GPU Lanes. Jason Power, Mark D. Hill, David A. Wood Supporting x86-64 Address Translation for 100s of GPU s Jason Power, Mark D. Hill, David A. Wood Summary Challenges: CPU&GPUs physically integrated, but logically separate; This reduces theoretical bandwidth,

More information

Pushing Ultra-Low-Power Digital Circuits

Pushing Ultra-Low-Power Digital Circuits Pushing Ultra-Low-Power Digital Circuits into the Nanometer Era David Bol Microelectronics Laboratory Ph.D public defense December 16, 2008 Pushing Ultra-Low-Power Digital Circuits into the Nanometer Era

More information

Dr. Ralf Sommer. Munich, March 8th, 2006 COM BTS DAT DF AMF. Presenter Dept Titel presentation Date Page 1

Dr. Ralf Sommer. Munich, March 8th, 2006 COM BTS DAT DF AMF. Presenter Dept Titel presentation Date Page 1 DATE 2006 Special Session: DFM/DFY Design for Manufacturability and Yield - Influence of Process Variations in Digital, Analog and Mixed-Signal Circuit Design DATE 06 Munich, March 8th, 2006 Presenter

More information

On the Rules of Low-Power Design

On the Rules of Low-Power Design On the Rules of Low-Power Design (and Why You Should Break Them) Prof. Todd Austin University of Michigan austin@umich.edu A long time ago, in a not so far away place The Rules of Low-Power Design P =

More information

Domino Static Gates Final Design Report

Domino Static Gates Final Design Report Domino Static Gates Final Design Report Krishna Santhanam bstract Static circuit gates are the standard circuit devices used to build the major parts of digital circuits. Dynamic gates, such as domino

More information

A Novel approach for Optimizing Cross Layer among Physical Layer and MAC Layer of Infrastructure Based Wireless Network using Genetic Algorithm

A Novel approach for Optimizing Cross Layer among Physical Layer and MAC Layer of Infrastructure Based Wireless Network using Genetic Algorithm A Novel approach for Optimizing Cross Layer among Physical Layer and MAC Layer of Infrastructure Based Wireless Network using Genetic Algorithm Vinay Verma, Savita Shiwani Abstract Cross-layer awareness

More information

Status and Prospect for MRAM Technology

Status and Prospect for MRAM Technology Status and Prospect for MRAM Technology Dr. Saied Tehrani Nonvolatile Memory Seminar Hot Chips Conference August 22, 2010 Memorial Auditorium Stanford University Everspin Technologies, Inc. - 2010 Agenda

More information

Power Management in Multicore Processors through Clustered DVFS

Power Management in Multicore Processors through Clustered DVFS Power Management in Multicore Processors through Clustered DVFS A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Tejaswini Kolpe IN PARTIAL FULFILLMENT OF THE

More information