ECE 4750 Computer Architecture, Fall 2016 T09 Advanced Processors: Superscalar Execution
|
|
- Alban Snow
- 1 years ago
- Views:
Transcription
1 ECE 4750 Computer Architecture, Fall 2016 T09 Advanced Processors: Superscalar Execution School of Electrical and Computer Engineering Cornell University revision: In-Order Dual-Issue Superscalar TinyRV1 Processor 2 2 Superscalar Pipeline Hazards RAW Hazards Control Hazards Structural Hazards WAW and WAR Name Hazards Analyzing Performance of Superscalar Processors 11 1
2 1. In-Order Dual-Issue Superscalar TinyRV1 Processor 1. In-Order Dual-Issue Superscalar TinyRV1 Processor Processors studied so far are fundamentally limited to CPI >= 1 Superscalar processors enable CPI < 1 (i.e., IPC > 1) by executing multiple instructions in parallel Can have both in-order and out-of-order superscalar processors, but we will start by exploring in-order Continue to assume combinational memories F Stage : fetch two instructions at once D Stage : 4 read ports, decode 2 inst, issue inst to correct pipe X/M Stage : separate into A and B pipes (see next page) W Stage : 2 write ports 2
3 1. In-Order Dual-Issue Superscalar TinyRV1 Processor More abstract way to illustrate same dual-issue superscalar pipeline F 2 D A0 B0 A1 2 W 2 B1 Different instructions use the A-pipe and/or the B-pipe add addi mul lw sw jal jr bne A-Pipe B-Pipe Example pipeline diagram for dual-issue superscalar processor addi x3, x4, 1 addi x5, x6, 1 mul x7, x8, x9 mul x10, x11, x12 addi x13, x14, 1 Multiple instructions in stages F, D, W allowed because superscalar processor has duplicated hardware to avoid structural hazards Fetch Block group of instructions fetched as unit Swizzle instructions swapped from natural fetch position to appropriate execution pipe 3
4 2. Superscalar Pipeline Hazards 2.1. RAW Hazards 2. Superscalar Pipeline Hazards Seems so easy, but why is pipelining hard? RAW Hazards Control Hazards Structural Hazards WAR/WAR Name Hazards 2.1. RAW Hazards Let s first assume we only use stalling to resolve RAW hazards addi x3, x4, 1 add x5, x1, x3 addi x6, x5, 1 addi x7, x8, 1 addi x9, x8, 1 A fully-bypassed superscalar processor is possible, but expensive 4
5 2. Superscalar Pipeline Hazards 2.1. RAW Hazards Revisit previous assembly sequence with full bypassing addi x3, x4, 1 add x5, x1, x3 addi x6, x5, 1 addi x7, x8, 1 addi x9, x8, 1 Activity: Draw a pipeline diagram for following instruction sequence. Include all microarchitectural dependency arrows. lw lw x3, 0(x4) x5, 0(x3) addi x6, x7, 1 addi x8, x5, 1 addi x9, x8, 1 5
6 2. Superscalar Pipeline Hazards 2.2. Control Hazards 2.2. Control Hazards Consider following two static instruction sequences. 1 0x x1004 jal x0, foo foo: 5 0x2000 addi x3, x4, 1 6 0x2004 addi x5, x6, 1 1 # assume R[x1]!= R[x2] 2 0x1000 bne x1, x2, foo foo: 5 0x2000 addi x3, x4, 1 6 0x2004 addi x5, x6, 1 Pipeline diagram for left sequence. Jumps are resolved in D stage. Pipeline diagram for right sequence. Branches are resolved in A0 stage. 6
7 2. Superscalar Pipeline Hazards 2.2. Control Hazards Unaligned fetch blocks Consider the following static instruction sequence 1 0x000 opa 2 0x004 opb 3 0x008 opc 4 0x00c jal x0, 0x x100 opd 7 0x104 jal x0, 0x x204 ope 10 0x208 jal x0, 0x30c x30c opf 13 0x310 opg 14 0x314 oph Layout of fetch blocks in instruction cache. Numbers indicate which instructions belong to which fetch block. Unaligned fetch blocks within a cache line are challenging Unaligned fetch blocks across cache lines are very challenging 7
8 2. Superscalar Pipeline Hazards 2.2. Control Hazards Aligned fetch blocks Only fetch aligned fetch blocks, possibly discarding first instruction. Reconsider the same static instruction sequence 1 0x000 opa 2 0x004 opb 3 0x008 opc 4 0x00c jal x0, 0x x100 opd 7 0x104 jal x0, 0x x204 ope 10 0x208 jal x0, 0x30c x30c opf 13 0x310 opg 14 0x314 oph Layout of fetch blocks in instruction cache. Numbers indicate which instructions belong to which fetch block. 8
9 2. Superscalar Pipeline Hazards 2.2. Control Hazards Supporting precise exceptions Consider following instruction sequence. Assume commit point is in the A1/B1 stage and the xxx instruction causes an illegal instruction exception originating in the D stage. 1 add x1, x2, x3 2 xxx # causes illegal instruction exception 3 addi x4, x5, 1 4 addi x6, x7, exception_handler: 7 opx 8 opy 9 opz What if add caused an arithmetic overflow exception? 9
10 2. Superscalar Pipeline Hazards 2.3. Structural Hazards 2.3. Structural Hazards Structural hazards are not possible in the canonical single-issue TinyRV1 pipeline, but structural hazards are possible in the canonical dual-issue TinyRV1 pipeline if two instructions in the same fetch block want to use the same pipe. mul x1, x2, x3 mul x4, x5, x6 lw sw x7, 0(x8) x9, 0(x10) 2.4. WAW and WAR Name Hazards WAW name hazards are not possible in the canonical single-issue TinyRV1 pipeline, but WAW name hazards are possible in the canonical dual-issue TinyRV1 pipeline if two instructions in the same fetch block write the same register. addi x1, x3, 1 WAR name hazards are not possible in the canonical single-issue TinyRV1 pipeline. Are WAR name hazards possible in the canonical dual-issue TinyRV1 pipeline? addi x2, x3, 1 10
11 3. Analyzing Performance of Superscalar Processors 3. Analyzing Performance of Superscalar Processors Consider the classic vector-vector add loop over arrays with 64 elements. This loop has a CPI of 1.33 on the canonical single-issue TinyRV1 processor. What is the CPI on the canonical dual-issue TinyRV1 processor? loop: lw x5, 0(x13) lw x6, 0(x14) add x7, x5, x6 sw x7, 0(x12) addi x13, x12, 4 addi x14, x14, 4 addi x12, x12, 4 addi x15, x15, -1 bne x15, x0, loop jr x1 lw lw add sw addi addi addi addi bne 11
Chapter 16 - Instruction-Level Parallelism and Superscalar Processors
Chapter 16 - Instruction-Level Parallelism and Superscalar Processors Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ L. Tarrataca Chapter 16 - Superscalar Processors 1 / 78 Table of Contents I 1 Overview
Lecture 4: Introduction to Pipelining
Lecture 4: Introduction to Pipelining Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder
CMP 301B Computer Architecture. Appendix C
CMP 301B Computer Architecture Appendix C Dealing with Exceptions What should be done when an exception arises and many instructions are in the pipeline??!! Force a trap instruction in the next IF stage
Dynamic Scheduling I
basic pipeline started with single, in-order issue, single-cycle operations have extended this basic pipeline with multi-cycle operations multiple issue (superscalar) now: dynamic scheduling (out-of-order
IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps
CSE 30321 Computer Architecture I Fall 2011 Homework 06 Pipelined Processors 75 points Assigned: November 1, 2011 Due: November 8, 2011 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (15 points)
CSE502: Computer Architecture CSE 502: Computer Architecture
CSE 502: Computer Architecture Out-of-Order Execution and Register Rename In Search of Parallelism rivial Parallelism is limited What is trivial parallelism? In-order: sequential instructions do not have
IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps
CSE 30321 Computer Architecture I Fall 2010 Homework 06 Pipelined Processors 85 points Assigned: November 2, 2010 Due: November 9, 2010 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (25 points)
EN164: Design of Computing Systems Lecture 22: Processor / ILP 3
EN164: Design of Computing Systems Lecture 22: Processor / ILP 3 Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
COSC4201. Scoreboard
COSC4201 Scoreboard Prof. Mokhtar Aboelaze York University Based on Slides by Prof. L. Bhuyan (UCR) Prof. M. Shaaban (RIT) 1 Overcoming Data Hazards with Dynamic Scheduling In the pipeline, if there is
Project 5: Optimizer Jason Ansel
Project 5: Optimizer Jason Ansel Overview Project guidelines Benchmarking Library OoO CPUs Project Guidelines Use optimizations from lectures as your arsenal If you decide to implement one, look at Whale
CSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-11 CPU Design : Pipelining-2 Review, Hazards Shakil M. Khan IF for Load (Review) CSE-2021 July-14-2011 2 ID for Load (Review) CSE-2021 July-14-2011 3 EX for Load
ECE 2300 Digital Logic & Computer Organization. More Pipelined Microprocessor
ECE 2300 Digital ogic & Computer Organization Spring 2018 ore Pipelined icroprocessor ecture 18: 1 nnouncements No instructor office hour today Rescheduled to onday pril 16, 4:00-5:30pm Prelim 2 review
Computer Science 246. Advanced Computer Architecture. Spring 2010 Harvard University. Instructor: Prof. David Brooks
Advanced Computer Architecture Spring 2010 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture Outline Instruction-Level Parallelism Scoreboarding (A.8) Instruction Level Parallelism
A B C D. Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold. Time
Pipelining Readings: 4.5-4.8 Example: Doing the laundry A B C D Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes
Single vs. Mul2- cycle MIPS. Single Clock Cycle Length
Single vs. Mul2- cycle MIPS Single Clock Cycle Length Suppose we have 2ns 2ns ister read 2ns ister write 2ns ory read 2ns ory write 2ns 2ns What is the clock cycle length? 1 Single Cycle Length Worst case
EECS 470. Tomasulo s Algorithm. Lecture 4 Winter 2018
omasulo s Algorithm Winter 2018 Slides developed in part by Profs. Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Martin, Roth, Shen, Smith, Sohi, yson, Vijaykumar, and Wenisch of Carnegie Mellon University,
CS521 CSE IITG 11/23/2012
Parallel Decoding and issue Parallel execution Preserving the sequential consistency of execution and exception processing 1 slide 2 Decode/issue data Issue bound fetch Dispatch bound fetch RS RS RS RS
Lecture 8-1 Vector Processors 2 A. Sohn
Lecture 8-1 Vector Processors Vector Processors How many iterations does the following loop go through? For i=1 to n do A[i] = B[i] + C[i] Sequential Processor: n times. Vector processor: 1 instruction!
SATSim: A Superscalar Architecture Trace Simulator Using Interactive Animation
SATSim: A Superscalar Architecture Trace Simulator Using Interactive Animation Mark Wolff Linda Wills School of Electrical and Computer Engineering Georgia Institute of Technology {wolff,linda.wills}@ece.gatech.edu
On the Rules of Low-Power Design
On the Rules of Low-Power Design (and Why You Should Break Them) Prof. Todd Austin University of Michigan austin@umich.edu A long time ago, in a not so far away place The Rules of Low-Power Design P =
CSE502: Computer Architecture CSE 502: Computer Architecture
CSE 502: Computer Architecture Out-of-Order Schedulers Data-Capture Scheduler Dispatch: read available operands from ARF/ROB, store in scheduler Commit: Missing operands filled in from bypass Issue: When
CMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Compute Achitectue Pipelining Some mateial adapted fom Mohamed Younis, UMBC CMSC 611 Sp 2003 couse slides Some mateial adapted fom Hennessy & Patteson / 2003 Elsevie Science Pipeline
Parallel architectures Electronic Computers LM
Parallel architectures Electronic Computers LM 1 Architecture Architecture: functional behaviour of a computer. For instance a processor which executes DLX code Implementation: a logical network implementing
Computer Hardware. Pipeline
Computer Hardware Pipeline Conventional Datapath 2.4 ns is required to perform a single operation (i.e. 416.7 MHz). Register file MUX B 0.6 ns Clock 0.6 ns 0.2 ns Function unit 0.8 ns MUX D 0.2 ns c. Production
Dynamic Scheduling II
so far: dynamic scheduling (out-of-order execution) Scoreboard omasulo s algorithm register renaming: removing artificial dependences (WAR/WAW) now: out-of-order execution + precise state advanced topic:
CISC 662 Graduate Computer Architecture. Lecture 9 - Scoreboard
CISC 662 Graduate Computer Architecture Lecture 9 - Scoreboard Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture tes from John Hennessy and David Patterson s: Computer
Single-Cycle CPU The following exercises are taken from Hennessy and Patterson, CO&D 2 nd, 3 rd, and 4 th Ed.
EE 357 Homework 7 Redekopp Name: Lec: 9:30 / 11:00 Score: Submit answers via Blackboard for all problems except 5.) and 6.). For those questions, submit a hardcopy with your answers, diagrams, circuit
EE 457 Homework 5 Redekopp Name: Score: / 100_
EE 457 Homework 5 Redekopp Name: Score: / 100_ Single-Cycle CPU The following exercises are taken from Hennessy and Patterson, CO&D 2 nd, 3 rd, and 4 th Ed. 1.) (6 pts.) Review your class notes. a. Is
EECS150 - Digital Design Lecture 2 - Synchronous Digital Systems Review Part 1. Outline
EECS5 - Digital Design Lecture 2 - Synchronous Digital Systems Review Part January 2, 2 John Wawrzynek Electrical Engineering and Computer Sciences University of California, Berkeley http://www-inst.eecs.berkeley.edu/~cs5
ECE473 Computer Architecture and Organization. Pipeline: Introduction
Computer Architecture and Organization Pipeline: Introduction Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 11.1 The Laundry Analogy Student A,
Computer Architecture
Computer Architecture An Introduction Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
Compiler Optimisation
Compiler Optimisation 6 Instruction Scheduling Hugh Leather IF 1.18a hleather@inf.ed.ac.uk Institute for Computing Systems Architecture School of Informatics University of Edinburgh 2018 Introduction This
Ramon Canal NCD Master MIRI. NCD Master MIRI 1
Wattch, Hotspot, Hotleakage, McPAT http://www.eecs.harvard.edu/~dbrooks/wattch-form.html http://lava.cs.virginia.edu/hotspot http://lava.cs.virginia.edu/hotleakage http://www.hpl.hp.com/research/mcpat/
Instructor: Dr. Mainak Chaudhuri. Instructor: Dr. S. K. Aggarwal. Instructor: Dr. Rajat Moona
NPTEL Online - IIT Kanpur Instructor: Dr. Mainak Chaudhuri Instructor: Dr. S. K. Aggarwal Course Name: Department: Program Optimization for Multi-core Architecture Computer Science and Engineering IIT
Multiple Predictors: BTB + Branch Direction Predictors
Constructive Computer Architecture: Branch Prediction: Direction Predictors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October 28, 2015 http://csg.csail.mit.edu/6.175
FMP For More Practice
FP 6.-6 For ore Practice Labeling Pipeline Diagrams with 6.5 [2] < 6.3> To understand how pipeline works, let s consider these five instructions going through the pipeline: lw $, 2($) sub $, $2, $3 and
Performance Evaluation of Recently Proposed Cache Replacement Policies
University of Jordan Computer Engineering Department Performance Evaluation of Recently Proposed Cache Replacement Policies CPE 731: Advanced Computer Architecture Dr. Gheith Abandah Asma Abdelkarim January
MIT OpenCourseWare Multicore Programming Primer, January (IAP) Please use the following citation format:
MIT OpenCourseWare http://ocw.mit.edu 6.189 Multicore Programming Primer, January (IAP) 2007 Please use the following citation format: Rodric Rabbah, 6.189 Multicore Programming Primer, January (IAP) 2007.
Metrics How to improve performance? CPI MIPS Benchmarks CSC3501 S07 CSC3501 S07. Louisiana State University 4- Performance - 1
Performance of Computer Systems Dr. Arjan Durresi Louisiana State University Baton Rouge, LA 70810 Durresi@Csc.LSU.Edu LSUEd These slides are available at: http://www.csc.lsu.edu/~durresi/csc3501_07/ Louisiana
COTSon: Infrastructure for system-level simulation
COTSon: Infrastructure for system-level simulation Ayose Falcón, Paolo Faraboschi, Daniel Ortega HP Labs Exascale Computing Lab http://sites.google.com/site/hplabscotson MICRO-41 tutorial November 9, 28
Memory-Level Parallelism Aware Fetch Policies for Simultaneous Multithreading Processors
Memory-Level Parallelism Aware Fetch Policies for Simultaneous Multithreading Processors STIJN EYERMAN and LIEVEN EECKHOUT Ghent University A thread executing on a simultaneous multithreading (SMT) processor
How a processor can permute n bits in O(1) cycles
How a processor can permute n bits in O(1) cycles Ruby Lee, Zhijie Shi, Xiao Yang Princeton Architecture Lab for Multimedia and Security (PALMS) Department of Electrical Engineering Princeton University
4.4 Implementation Structures in FPGAs and DSPs. Presented by Lee Pucker President, ForwardLink Consulting
4.4 Implementation Structures in FPGAs and DSPs Presented by Lee Pucker President, ForwardLink Consulting Agenda Case Study on Implementation Structures Synchronization in a GSM Network Option 1: DSP Implementation
CSE502: Computer Architecture Welcome to CSE 502
Welcome to CSE 502 Introduction & Review Today s Lecture Course Overview Course Topics Grading Logistics Academic Integrity Policy Homework Quiz Key basic concepts for Computer Architecture Course Overview
High-Performance Pipelined Architecture of Elliptic Curve Scalar Multiplication Over GF(2 m )
High-Performance Pipelined Architecture of Elliptic Curve Scalar Multiplication Over GF(2 m ) Abstract: This paper proposes an efficient pipelined architecture of elliptic curve scalar multiplication (ECSM)
RedPitaya. FPGA memory map
RedPitaya FPGA memory map Written by Revision Description Version Date Matej Oblak Initial 0.1 08/11/13 Matej Oblak Release1 update 0.2 16/12/13 Matej Oblak ASG - added burst mode ASG - buffer read pointer
Dual-Band Wireless DPDT RF Switch
Dual-Band Wireless DPDT RF Switch RF SWITCH CG2164X3 DESCRIPTION The CG2164X3 is a GaAs MMIC DPDT (Double Pole Double Throw) switch for 2.5 GHz and 6 GHz dual-band wireless LAN applications PACKAGE 6-pin
Using ST6 analog inputs for multiple key decoding
AN431 Application note Using ST6 analog inputs for multiple key decoding INTRODUCTION The ST6 on-chip Analog to Digital Converter (ADC) is a useful peripheral integrated into the silicon of the ST6 family
Computer Architecture and Organization:
Computer Architecture and Organization: L03: Register transfer and System Bus By: A. H. Abdul Hafez Abdul.hafez@hku.edu.tr, ah.abdulhafez@gmail.com 1 CAO, by Dr. A.H. Abdul Hafez, CE Dept. HKU Outlines
Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing
Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing Paper by: Wajahat Qadeer Rehan Hameed Ofer Shacham Preethi Venkatesan Christos Kozyrakis Mark Horowitz Presentation by:
CZ3001 ADVANCED COMPUTER ARCHITECTURE
CZ3001 ADVANCED COMPUTER ARCHITECTURE Lab 3 Report Abstract Pipelining is a process in which successive steps of an instruction sequence are executed in turn by a sequence of modules able to operate concurrently,
CS4617 Computer Architecture
1/26 CS4617 Computer Architecture Lecture 2 Dr J Vaughan September 10, 2014 2/26 Amdahl s Law Speedup = Execution time for entire task without using enhancement Execution time for entire task using enhancement
CSEN 601: Computer System Architecture Summer 2014
CSEN 601: Cmputer System Architecture Summer 2014 Practice Assignment 7 Slutin Exercise 7-1: Based n the MIPS pipeline implementatin yu studied, what are the cntrl signals that have t be stred in the ID/EX
Game Architecture. 4/8/16: Multiprocessor Game Loops
Game Architecture 4/8/16: Multiprocessor Game Loops Monolithic Dead simple to set up, but it can get messy Flow-of-control can be complex Top-level may have too much knowledge of underlying systems (gross
AN EFFICIENT ALGORITHM FOR THE REMOVAL OF IMPULSE NOISE IN IMAGES USING BLACKFIN PROCESSOR
AN EFFICIENT ALGORITHM FOR THE REMOVAL OF IMPULSE NOISE IN IMAGES USING BLACKFIN PROCESSOR S. Preethi 1, Ms. K. Subhashini 2 1 M.E/Embedded System Technologies, 2 Assistant professor Sri Sai Ram Engineering
Reconfigurable High Performance Baugh-Wooley Multiplier for DSP Applications
Reconfigurable High Performance Baugh-Wooley Multiplier for DSP Applications Joshin Mathews Joseph & V.Sarada Department of Electronics and Communication Engineering, SRM University, Kattankulathur, Chennai,
Flexibility, Speed and Accuracy in VLIW Architectures Simulation and Modeling
Flexibility, Speed and Accuracy in VLIW Architectures Simulation and Modeling IVANO BARBIERI, MASSIMO BARIANI, ALBERTO CABITTO, MARCO RAGGIO Department of Biophysical and Electronic Engineering University
REAL TIME DIGITAL SIGNAL PROCESSING. Introduction
REAL TIME DIGITAL SIGNAL Introduction Why Digital? A brief comparison with analog. PROCESSING Seminario de Electrónica: Sistemas Embebidos Advantages The BIG picture Flexibility. Easily modifiable and
Warp-Aware Trace Scheduling for GPUS. James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown)
Warp-Aware Trace Scheduling for GPUS James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown) Historical Trends in GFLOPS: CPUs vs. GPUs Theoretical GFLOP/s 3250 3000 2750 2500
Interpolation Error in Waveform Table Lookup
Carnegie Mellon University Research Showcase @ CMU Computer Science Department School of Computer Science 1998 Interpolation Error in Waveform Table Lookup Roger B. Dannenberg Carnegie Mellon University
Power Modeling and Characterization of Computing Devices: A Survey. Contents
Foundations and Trends R in Electronic Design Automation Vol. 6, No. 2 (2012) 121 216 c 2012 S. Reda and A. N. Nowroz DOI: 10.1561/1000000022 Power Modeling and Characterization of Computing Devices: A
Integrated Power Delivery for High Performance Server Based Microprocessors
Integrated Power Delivery for High Performance Server Based Microprocessors J. Ted DiBene II, Ph.D. Intel, Dupont-WA International Workshop on Power Supply on Chip, Cork, Ireland, Sept. 24-26 Slide 1 Legal
EE-382M-8 VLSI II. Early Design Planning: Back End. Mark McDermott. The University of Texas at Austin. EE 382M-8 VLSI-2 Page Foil # 1 1
EE-382M-8 VLSI II Early Design Planning: Back End Mark McDermott EE 382M-8 VLSI-2 Page Foil # 1 1 Backend EDP Flow The project activities will include: Determining the standard cell and custom library
SCUBA-2. Low Pass Filtering
Physics and Astronomy Dept. MA UBC 07/07/2008 11:06:00 SCUBA-2 Project SC2-ELE-S582-211 Version 1.3 SCUBA-2 Low Pass Filtering Revision History: Rev. 1.0 MA July 28, 2006 Initial Release Rev. 1.1 MA Sept.
Chapter 3 Digital Logic Structures
Chapter 3 Digital Logic Structures Transistor: Building Block of Computers Microprocessors contain millions of transistors Intel Pentium 4 (2): 48 million IBM PowerPC 75FX (22): 38 million IBM/Apple PowerPC
Implementation and Complexity Analysis of List Sphere Detector for MIMO-OFDM systems
Implementation and Complexity Analysis of List Sphere Detector for MIMO-OFDM systems Markus Myllylä University of Oulu, Centre for Wireless Communications markus.myllyla@ee.oulu.fi Outline Introduction
User Guide IRMCS3041 System Overview/Guide. Aengus Murray. Table of Contents. Introduction
User Guide 0607 IRMCS3041 System Overview/Guide By Aengus Murray Table of Contents Introduction... 1 IRMCF341 Application Circuit... 2 Sensorless Control Algorithm... 4 Velocity and Current Control...
The physics of capacitive touch technology
The physics of capacitive touch technology By Tom Perme Applications Engineer Microchip Technology Inc. Introduction Understanding the physics of capacitive touch technology makes it easier to choose the
What you can do with very little:
page How Computers Work Lecture 3 irect Execution RISC Processor: The Unpipelined ET How Computers Work Lecture 3 Page What you can do with very little: Each instruction class can be implemented using
MULTISCALAR PROCESSORS
MULTISCALAR PROCESSORS THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE MULTISCALAR PROCESSORS by Manoj Franklin University of Maryland, US.A. SPRINGER SCIENCE+BUSINESS MEDIA, LLC Library
CAD-MF. PC-Based Multi-Format ANI & Emergency ANI Display Decoder. Manual Revision: Covers Firmware Revisions: CAD-MF: 1.
CAD-MF PC-Based Multi-Format ANI & Emergency ANI Display Decoder Manual Revision: 2010-05-25 Covers Firmware Revisions: CAD-MF: 1.0 & Higher Covers Software Revisions: CAD: 3.21 & Higher Covers Hardware
LEAN NPI AT OPTIMUM DESIGN ASSOCIATES: PART 2 WHAT IS LEAN NPI AND HOW TO ACHIEVE IT
W H I T E P A P E R LEAN NPI AT OPTIMUM DESIGN ASSOCIATES: PART 2 WHAT IS LEAN NPI AND HOW TO ACHIEVE IT RANDY HOLT, OPTIMUM DESIGN ASSOCIATES JAMES DOWDING, MENTOR GRAPHICS w w w. o d b - s a. c o m In
A High-Throughput Memory-Based VLC Decoder with Codeword Boundary Prediction
1514 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 8, DECEMBER 2000 A High-Throughput Memory-Based VLC Decoder with Codeword Boundary Prediction Bai-Jue Shieh, Yew-San Lee,
Computer Architecture A Quantitative Approach
Computer Architecture A Quantitative Approach Fourth Edition John L. Hennessy Stanford University David A. Patterson University of California at Berkeley With Contributions by Andrea C. Arpaci-Dusseau
Game Programming Paradigms. Michael Chung
Game Programming Paradigms Michael Chung CS248, 10 years ago... Goals Goals 1. High level tips for your project s game architecture Goals 1. High level tips for your project s game architecture 2.
Mahendra Engineering College, Namakkal, Tamilnadu, India.
Implementation of Modified Booth Algorithm for Parallel MAC Stephen 1, Ravikumar. M 2 1 PG Scholar, ME (VLSI DESIGN), 2 Assistant Professor, Department ECE Mahendra Engineering College, Namakkal, Tamilnadu,
DIGITAL INTEGRATED CIRCUITS A DESIGN PERSPECTIVE 2 N D E D I T I O N
DIGITAL INTEGRATED CIRCUITS A DESIGN PERSPECTIVE 2 N D E D I T I O N Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolic CONTENTS PART I: THE FABRICS Chapter 1: Introduction (32 pages) 1.1 A Historical
IMPLEMENTATION OF G.726 ITU-T VOCODER ON A SINGLE CHIP USING VHDL
IMPLEMENTATION OF G.726 ITU-T VOCODER ON A SINGLE CHIP USING VHDL G.Murugesan N. Ramadass Dr.J.Raja paul Perinbum School of ECE Anna University Chennai-600 025 Gm1gm@rediffmail.com ramadassn@yahoo.com
COURSE LEARNING OUTCOMES AND OBJECTIVES
COURSE LEARNING OUTCOMES AND OBJECTIVES A student who successfully fulfills the course requirements will have demonstrated: 1. an ability to analyze and design CMOS logic gates 1-1. convert numbers from
Low-Power Design for Embedded Processors
Low-Power Design for Embedded Processors BILL MOYER, MEMBER, IEEE Invited Paper Minimization of power consumption in portable and batterypowered embedded systems has become an important aspect of processor
CHAPTER III THE FPGA IMPLEMENTATION OF PULSE WIDTH MODULATION
34 CHAPTER III THE FPGA IMPLEMENTATION OF PULSE WIDTH MODULATION 3.1 Introduction A number of PWM schemes are used to obtain variable voltage and frequency supply. The Pulse width of PWM pulsevaries with
New generation of welding and inspection systems
New generation of welding and inspection systems Throughout the pipeline industry, and particularly in offshore and spool base production, welding requirements are shifting toward higher quality, greater
Final Project Report 4-bit ALU Design
ECE 467 Final Project Report 4-bit ALU Design Fall 2013 Kai Zhao Aswin Gonzalez Sepideh Roghanchi Soroush Khaleghi Part 1) Final ALU Design: There are 6 different functions implemented in this ALU: 1)
Redundant Residue Number System Based Fault Tolerant Architecture over Wireless Network
Redundant Residue Number System Based Fault Tolerant Architecture over Wireless Network Olabanji Olatunde.T toheeb.olabanji@kwasu.edu.ng Kazeem.A. Gbolagade kazeem.gbolagade@kwasu.edu.ng Yunus Abolaji
Synthesis and Simulation of Floating Point Multipliers Dr. P. N. Jain 1, Dr. A.J. Patil 2, M. Y. Thakre 3
Synthesis and Simulation of Floating Point Multipliers Dr. P. N. Jain 1, Dr. A.J. Patil 2, M. Y. Thakre 3 1Professor and Academic Dean, Department of E&TC, Shri. Gulabrao Deokar College of Engineering,
OPTIONS [MATH FUNCTION] [TOTALIZER] [FLOW CORRECTION]
INST. INE-288C OPTIONS [MATH FUNCTION] [TOTALIZER] [FLOW CORRECTION] FOR AL/AH3000 SERIES HYBRID RECORDER Retain this manual apart from the instrument and in an easily accessible. Please make sure that
EEL 4744C: Microprocessor Applications Lecture 8 Timer Dr. Tao Li
EEL 4744C: Microprocessor Applications Lecture 8 Timer Reading Assignment Software and Hardware Engineering (new version): Chapter 14 SHE (old version): Chapter 10 HC12 Data Sheet: Chapters 12, 13, 11,
Reading Assignment. Timer. Introduction. Timer Overview. Programming HC12 Timer. An Overview of HC12 Timer. EEL 4744C: Microprocessor Applications
Reading Assignment EEL 4744C: Microprocessor Applications Lecture 8 Timer Software and Hardware Engineering (new version): Chapter 4 SHE (old version): Chapter 0 HC Data Sheet: Chapters,,, 0 Introduction
FCam: An architecture for computational cameras
FCam: An architecture for computational cameras Dr. Kari Pulli, Research Fellow Palo Alto What is computational photography? All cameras have optics + sensors But the images have limitations they cannot
Cactus V6 Firmware Release Notes
Cactus V6 Firmware Release Notes Firmware V2.1.001 (Released on 20 Oct 2016) New features: - Added support for V6II / IIS (all firmware versions) and RF60 Master (with firmware version 2.00 or later):
132 PIN 1. Figure JD Microprocessor
EMBEDDED 32-BIT MICROPROCESSOR Pin/Code Compatible with all 80960Jx Processors High-Performance Embedded Architecture One Instruction/Clock Execution Core Clock Rate is 2x the Bus Clock Load/Store Programming
FPGA Based 70MHz Digital Receiver for RADAR Applications
Technology Volume 1, Issue 1, July-September, 2013, pp. 01-07, IASTER 2013 www.iaster.com, Online: 2347-6109, Print: 2348-0017 FPGA Based 70MHz Digital Receiver for RADAR Applications ABSTRACT Dr. M. Kamaraju
SERVOSTAR S- and CD-series Sine Encoder Feedback
SERVOSTAR S- and CD-series Sine Encoder Feedback The SERVOSTAR S and SERVOSTAR CD family of drives offers the ability to accept signals from various feedback devices. Sine Encoders provide analog-encoded
FPGA Realization of Space-Vector PWM Control IC for Three-Phase PWM Inverters
IEEE TRANSACTIONS ON POWER ELECTRONICS, VOL. 12, NO. 6, NOVEMBER 1997 953 FPGA Realization of Space-Vector PWM Control IC for Three-Phase PWM Inverters Ying-Yu Tzou, Member, IEEE, and Hau-Jean Hsu Abstract
Lecture 0: Introduction
Introduction to CMOS VLSI Design Lecture : Introduction David Harris Steven Levitan Harvey Mudd College University of Pittsburgh Spring 24 Fall 28 Administrivia Professor Steven Levitan TA: Bo Zhao Syllabus
Chapter 4: The Building Blocks: Binary Numbers, Boolean Logic, and Gates
Chapter 4: The Building Blocks: Binary Numbers, Boolean Logic, and Gates Objectives In this chapter, you will learn about The binary numbering system Boolean logic and gates Building computer circuits
The rangefinder can be configured using an I2C machine interface. Settings control the
Detailed Register Definitions The rangefinder can be configured using an I2C machine interface. Settings control the acquisition and processing of ranging data. The I2C interface supports a transfer rate
Architectural Floor Plan Symbols
Architectural Floor Plan Symbols The symbols below are used in architectural floor plans. Every office has their own standard, but most symbols should be similar to those shown on this page. Building Section
QUICK-START GUIDE. Configurator Transmi er Configura on System
Configurator Transmi er Configura on System QUICK-START GUIDE 1801 North Juniper Avenue Broken Arrow, Oklahoma 74012 U.S.A. +1 (918) 258 6068 worldwide www.pigging.com support@pigging.com Informa on in
ebner@complang.tuwien.ac.at brandner@complang.tuwien.ac.at http://complang.tuwien.ac.at/cd/vliw 03/28/08 Ebner, Brandner Compilation Techniques for VLIWs SS08 Slide #1 03/28/08 Ebner, Brandner Compilation