Embedded Hardware (1) Kai Huang
Transcription
1 Embedded Hardware (1) Kai Huang
2 News: PS4 and Xbox One are Coming /9/203
3 The Hardware: Xbox One vs. PS4
                 PS4                              Xbox One
CPU              semi-custom x86 AMD APU, 28 nm, 8-core Jaguar CPU (both consoles)
CPU frequency    1.6 GHz                          1.75 GHz
GPU              18 CUs: 1152 shaders (800 MHz)   12 CUs: 768 shaders (853 MHz)
Memory           8 GB 5500 MHz GDDR5              8 GB 2133 MHz DDR3
Mem bandwidth    176 GB/sec                       68.3 GB/sec
Embedded SRAM    N/A                              32 MB (204 GB/sec)
5 Dataflow MoC Recap
Kahn Process Network
Synchronous DataFlow
[Figure: example process graphs built from the nodes f, g and h]
kai.hang@t
6 Outline
Processor
Memory
I/O
7 Outline
Processor
o Single-cycle datapath
o Pipeline datapath
o Processor types
Memory
I/O
8 Y-Chart Methodology
[Figure: Architecture model, Applications model, Mapping, Performance Evaluation, Performance Numbers]
9 Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop"):
[Figure: loop of Environment, Sensors, Sample-and-hold, A/D converter, Information processing, Display, D/A converter and Actuators; labels "Embedded system" and "This course"]
10 The Big Picture
Since 1946 all computers have had 5 components:
Control: the control unit coordinates various actions (input, output, processing)
Datapath: the part of the central processing unit (CPU) that does the actual computations
Memory: stores information (instructions, data)
Input: the input unit accepts information from human operators, electromechanical devices and other computers
Output: the output unit sends results of processing to a monitor display or to a printer
[Figure: Processor (Control and Datapath), Memory, Input, Output]
11 Datapath Components
Combinational Elements
o ALU, Adder
o Immediate extender
o Multiplexers
Storage Elements
o Instruction memory
o Data memory
o PC register
o Register file
Clocking methodology
o Timing of reads and writes
[Figure: PC, Instruction Memory, Register File (RA, RB, RW, BusA, BusB, BusW, RegWrite), Extender (ExtOp), ALU (ALU control, ALU result, zero, overflow), Data Memory (Address, Data_in, Data_out, MemRead, MemWrite), Clock]
12 ALU: Arithmetic Logic Unit
1. An ALU is a digital circuit that performs arithmetic (Add, Sub, ...) and logical (AND, OR, NOT) operations.
2. John von Neumann proposed the ALU in 1945 when he was working on EDVAC.
[Figure: 1-bit ALU]
13 Multifunction ALU
Shift operation (Shifter; shift amount = 5 least significant bits): None = 00, SLL = 01, SRL = 10, SRA = 11
Arithmetic operation (Adder): ADD = 0, SUB = 1
Logical operation (Logic Unit): AND = 00, OR = 01, NOR = 10, XOR = 11
ALU selection: Shift = 00, SLT = 01, Arith = 10, Logic = 11
SLT: the ALU does a SUB and checks the sign and overflow
ALU outputs: ALU result, zero, sign, overflow
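The multifunction ALU can be sketched in software as follows. This is a minimal illustration, not the circuit from the slide: the function name alu, the string names for the unit and operation selectors, the 32-bit operand width and the choice to shift operand B by the low 5 bits of operand A are all assumptions made for the example. SLT follows the slide's rule: subtract, then combine sign and overflow.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit operands


def alu(unit, op, a, b):
    """Return (result, zero, sign, overflow) for one ALU operation.

    unit is one of "shift", "slt", "arith", "logic" (hypothetical names).
    """
    a &= MASK
    b &= MASK
    overflow = False
    if unit == "shift":
        shamt = a & 0x1F  # assumption: shift amount = 5 LSBs of A
        if op == "SLL":
            result = b << shamt
        elif op == "SRL":
            result = b >> shamt
        elif op == "SRA":
            signed_b = b - (1 << 32) if b & 0x80000000 else b
            result = signed_b >> shamt  # arithmetic shift keeps the sign
        else:  # "None": pass B through unshifted
            result = b
    elif unit == "logic":
        result = {"AND": a & b, "OR": a | b,
                  "NOR": ~(a | b), "XOR": a ^ b}[op]
    else:  # "arith" and "slt" both go through the adder
        subtract = unit == "slt" or op == "SUB"
        # Subtraction is A + NOT(B) + 1, as in a hardware adder.
        operand = (~b) & MASK if subtract else b
        carry_in = 1 if subtract else 0
        result = a + operand + carry_in
        sa, so, sr = a >> 31, operand >> 31, (result >> 31) & 1
        overflow = sa == so and sr != sa
        if unit == "slt":
            # A < B (signed) exactly when sign XOR overflow is 1.
            result = sr ^ int(overflow)
    result &= MASK
    return result, result == 0, bool(result >> 31), overflow
```

For example, alu("slt", "SLT", 0x80000000, 1) yields a result of 1 because the most negative 32-bit value is less than 1, even though the raw subtraction overflows.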
14 Single-Cycle Datapath (with Control Signals)
[Figure: PC, Next PC, Instruction Memory, Register File (RA, RB, RW, BusA, BusB, BusW), Extender, ALU, Data Memory; Main Control and ALU Control generate RegDst, RegWrite, ExtOp, ALUSrc, ALUOp, ALUCtrl, MemRead, MemWrite, MemtoReg and PCSrc (from J, Beq, Bne and zero); PCSrc selects the jump or branch target address]
15 Register Transfer Level (RTL)
RTL is a description of data flow between registers
RTL gives a meaning to the instructions
All instructions are fetched from memory at address PC
Instruction   RTL Description
ADD   Reg(Rd) ← Reg(Rs) + Reg(Rt); PC ← PC + 4
SUB   Reg(Rd) ← Reg(Rs) − Reg(Rt); PC ← PC + 4
ORI   Reg(Rt) ← Reg(Rs) | zero_ext(imm16); PC ← PC + 4
LW    Reg(Rt) ← MEM[Reg(Rs) + sign_ext(imm16)]; PC ← PC + 4
SW    MEM[Reg(Rs) + sign_ext(imm16)] ← Reg(Rt); PC ← PC + 4
BEQ   if (Reg(Rs) == Reg(Rt)) PC ← PC + 4 + 4 × sign_extend(imm16) else PC ← PC + 4
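The RTL descriptions map almost one to one onto code. The sketch below executes one instruction per call for the six instructions listed; the function name step, the state dictionary layout and the dictionary used as memory are assumptions made for illustration, and register 0 is not hard-wired to zero here.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers and addresses


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def zero_ext(imm16):
    """Zero-extend a 16-bit immediate."""
    return imm16 & 0xFFFF


def step(state, op, rs=0, rt=0, rd=0, imm16=0):
    """Apply the RTL of one instruction to state = {"reg", "mem", "pc"}."""
    reg, mem, pc = state["reg"], state["mem"], state["pc"]
    next_pc = pc + 4  # default: PC <- PC + 4
    if op == "ADD":
        reg[rd] = (reg[rs] + reg[rt]) & MASK
    elif op == "SUB":
        reg[rd] = (reg[rs] - reg[rt]) & MASK
    elif op == "ORI":
        reg[rt] = reg[rs] | zero_ext(imm16)
    elif op == "LW":
        reg[rt] = mem.get((reg[rs] + sign_ext(imm16)) & MASK, 0)
    elif op == "SW":
        mem[(reg[rs] + sign_ext(imm16)) & MASK] = reg[rt]
    elif op == "BEQ":
        if reg[rs] == reg[rt]:
            next_pc = pc + 4 + 4 * sign_ext(imm16)
    else:
        raise ValueError(f"unsupported instruction: {op}")
    state["pc"] = next_pc & MASK
```

A state such as {"reg": [0] * 32, "mem": {}, "pc": 0} is enough to try it: after step(state, "ORI", rs=0, rt=1, imm16=0xFFFF) register 1 holds 0xFFFF and the PC has advanced to 4.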
16 Instructions are Executed in Steps
R-type
Fetch instruction: Instruction ← MEM[PC]
Fetch operands: data1 ← Reg(Rs), data2 ← Reg(Rt)
Execute operation: ALU_result ← func(data1, data2)
Write ALU result: Reg(Rd) ← ALU_result
Next PC address: PC ← PC + 4
I-type
Fetch instruction: Instruction ← MEM[PC]
Fetch operands: data1 ← Reg(Rs), data2 ← Extend(imm16)
Execute operation: ALU_result ← op(data1, data2)
Write ALU result: Reg(Rt) ← ALU_result
Next PC address: PC ← PC + 4
BEQ
Fetch instruction: Instruction ← MEM[PC]
Fetch operands: data1 ← Reg(Rs), data2 ← Reg(Rt)
Equality: zero ← subtract(data1, data2)
Branch: if (zero) PC ← PC + 4 + 4 × sign_ext(imm16) else PC ← PC + 4
17 Instruction Execution Examples
LW: lw Rt,C(Rs)
Fetch instruction: Instruction ← MEM[PC]
Fetch base register: base ← Reg(Rs)
Calculate address: address ← base + sign_extend(imm16)
Read memory: data ← MEM[address]
Write register Rt: Reg(Rt) ← data
Next PC address: PC ← PC + 4
SW: sw Rt,C(Rs)
Fetch instruction: Instruction ← MEM[PC]
Fetch registers: base ← Reg(Rs), data ← Reg(Rt)
Calculate address: address ← base + sign_extend(imm16)
Write memory: MEM[address] ← data
Next PC address: PC ← PC + 4
Jump: j C
Fetch instruction: Instruction ← MEM[PC]
Target PC address: target ← PC[31:28], Imm26, 00 (concatenation)
Jump: PC ← target
18 Execution of Load Instruction: lw Rt,C(Rs)
ExtOp = sign to sign-extend Immediate16 to 32 bits
RegDst = 0 selects Rt as destination register
MemRead = 1 to read data memory
MemWrite = 0
ALUSrc = 1 selects extended immediate as second ALU input
ALUCtrl = ADD to calculate data memory address as Reg(Rs) + sign-extend(imm16)
MemtoReg = 1 places the data read from memory on BusW
RegWrite = 1 to write the memory data on BusW to register Rt
[Figure: single-cycle datapath annotated with these control values]
19 Execution of Store Instruction: sw Rt,C(Rs)
ExtOp = sign to sign-extend Immediate16 to 32 bits
RegDst = x because no destination register
MemRead = 0
MemWrite = 1 to write data memory
ALUSrc = 1 to select the extended immediate as second ALU input
ALUCtrl = ADD to calculate data memory address as Reg(Rs) + sign-extend(imm16)
MemtoReg = x because we don't care what data is placed on BusW
RegWrite = 0 because no register is written by the store instruction
[Figure: single-cycle datapath annotated with these control values]
20 Execution of Jump Instruction: j C
J = 1 selects Imm26 as jump target address
Upper 4 bits are from the incremented PC
PCSrc = 1 to select jump target address
MemRead, MemWrite & RegWrite are 0
We don't care about RegDst, ExtOp, ALUSrc, ALUCtrl, and MemtoReg
[Figure: single-cycle datapath annotated with these control values]
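The control values for lw, sw and j can be collected in one lookup table. The sketch below is only an illustration of that idea: the names CONTROL and writes are hypothetical, don't-care values are stored as None, and PCSrc = 0 for lw and sw is an assumption (these instructions continue at the next sequential PC).

```python
X = None  # don't care

# One row per instruction, one entry per control signal.
CONTROL = {
    "lw": dict(RegDst=0, RegWrite=1, ExtOp="sign", ALUSrc=1, ALUCtrl="ADD",
               MemRead=1, MemWrite=0, MemtoReg=1, PCSrc=0),  # PCSrc assumed
    "sw": dict(RegDst=X, RegWrite=0, ExtOp="sign", ALUSrc=1, ALUCtrl="ADD",
               MemRead=0, MemWrite=1, MemtoReg=X, PCSrc=0),  # PCSrc assumed
    "j": dict(RegDst=X, RegWrite=0, ExtOp=X, ALUSrc=X, ALUCtrl=X,
              MemRead=0, MemWrite=0, MemtoReg=X, PCSrc=1),
}


def writes(op):
    """Return the set of state elements that instruction op modifies."""
    signals = CONTROL[op]
    written = {"PC"}  # every instruction updates the PC
    if signals["RegWrite"]:
        written.add("register file")
    if signals["MemWrite"]:
        written.add("data memory")
    return written
```

Reading the table row by row makes the key safety property visible: RegWrite and MemWrite are never left as don't-care, because an accidental write would corrupt architectural state.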
21 Drawbacks of Single Cycle Processor
Long cycle time
o All instructions take as much time as the slowest
Arithmetic: Instruction Fetch, Reg Read, ALU, Reg Write
Load: Instruction Fetch, Reg Read, ALU, Memory Read, Reg Write (longest delay)
Store: Instruction Fetch, Reg Read, ALU, Memory Write
Branch: Instruction Fetch, Reg Read, ALU
Jump: Instruction Fetch, Decode
Alternative Solution: Multicycle implementation
o Break down instruction execution into multiple cycles
22 Single-Cycle vs. Multicycle
[Figure: time needed versus time allotted for Instr 1 to Instr 4. Single-cycle clock: every instruction is allotted the same long cycle. Multicycle clock: the instructions take 3, 5, 3 and 4 short cycles; the difference is marked "Time saved"]
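The time saved in the figure can be estimated with a few lines of arithmetic. The sketch assumes that the single-cycle clock period equals the slowest instruction's cycle count expressed in short cycles; the function name time_saved is hypothetical.

```python
def time_saved(cycles_needed):
    """Compare single-cycle and multicycle execution, in short clock cycles.

    Returns (single_cycle_total, multicycle_total, saved).
    """
    # Assumption: one long cycle must fit the slowest instruction.
    long_cycle = max(cycles_needed)
    single_cycle_total = long_cycle * len(cycles_needed)
    multicycle_total = sum(cycles_needed)
    return single_cycle_total, multicycle_total, single_cycle_total - multicycle_total


if __name__ == "__main__":
    print(time_saved([3, 5, 3, 4]))
```

With the cycle counts from the figure (3, 5, 3, 4), the four instructions need 15 short cycles instead of 20 under this assumption.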
23 Outline
Processor
o Single-cycle datapath
o Pipeline datapath
o Processor types
Memory
I/O
24 Single-Cycle Datapath
Shown below is the single-cycle datapath
How to pipeline this single-cycle datapath?
Answer: Introduce registers at the end of each stage
IF = Instruction Fetch
ID = Decode and Register Fetch
EX = Execute and Calculate Address
MEM = Memory Access
WB = Write Back
[Figure: single-cycle datapath with PC, Instruction Memory, Register File, Extender, ALU, Data Memory and Next PC, divided into the five stages]
25 Pipelined Datapath
Pipeline registers, in green, separate each pipeline stage
Pipeline registers are labeled by the stages they separate
Is there a problem with the register destination address?
IF = Instruction Fetch, ID = Decode, EX = Execute, MEM = Memory, WB
[Figure: datapath with pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB]
26 Corrected Pipelined Datapath
Destination register number should come from MEM/WB
o Along with the data during the write back stage
Destination register number is passed from ID to WB stage
[Figure: datapath with pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB; the destination register number travels through the pipeline registers to the WB stage]
27 Graphically Representing Pipelines
Multiple instruction execution over multiple clock cycles
o Instructions are listed in execution order from top to bottom
o Clock cycles move from left to right
o Figure shows the use of resources at each stage and each cycle
Program execution order, time (in cycles):
                  CC1  CC2  CC3  CC4  CC5  CC6  CC7  CC8
lw  $6, 8($5)     IM   Reg  ALU  DM   Reg
add $1, $2, $3         IM   Reg  ALU  DM   Reg
ori $4, $3, 7               IM   Reg  ALU  DM   Reg
sub $5, $2, $3                   IM   Reg  ALU  DM
sw  $2, 0($3)                         IM   Reg  ALU  DM
28 Instruction-Time Diagram
Diagram shows:
o Which instruction occupies what stage at each clock cycle
Instruction execution is pipelined over the 5 stages
Up to five instructions can be in execution during a single cycle
ALU instructions skip the MEM stage. Store instructions skip the WB stage
Instruction order CC1  CC2  CC3  CC4  CC5  CC6  CC7  CC8  CC9
lw  $7, 8($3)     IF   ID   EX   MEM  WB
lw  $6, 8($5)          IF   ID   EX   MEM  WB
ori $4, $3, 7               IF   ID   EX   -    WB
sub $5, $2, $3                   IF   ID   EX   -    WB
sw  $2, 0($3)                         IF   ID   EX   MEM  -
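An instruction-time diagram of this kind can be generated mechanically: instruction i (counting from 0) occupies stage s in clock cycle i + s + 1. The sketch below assumes an ideal pipeline without stalls; the function names and the rule that every mnemonic other than lw and sw counts as an ALU instruction are simplifications made for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def skipped_stage(instr):
    """Return the stage an instruction does not use, or None."""
    mnemonic = instr.split()[0]
    if mnemonic == "lw":
        return None   # loads use all five stages
    if mnemonic == "sw":
        return "WB"   # stores skip write back
    return "MEM"      # assumption: everything else is an ALU instruction


def pipeline_diagram(program):
    """Return one row per instruction; row[c] is the stage in cycle c + 1."""
    n_cycles = len(program) + len(STAGES) - 1
    rows = []
    for i, instr in enumerate(program):
        row = [""] * n_cycles
        skip = skipped_stage(instr)
        for s, stage in enumerate(STAGES):
            row[i + s] = "-" if stage == skip else stage
        rows.append(row)
    return rows


if __name__ == "__main__":
    prog = ["lw $7, 8($3)", "lw $6, 8($5)", "ori $4, $3, 7",
            "sub $5, $2, $3", "sw $2, 0($3)"]
    for instr, row in zip(prog, pipeline_diagram(prog)):
        print(f"{instr:<16}" + "".join(f"{cell:<5}" for cell in row))
```

For five instructions the diagram spans 5 + 5 - 1 = 9 clock cycles, matching CC1 to CC9 in the slide.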
29 Single-Cycle vs Pipelined Performance
Consider a 5-stage instruction execution in which
o Instruction fetch = ALU operation = Data memory access = 200 ps
o Register read = register write = 150 ps
What is the single-cycle non-pipelined time?
What is the pipelined cycle time?
What is the speedup factor for pipelined execution?
Solution
Non-pipelined cycle = 200 + 150 + 200 + 200 + 150 = 900 ps
[Figure: IF, Reg, ALU, MEM, Reg executed back to back, 900 ps per instruction]
30 Single-Cycle versus Pipelined (cont'd)
Pipelined cycle time = max(200, 150) = 200 ps
CPI for pipelined execution = 1
o One instruction completes each cycle (ignoring pipeline fill)
Speedup of pipelined execution = 900 ps / 200 ps = 4.5
o Instruction count and CPI are equal in both cases
Speedup factor is less than 5 (number of pipeline stages)
o Because the pipeline stages are not balanced
[Figure: overlapped IF, Reg, ALU, MEM, Reg stages, one instruction starting every 200 ps]
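The calculation can be checked with a short script. The stage delays are the ones given in the example; the function names and the total_time helper, which also accounts for the pipeline fill that the slide ignores, are additions made for illustration.

```python
# Stage delays in picoseconds, taken from the example.
STAGE_DELAY_PS = {
    "instruction fetch": 200,
    "register read": 150,
    "ALU operation": 200,
    "data memory access": 200,
    "register write": 150,
}


def single_cycle_time(delays):
    """The single-cycle clock must fit all stages back to back."""
    return sum(delays.values())


def pipelined_cycle_time(delays):
    """The pipelined clock must fit the slowest stage."""
    return max(delays.values())


def speedup(delays):
    """Steady-state speedup, ignoring pipeline fill."""
    return single_cycle_time(delays) / pipelined_cycle_time(delays)


def total_time(delays, n_instructions):
    """Pipelined time for n instructions, including pipeline fill."""
    cycles = n_instructions + len(delays) - 1
    return cycles * pipelined_cycle_time(delays)


if __name__ == "__main__":
    print(single_cycle_time(STAGE_DELAY_PS))     # 900
    print(pipelined_cycle_time(STAGE_DELAY_PS))  # 200
    print(speedup(STAGE_DELAY_PS))               # 4.5
```

If all five stages took 180 ps (the same 900 ps in total), the speedup would reach the ideal factor of 5, which is why the slide attributes the shortfall to unbalanced stages.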
31 Summary: Comparison between Datapaths
Clock cycle time
o Single cycle: long (long enough for the slowest instruction)
o Multiple cycle: short (long enough for the slowest instruction step)
o Pipeline: short (long enough for the slowest pipeline stage)
Cycles per instruction
o Single cycle: 1 clock cycle per instruction (by definition)
o Multiple cycle: variable number of clock cycles per instruction
o Pipeline: fixed number of clock cycles per instruction, one for each pipeline stage
Number of instructions executing concurrently
o Single cycle: 1
o Multiple cycle: 1
o Pipeline: number of pipeline stages
Duplicate hardware
o Single cycle: yes, since we can use a functional unit (FU) for at most one subtask per instruction
o Multiple cycle: no, since the instruction generally is broken into single-FU steps
o Pipeline: yes, to avoid restriction on pipeline execution
Extra registers
o Single cycle: no
o Multiple cycle: yes, to hold results for the next step
o Pipeline: yes, to provide results for the pipeline stage
Performance
o Single cycle: baseline
o Multiple cycle: faster, but not too fast
o Pipeline: fastest, if pipeline is balanced
32 Outline
Processor
o Single-cycle datapath
o Pipeline datapath
o Processor types
Memory
I/O
33 General Purpose Processors (GPP)
High performance
o Highly optimized circuits and technology
o Use of parallelism
  superscalar: dynamic scheduling of instructions
  super-pipelining: instruction pipelining, branch prediction, speculation
o Complex memory hierarchy
Not suited for real-time applications
o Execution times are highly unpredictable because of intensive resource sharing and dynamic decisions
Properties
o Good average performance for large application mix
o High power consumption
34 GPP + Memory (I): von Neumann Architecture
One memory holds both data and program
1. Initialize PC
2. Fetch => IR := Mem[PC]
3. Decode IR
4. Execute
5. PC := PC + 1
6. goto 2
[Figure: CPU with PC, IR and N GPRs, connected through address and data lines to a single memory (Data + Program)]
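The six-step loop can be written out as a tiny interpreter. Everything about the instruction format below is hypothetical: the tuple encoding, the four opcodes and the HALT instruction (added so the loop can terminate) are assumptions made for the example. What the sketch does preserve is the defining von Neumann property that instructions and data live in the same memory.

```python
def run(memory, start=0, n_regs=4):
    """Execute a program stored in memory; return the register file."""
    gpr = [0] * n_regs            # general purpose registers
    pc = start                    # 1. initialize PC
    while True:
        ir = memory[pc]           # 2. fetch: IR := Mem[PC]
        opcode, *args = ir        # 3. decode IR
        if opcode == "HALT":      # assumption: explicit stop instruction
            break
        if opcode == "LOAD":      # 4. execute
            r, addr = args
            gpr[r] = memory[addr]
        elif opcode == "STORE":
            r, addr = args
            memory[addr] = gpr[r]
        elif opcode == "ADD":
            rd, rs, rt = args
            gpr[rd] = gpr[rs] + gpr[rt]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        pc += 1                   # 5. PC := PC + 1, then 6. goto 2
    return gpr


if __name__ == "__main__":
    # Addresses 0-4 hold the program, 10-12 hold data, all in one memory.
    mem = [("LOAD", 0, 10), ("LOAD", 1, 11), ("ADD", 2, 0, 1),
           ("STORE", 2, 12), ("HALT",), 0, 0, 0, 0, 0, 7, 35, 0]
    run(mem)
    print(mem[12])
```

Because program and data share one memory and one address/data path, every data access competes with instruction fetch; a Harvard organization avoids this by giving each its own memory.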
35 GPP + Memory (II): Harvard Architecture
[Figure: CPU with PC, IR and GPRs, connected through separate address and data lines to a Program Memory and a Data Memory]
36 Intel Lynnfield (Core i5/i7)
4-8 Mbytes L3 Cache
4 cores, 8 threads
37 Simple GPP: Xilinx MicroBlaze
IOPB: Instruction-side On-chip Peripheral Bus
IXCL_M: Instruction-side Xilinx Cache Link Master
IXCL_S: Instruction-side Xilinx Cache Link Slave
ILMB: Instruction-side Local Memory Bus
DOPB: Data-side On-chip Peripheral Bus
DXCL_M: Data-side Xilinx Cache Link Master
DXCL_S: Data-side Xilinx Cache Link Slave
DLMB: Data-side Local Memory Bus
MFSL: Master Fast Simple Link
SFSL: Slave Fast Simple Link
38 Embedded Processors: RISC vs. CISC
Complex instruction set, CISC (e.g. x86)
o Map complexity of common instructions directly in machine code
o Complex instructions can consist of several simple instructions
o Can lead to subtle timing issues
o Used in general purpose computing
Reduced instruction set, RISC (e.g. ARM, Acorn RISC Machine)
o Only simple machine instructions; compiler has to map high-level language onto simple instructions
o All instructions take the same time
o Used in embedded systems (real-time hardware, smart phones, ...)
39 Application Specific Instruction Set Processors
Micro Controllers (MicroCtrl)
o Used in Control Dominated Systems
o Reactive systems with event driven behavior
o Application examples: cars, consumer electronics (washing machines, dishwashers etc.)
Digital Signal Processors (DSPs)
o Used in Data Dominated Systems
o Streaming-oriented systems with mostly periodic behavior
o Application examples: signal processing
Very Long Instruction Word Processors (VLIWs)
o Used in Data Dominated Systems
o Application examples: image processing
40 ASIP: Micro Controllers
Control-dominant applications
o Supports process scheduling and synchronization
o Preemption (interrupt), context switch
o Short latency times
Low power consumption
Peripheral units often integrated
Suited for real-time applications
Example: Philips 83 C552, an 8-bit 8051-based microcontroller
41 ASIP: Digital Signal Processors
Optimized for data-flow applications
Suited for simple control flow
Parallel hardware units
Specialized instruction set
High data throughput
Zero-overhead loops
Specialized memory
Suited for real-time applications
42 Very Long Instruction Word Processors
Key idea: detection of possible parallelism to be done by compiler, not by hardware at run-time (inefficient).
VLIW: parallel operations (instructions) encoded in one long word (instruction packet), each instruction controlling one functional unit.
VLIW processors are an example of the so-called Explicit Parallelism Instruction Computers (EPIC)
43 Philips TriMedia VLIW CPU
5 issue slots (functional units, FU), therefore up to 5 instructions can be executed in parallel
44 Application Specific Integrated Circuits (ASICs)
Custom-designed circuits necessary
o if ultimate speed or
o energy efficiency is the goal and
o large numbers can be sold.
Approach suffers from
o long design times,
o lack of flexibility (changing standards),
o high costs, i.e., millions of $ mask costs
45 Reconfigurable Processing Units (RPUs)
Full custom chips (HW) may be too expensive, software (SW) too slow.
Combine the speed of HW with the flexibility of SW
o HW with programmable functions and interconnect.
o HW (re-)configurable at design-time or at run-time (dynamic reconfiguration)
Field Programmable Gate Arrays (FPGAs)
o Currently the most sophisticated and used RPUs
o Applications
  Fast and very cheap prototyping of (MP-)SoCs
  Encryption, fast object recognition (medical and military)
  Adapting mobile phones to different standards
Very popular devices from
o XILINX (Virtex II(Pro), Virtex 4, Virtex 5, Virtex 6, Virtex 7)
o Altera (Cyclone, Arria, Stratix)
o Actel and others
46 Floor-plan of VIRTEX II FPGAs
[Figure: Configurable Logic Block (CLB), Digital Clock Manager (DCM), Input/Output Blocks (IOB)]
47 System-on-Chip (SoC)
48 System Specialization
The main difference between general purpose highest volume microprocessors and embedded systems is specialization.
Specialization should respect flexibility
o application domain specific systems shall cover a class of applications
o some flexibility is required to account for late changes, debugging
System analysis required
o identification of application properties which can be used for specialization
o quantification of individual specialization effects
49 Why Implementation Alternatives?
Trade-off between Flexibility and Performance/Power Efficiency
50 Energy Efficiency
Hugo De Man, IMEC, Philips, 2007
LECTURE 8. Pipelining: Datapath and Control
LECTURE 8 Pipelining: Datapath and Control PIPELINED DATAPATH As with the single-cycle and multi-cycle implementations, we will start by looking at the datapath for pipelining. We already know that pipelining
More informationRISC Design: Pipelining
RISC Design: Pipelining Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationComputer Architecture
Computer Architecture An Introduction Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationSelected Solutions to Problem-Set #3 COE 608: Computer Organization and Architecture Single Cycle Datapath and Control
Selected Solutions to Problem-Set #3 COE 608: Computer Organization and Architecture Single Cycle Datapath and Control 4.1. Done in the class 4.2. Try it yourself Q4.3. 4.3.1 a. Logic Only b. Logic Only
More informationPipelined Processor Design
Pipelined Processor Design COE 38 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline Pipelining versus Serial
More informationPipelining A B C D. Readings: Example: Doing the laundry. Ann, Brian, Cathy, & Dave. each have one load of clothes to wash, dry, and fold
Pipelining Readings: 4.5-4.8 Example: Doing the laundry Ann, Brian, Cathy, & Dave A B C D each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes
More informationA B C D. Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold. Time
Pipelining Readings: 4.5-4.8 Example: Doing the laundry A B C D Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes
More information7/19/2012. IF for Load (Review) CSE 2021: Computer Organization. EX for Load (Review) ID for Load (Review) WB for Load (Review) MEM for Load (Review)
CSE 2021: Computer Organization IF for Load (Review) Lecture-11 CPU Design : Pipelining-2 Review, Hazards Shakil M. Khan CSE-2021 July-19-2012 2 ID for Load (Review) EX for Load (Review) CSE-2021 July-19-2012
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-11 CPU Design : Pipelining-2 Review, Hazards Shakil M. Khan IF for Load (Review) CSE-2021 July-14-2011 2 ID for Load (Review) CSE-2021 July-14-2011 3 EX for Load
More information7/11/2012. Single Cycle (Review) CSE 2021: Computer Organization. Multi-Cycle Implementation. Single Cycle with Jump. Pipelining Analogy
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10 CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan CSE-2021 July-12-2012 2 Single Cycle with Jump Multi-Cycle Implementation
More informationChapter 4. Pipelining Analogy. The Processor. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop:
Chapter 4 The Processor Part II Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup p = 2n/(0.5n + 1.5) 4 =
More informationLecture Topics. Announcements. Today: Pipelined Processors (P&H ) Next: continued. Milestone #4 (due 2/23) Milestone #5 (due 3/2)
Lecture Topics Today: Pipelined Processors (P&H 4.5-4.10) Next: continued 1 Announcements Milestone #4 (due 2/23) Milestone #5 (due 3/2) 2 1 ISA Implementations Three different strategies: single-cycle
More informationLecture 4: Introduction to Pipelining
Lecture 4: Introduction to Pipelining Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder
More informationIF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps
CSE 30321 Computer Architecture I Fall 2010 Homework 06 Pipelined Processors 85 points Assigned: November 2, 2010 Due: November 9, 2010 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (25 points)
More informationRISC Central Processing Unit
RISC Central Processing Unit Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Spring, 2014 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/
More informationImplementation of Adaptive Viterbi Decoder
Ipleentation of Adaptive Viterbi Decoder Devendra Made #1 VIII Se B.E.(Etrx) K.D.K.College of Engineering, Nagpur, Maharashtra(I) Asst. Prof. R.B. Khule *2 M.Tech V.L.S.I. K.D.K.College of Engineering,
More informationEECS150 - Digital Design Lecture 2 - Synchronous Digital Systems Review Part 1. Outline
EECS5 - Digital Design Lecture 2 - Synchronous Digital Systems Review Part January 2, 2 John Wawrzynek Electrical Engineering and Computer Sciences University of California, Berkeley http://www-inst.eecs.berkeley.edu/~cs5
More informationCS 110 Computer Architecture Lecture 11: Pipelining
CS 110 Computer Architecture Lecture 11: Pipelining Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on
More informationEE 457 Homework 5 Redekopp Name: Score: / 100_
EE 457 Homework 5 Redekopp Name: Score: / 100_ Single-Cycle CPU The following exercises are taken from Hennessy and Patterson, CO&D 2 nd, 3 rd, and 4 th Ed. 1.) (6 pts.) Review your class notes. a. Is
More informationEECE 321: Computer Organiza5on
EECE 321: Computer Organiza5on Mohammad M. Mansour Dept. of Electrical and Compute Engineering American University of Beirut Lecture 21: Pipelining Processor Pipelining Same principles can be applied to
More informationCS420/520 Computer Architecture I
CS42/52 Computer rchitecture I Designing a Pipeline Processor (C4: ppendix ) Dr. Xiaobo Zhou Department of Computer Science CS42/52 pipeline. UC. Colorado Springs dapted from UCB97 & UCB3 Branch Jump Recap:
More informationSuggested Readings! Lecture 12" Introduction to Pipelining! Example: We have to build x cars...! ...Each car takes 6 steps to build...! ! Readings!
1! CSE 30321 Lecture 12 Introduction to Pipelining! CSE 30321 Lecture 12 Introduction to Pipelining! 2! Suggested Readings!! Readings!! H&P: Chapter 4.5-4.7!! (Over the next 3-4 lectures)! Lecture 12"
More informationIF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps
CSE 30321 Computer Architecture I Fall 2011 Homework 06 Pipelined Processors 75 points Assigned: November 1, 2011 Due: November 8, 2011 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (15 points)
More informationCMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science
More informationYou are Here! Processor Design Process. Agenda. Agenda 10/25/12. CS 61C: Great Ideas in Computer Architecture Single Cycle MIPS CPU Part II
/26/2 CS 6C: Great Ideas in Computer Architecture Single Cycle MIPS CPU Part II /25/2 ructors: Krste Asanovic, Randy H. Katz hcp://inst.eecs.berkeley.edu/~cs6c/fa2 Fall 22 - - Lecture #26 Parallel Requests
More informationSingle-Cycle CPU The following exercises are taken from Hennessy and Patterson, CO&D 2 nd, 3 rd, and 4 th Ed.
EE 357 Homework 7 Redekopp Name: Lec: 9:30 / 11:00 Score: Submit answers via Blackboard for all problems except 5.) and 6.). For those questions, submit a hardcopy with your answers, diagrams, circuit
More informationCSEN 601: Computer System Architecture Summer 2014
CSEN 601: Cmputer System Architecture Summer 2014 Practice Assignment 7 Slutin Exercise 7-1: Based n the MIPS pipeline implementatin yu studied, what are the cntrl signals that have t be stred in the ID/EX
More informationECE473 Computer Architecture and Organization. Pipeline: Introduction
Computer Architecture and Organization Pipeline: Introduction Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 11.1 The Laundry Analogy Student A,
More informationAsanovic/Devadas Spring Pipeline Hazards. Krste Asanovic Laboratory for Computer Science M.I.T.
Pipeline Hazards Krste Asanovic Laboratory for Computer Science M.I.T. Pipelined DLX Datapath without interlocks and jumps 31 0x4 RegDst RegWrite inst Inst rs1 rs2 rd1 ws wd rd2 GPRs Imm Ext A B OpSel
More informationComputer Hardware. Pipeline
Computer Hardware Pipeline Conventional Datapath 2.4 ns is required to perform a single operation (i.e. 416.7 MHz). Register file MUX B 0.6 ns Clock 0.6 ns 0.2 ns Function unit 0.8 ns MUX D 0.2 ns c. Production
More information6.S084 Tutorial Problems L19 Control Hazards in Pipelined Processors
6.S084 Tutorial Problems L19 Control Hazards in Pipelined Processors Options for dealing with data and control hazards: stall, bypass, speculate 6.S084 Worksheet - 1 of 10 - L19 Control Hazards in Pipelined
More informationA HASP architecture solution in microcellular wireless networks
roceedings of the 5th WSEAS Int. Conf. on ALIED INFOMATICS and COMMUNICATIONS, Malta, Septeber 15-17, 25 (pp71-76) A AS architectre soltion in icrocelllar wireless networs D.KAABOULAS, S.LOUVOS, S.KOTSOOULOS,
More informationProject 5: Optimizer Jason Ansel
Project 5: Optimizer Jason Ansel Overview Project guidelines Benchmarking Library OoO CPUs Project Guidelines Use optimizations from lectures as your arsenal If you decide to implement one, look at Whale
More informationInstruction Level Parallelism. Data Dependence Static Scheduling
Instruction Level Parallelism Data Dependence Static Scheduling Basic Block A straight line code sequence with no branches in except to the entry and no branches out except at the exit Loop: L.D ADD.D
More informationEmbedded System Hardware - Reconfigurable Hardware -
2 Embedded System Hardware - Reconfigurable Hardware - Peter Marwedel Informatik 2 TU Dortmund Germany GOPs/J Courtesy: Philips Hugo De Man, IMEC, 27 Energy Efficiency of FPGAs 2, 28-2- Reconfigurable
More informationPower Improvement in 64-Bit Full Adder Using Embedded Technologies Er. Arun Gandhi 1, Dr. Rahul Malhotra 2, Er. Kulbhushan Singla 3
Power Iproveent in 64-Bit Full Adder Using Ebedded Technologies Er. Arun Gandhi 1, Dr. Rahul Malhotra 2, Er. Kulbhushan Singla 3 1 Departent of ECE, GTBKIET, Chhapianwali Malout, Punjab 2 Director, Principal,
More informationPipelined Beta. Handouts: Lecture Slides. Where are the registers? Spring /10/01. L16 Pipelined Beta 1
Pipelined Beta Where are the registers? Handouts: Lecture Slides L16 Pipelined Beta 1 Increasing CPU Performance MIPS = Freq CPI MIPS = Millions of Instructions/Second Freq = Clock Frequency, MHz CPI =
More informationECE 2300 Digital Logic & Computer Organization. More Pipelined Microprocessor
ECE 2300 Digital ogic & Computer Organization Spring 2018 ore Pipelined icroprocessor ecture 18: 1 nnouncements No instructor office hour today Rescheduled to onday pril 16, 4:00-5:30pm Prelim 2 review
More informationREAL TIME DIGITAL SIGNAL PROCESSING. Introduction
REAL TIME DIGITAL SIGNAL Introduction Why Digital? A brief comparison with analog. PROCESSING Seminario de Electrónica: Sistemas Embebidos Advantages The BIG picture Flexibility. Easily modifiable and
More informationA Design Procedure for Control Systems of Inverterbased DG in Microgrids
A Design Procedre for Systes of Inverterbased DG in Microgrids Toshihisa Fnabashi, Shota Igarashi, Yske Manabe, Mneaki Krioto, Takeyoshi Kato Abstract-- In constrcting icrogrids with only inverter-based
More informationCZ3001 ADVANCED COMPUTER ARCHITECTURE
CZ3001 ADVANCED COMPUTER ARCHITECTURE Lab 3 Report Abstract Pipelining is a process in which successive steps of an instruction sequence are executed in turn by a sequence of modules able to operate concurrently,
More informationSome material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier
Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science !!! Basic MIPS integer pipeline Branches with one
More informationDesign and Implementation of Multilevel QAM Band pass Modems (8QAM, 16QAM, 32QAM and 64QAM) for WIMAX System Based on SDR Using FPGA
International Jornal of Soft Compting and Engineering (IJSCE) ISSN: 223-237, Volme-4, Isse-, March 24 esign and Implementation of Mltilevel AM Band pass Modems (8AM, 6AM, 32AM and 64AM) for WIMAX System
More informationApplication of digital filters for measurement of nonlinear distortions in loudspeakers using Wolf s method
Application o digital ilters or measrement o nonlinear distortions in lodspeakers sing Wol s method R. Siczek Wroclaw University o Technology, Wybrzeze Wyspianskiego 7, 50-70 Wroclaw, Poland raal.siczek@pwr.wroc.pl
More informationDesign and Implementation of Block Based Transpose Form FIR Filter