FMP For More Practice

Labeling Pipeline Diagrams with Control

6.5 [2] <§6.3> To understand how pipeline control works, let's consider these five instructions going through the pipeline:

    lw  $10, 20($1)
    sub $11, $2, $3
    and $12, $4, $5
    or  $13, $6, $7
    add $14, $8, $9

Show the instructions in the pipeline that precede the lw as before<1>, before<2>, ..., and the instructions after the add as after<1>, after<2>, ... Figures 6.14.5 through 6.14.9 show these instructions proceeding through the nine clock cycles it takes them to complete execution, highlighting what is active in a stage and identifying the instruction associated with each stage during a clock cycle. Reviewing these figures carefully will give you insight into how pipelines work.

A few items you may notice:

- In Figure 6.14.7 you can see the sequence of the destination register numbers from left to right at the bottom of the pipeline registers. The numbers advance to the right during each clock cycle, with the MEM/WB pipeline register supplying the number of the register written during the WB stage.
- When a stage is inactive, the values of control lines that are deasserted are shown as 0 or X (for don't care).
- In contrast to Chapter 5, where sequencing of control required special hardware, sequencing of control is embedded in the pipeline structure itself. First, all instructions take the same number of clock cycles, so there is no special control for instruction duration. Second, all control information is computed during instruction decode, and then passed along by the pipeline registers.

Using the same format as Figure 6.14.5, and starting with the blank pipelining diagram in Figure 6.14.10, draw the pipeline diagrams for the above sequence for a total of clock cycles.
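The bookkeeping behind these diagrams can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an ideal five-stage pipeline with no stalls, the operand values lw $10, 20($1) through add $14, $8, $9 for the five instructions, and hypothetical helper names (`instruction_at`, `stage_occupancy`) that do not come from the text.

```python
# Illustrative sketch: which instruction sits in each stage during a clock cycle.
# Assumes an ideal five-stage pipeline (no stalls); all names are hypothetical.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
PROGRAM = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "and $12, $4, $5",
    "or $13, $6, $7",
    "add $14, $8, $9",
]


def instruction_at(index):
    """Label for the instruction at position `index` in program order."""
    if index < 0:
        return f"before<{-index}>"  # instructions preceding lw
    if index >= len(PROGRAM):
        return f"after<{index - len(PROGRAM) + 1}>"  # instructions following add
    return PROGRAM[index]


def stage_occupancy(cycle):
    """Map each stage to the instruction it holds during `cycle` (1-based)."""
    # Instruction i enters IF in cycle i + 1 and advances one stage per cycle,
    # so the stage at position s holds instruction (cycle - 1 - s).
    return {stage: instruction_at(cycle - 1 - s) for s, stage in enumerate(STAGES)}


if __name__ == "__main__":
    for cycle in range(1, 10):
        row = stage_occupancy(cycle)
        print(f"Clock {cycle}: " + " | ".join(f"{st}: {row[st]}" for st in STAGES))
```

Calling `stage_occupancy` with a larger cycle number extends the same table past the nine cycles covered by the figures.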

[Pipeline diagram, clock cycle 1] IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>
[Pipeline diagram, clock cycle 2] IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>

FIGURE 6.14.5 Clock cycles 1 and 2. The phrase before<i> means the ith instruction before lw. The lw instruction in the top datapath is in the IF stage. At the end of the clock cycle, the lw instruction is in the IF/ID pipeline registers. In the second clock cycle, seen in the bottom datapath, the lw moves to the ID stage, and sub enters in the IF stage. Note that the values of the instruction fields and the selected source registers are shown in the ID stage. Hence register $1 and the constant 20, the operands of lw, are written into the ID/EX pipeline register. The number 10, representing the destination register number of lw, is also placed in ID/EX. Bits 15-11 are 0, but we use X to show that a field plays no role in a given instruction. The top of the ID/EX pipeline register shows the control values for lw to be used in the remaining stages. These control values can be read from the lw row of the table in Figure 6.25.

[Pipeline diagram, clock cycle 3] IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>
[Pipeline diagram, clock cycle 4] IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>

FIGURE 6.14.6 Clock cycles 3 and 4. In the top diagram, lw enters the EX stage in the third clock cycle, adding $1 and 20 to form the address in the EX/MEM pipeline register. (The lw instruction is written lw $10, ... upon reaching EX because the identity of instruction operands is not needed by EX or the subsequent stages. In this version of the pipeline, the actions of EX, MEM, and WB depend only on the instruction and its destination register or its target address.) At the same time, sub enters ID, reading registers $2 and $3, and the and instruction starts IF. In the fourth clock cycle (bottom datapath), lw moves into the MEM stage, reading memory using the value in EX/MEM as the address. In the same clock cycle, the ALU subtracts $3 from $2 and places the difference into EX/MEM, the and instruction reads registers $4 and $5 during ID, and the or instruction enters IF. The two diagrams show the control signals being created in the ID stage and peeled off as they are used in subsequent pipe stages.
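The idea of control signals being created in ID and peeled off stage by stage can be sketched as follows. The signal names and values are assumptions taken from the usual nine-signal control scheme for this style of MIPS pipeline (the excerpt itself does not list them), and the dictionary layout and the `decode` and `advance` helpers are hypothetical.

```python
# Illustrative sketch: control values are produced once in ID, grouped by the
# stage that consumes them, and each stage drops its own group before passing
# the rest along. Signal values are assumed, not taken from this excerpt.
CONTROL = {
    "lw": {
        "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
}


def decode(kind):
    """ID stage: build every control group that is written into ID/EX."""
    return {group: dict(signals) for group, signals in CONTROL[kind].items()}


def advance(control, stage):
    """Consume `stage`'s group; return it plus what the next register holds."""
    used = control[stage]
    remaining = {g: s for g, s in control.items() if g != stage}
    return used, remaining


id_ex = decode("lw")                               # ID/EX holds EX, MEM, WB groups
ex_signals, ex_mem = advance(id_ex, "EX")          # EX/MEM holds MEM, WB groups
mem_signals, mem_wb = advance(ex_mem, "MEM")       # MEM/WB holds only the WB group
wb_signals, leftover = advance(mem_wb, "WB")       # nothing is left after WB
```

Grouping the signals by consuming stage is one way to mirror the text's point that no separate sequencing hardware is needed: each pipeline register simply carries fewer control bits than the one before it.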

[Pipeline diagram, clock cycle 5] IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, ...
[Pipeline diagram, clock cycle 6] IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, ...

FIGURE 6.14.7 Clock cycles 5 and 6. With add, the final instruction in this example, entering IF in the top datapath, all instructions are engaged. By writing the data in MEM/WB into register 10, lw completes; both the data and the register number are in MEM/WB. In the same clock cycle, sub sends the difference in EX/MEM to MEM/WB, and the rest of the instructions move forward. In the next clock cycle, sub selects the value in MEM/WB to write to register number 11, again found in MEM/WB. The remaining instructions play follow-the-leader: the ALU calculates the OR of $6 and $7 for the or instruction in the EX stage, and registers $8 and $9 are read in the ID stage for the add instruction. The instructions after add are shown as inactive just to emphasize what occurs for the five instructions in the example. The phrase after<i> means the ith instruction after add.
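The way destination register numbers march through the pipeline registers can be sketched like this. It is a minimal illustration assuming destinations 10 through 14 for lw, sub, and, or, and add, no stalls, and hypothetical function names; each pipeline register is taken to show the value it feeds to the stage on its right during the given cycle.

```python
# Illustrative sketch: destination register numbers carried by the pipeline
# registers. Assumes destinations 10..14 and an ideal pipeline; names are
# hypothetical.
DEST = [10, 11, 12, 13, 14]  # lw, sub, and, or, add
PIPELINE_REGS = ["ID/EX", "EX/MEM", "MEM/WB"]


def dest_numbers(cycle):
    """Destination number held in each pipeline register during `cycle` (1-based)."""
    held = {}
    # ID/EX feeds EX (stage position 2), EX/MEM feeds MEM (3), MEM/WB feeds WB (4).
    for position, reg in enumerate(PIPELINE_REGS, start=2):
        index = cycle - 1 - position
        held[reg] = DEST[index] if 0 <= index < len(DEST) else None
    return held


def written_register(cycle):
    """Register written in WB during `cycle`, or None if none of the five writes."""
    return dest_numbers(cycle)["MEM/WB"]


if __name__ == "__main__":
    for cycle in range(3, 10):
        print(cycle, dest_numbers(cycle))
```

For clock cycle 6 this gives 13, 12, 11 from left to right, and `written_register` returns 10 through 14 for cycles 5 through 9, one completed instruction per cycle.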

[Pipeline diagram, clock cycle 7] IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, ...
[Pipeline diagram, clock cycle 8] IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, ...

FIGURE 6.14.8 Clock cycles 7 and 8. In the top datapath, the add instruction brings up the rear, adding the values corresponding to registers $8 and $9 during the EX stage. The result of the or instruction is passed from EX/MEM to MEM/WB in the MEM stage, and the WB stage writes the result of the and instruction in MEM/WB to register $12. Note that the control signals are deasserted (set to 0) in the ID stage, since no instruction is being executed. In the following clock cycle (lower drawing), the WB stage writes the result to register $13, thereby completing or, and the MEM stage passes the sum from the add in EX/MEM to MEM/WB. The instructions after add are shown as inactive for pedagogical reasons.

[Pipeline diagram, clock cycle 9] IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, ...

FIGURE 6.14.9 Clock cycle 9. The WB stage writes the sum in MEM/WB into register $14, completing add and the five-instruction sequence. The instructions after add are shown as inactive for pedagogical reasons.

[Blank pipeline diagram with empty instruction labels for the IF, ID, EX, MEM, and WB stages]

FIGURE 6.14.10 A blank single-clock-cycle pipeline diagram with control.