Instructor: Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #20. Warehouse Scale Computer

Transcription:

CS 61C: Great Ideas in Computer Architecture
Control and Pipelining
Instructor: Randy H. Katz
http://inst.eecs.berkeley.edu/~cs61c/fa13
11/5/13, Fall 2013, Lecture #20

You Are Here!
Software: Harness Parallelism & Achieve High Performance
  Parallel Requests: assigned to computer, e.g., Search "Katz"
  Parallel Threads: assigned to core, e.g., Lookup, Ads
  Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
  Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
  Hardware descriptions: all gates @ one time
  Programming Languages
Hardware: Warehouse Scale Computer, Smart Phone
[Figure labels: Today's Lecture; Computer (Core, Memory (Cache), Input/Output); Core (Instruction Unit(s), Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3); Main Memory; Logic Gates]

Levels of Representation/Interpretation
High Level Language Program (e.g., C)
  temp = v[k]; v[k] = v[k+1]; v[k+1] = temp;
-> Compiler
Assembly Language Program (e.g., MIPS)
  lw $t0, 0($2)
  lw $t1, 4($2)
  sw $t1, 0($2)
  sw $t0, 4($2)
-> Assembler
Machine Language Program (MIPS)
  0000 1001 1100 0110 1010 1111 0101 1000
  1010 1111 0101 1000 0000 1001 1100 0110
  1100 0110 1010 1111 0101 1000 0000 1001
  0101 1000 0000 1001 1100 0110 1010 1111
-> Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
-> Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions!

Instruction Level Parallelism (ILP)
Another parallelism form to go with Request Level Parallelism and Data Level Parallelism
  RLP: e.g., Warehouse Scale Computing
  DLP: e.g., SIMD, Map-Reduce
  ILP: e.g., Pipelined Instruction Execution
5-stage pipeline => 5 instructions executing simultaneously, one at each pipeline stage

Agenda
Pipelined Execution
Pipelined Datapath
Structural and Data Hazards
Control Hazards

Review: Single-Cycle Processor
Five steps to design a processor:
1. Analyze instruction set -> datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath meeting the requirements: re-examine for pipelining
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic: formulate logic equations, design circuits
[Figure labels: Processor (Control, Datapath), Memory, Input, Output]

Pipeline Analogy: Doing Laundry
Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, fold, and put away
  Washer takes 30 minutes
  Dryer takes 30 minutes
  Folder takes 30 minutes
  Stasher takes 30 minutes to put clothes into drawers

Sequential Laundry
[Figure: timeline from 6 PM to 2 AM; tasks A, B, C, D in order, each using four 30-minute stages one load after another]
Sequential laundry takes 8 hours for 4 loads

Pipelined Laundry
[Figure: timeline starting at 6 PM; tasks A, B, C, D overlapped, seven 30-minute time slots in total]
Pipelined laundry takes 3.5 hours for 4 loads!

Pipelining Lessons (1/2)
[Figure: pipelined laundry timeline for tasks A, B, C, D, seven 30-minute slots]
Pipelining doesn't help latency of a single task, it helps throughput of the entire workload
Multiple tasks operating simultaneously using different resources
Potential speedup = number of pipe stages (4 in this case)
Time to fill the pipeline and time to drain it reduce speedup: 8 hours/3.5 hours or 2.3X vs. potential 4X in this example

Pipelining Lessons (2/2)
[Figure: pipelined laundry timeline for tasks A, B, C, D, seven 30-minute slots]
Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is the pipeline?
Pipeline rate is limited by the slowest pipeline stage
Unbalanced lengths of pipe stages reduce speedup
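The laundry numbers above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not part of the lecture: the function names are made up for illustration, and it assumes a lock-step pipeline in which every stage advances at the pace of the slowest stage, as a clocked processor pipeline does.

```python
# Hypothetical helpers; stage times (minutes) come from the laundry slides.

def sequential_minutes(stage_times, loads):
    """Each load finishes all stages before the next load starts."""
    return loads * sum(stage_times)

def pipelined_minutes(stage_times, loads):
    """Assumption: lock-step pipeline, one step lasts as long as the slowest stage."""
    step = max(stage_times)
    return (len(stage_times) + loads - 1) * step

balanced = [30, 30, 30, 30]      # washer, dryer, folder, stasher
seq = sequential_minutes(balanced, 4)
pipe = pipelined_minutes(balanced, 4)
speedup = seq / pipe
print(seq / 60, pipe / 60, round(speedup, 1))   # hours, hours, speedup

# Faster washer and stasher: the 30-minute dryer and folder still set the pace.
unbalanced = [20, 30, 30, 20]
print(pipelined_minutes(unbalanced, 4) / 60)
```

Under this lock-step assumption the faster washer and stasher do not shorten the pipelined run at all, which is one way to read "pipeline rate is limited by the slowest pipeline stage".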

Review: RISC Design Principles
A simpler core is a faster core
Reduction in the number and complexity of instructions in the ISA -> simplifies pipelined implementation
Common RISC strategies:
  Fixed instruction length, generally a single word (MIPS = 32b); simplifies process of fetching instructions from memory
  Simplified addressing modes (MIPS just register + offset); simplifies process of fetching operands from memory
  Fewer and simpler instructions in the instruction set; simplifies process of executing instructions
  Simplified memory access: only load and store instructions access memory
  Let the compiler do it. Use a good compiler to break complex high-level language statements into a number of simple assembly language statements

Review: Single Cycle Datapath
[Figure: single-cycle datapath executing a load. Instruction format: op (bits 31-26), rs (25-21), rt (20-16), immediate (15-0). Register transfer: R[rt] = Data Memory {R[rs] + SignExt[imm16]}. Blocks: instruction fetch unit, RegFile (Ra, Rb, Rw, busA, busB, busW), Extender, ALU, Data Memory (WrEn, Adr, Data In). Control signals: RegDst, RegWr, nPC_sel, ExtOp, ALUctr, ALUSrc, MemWr, MemtoReg]

Steps in Executing MIPS
1) IF: Instruction Fetch, Increment PC
2) ID: Instruction Decode, Read Registers
3) EX: Execution
   Mem-ref: Calculate Address
   Arith-log: Perform Operation
4) Mem:
   Load: Read Data from Memory
   Store: Write Data to Memory
5) WB: Write Data Back to Register

Redrawn Single-Cycle Datapath
[Figure: PC, +4, instruction memory, registers (rs, rt, rd), imm, ALU, data memory; stages 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back]

Pipelined Datapath
[Figure: the same datapath with registers inserted between the five stages]
Add registers between stages
  Hold information produced in previous cycle
5-stage pipeline; clock rate potentially 5X faster

More Detailed Pipeline
[Figure: detailed pipelined datapath]
Registers named for adjacent stages, e.g., IF/ID

IF for Load, Store, ...
[Figure: IF stage highlighted]
Highlight combinational logic components used + right half of state logic on read, left half on write

ID for Load, Store, ...
[Figure: ID stage highlighted]

EX for Load
[Figure: EX stage highlighted]

MEM for Load
[Figure: MEM stage highlighted]

WB for Load
[Figure: WB stage highlighted; wrong register number]
Has bug that was in 1st edition of textbook!

Corrected Datapath for Load
[Figure: corrected datapath; correct register number]

Pipelined Execution Representation
Time ->
IF  ID  EX  Mem WB
    IF  ID  EX  Mem WB
        IF  ID  EX  Mem WB
            IF  ID  EX  Mem WB
                IF  ID  EX  Mem WB
                    IF  ID  EX  Mem WB
Every instruction must take the same number of steps, also called pipeline "stages", so some will go idle sometimes
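The staggered representation above can be generated mechanically. The following is a small illustrative sketch (the function name `pipeline_diagram` is an assumption, not something from the lecture): instruction i enters IF in cycle i and occupies one stage per cycle after that.

```python
STAGES = ["IF", "ID", "EX", "Mem", "WB"]

def pipeline_diagram(num_instructions):
    """Return one row per instruction; each row starts one cycle later than the previous one."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name          # stage s of instruction i runs in cycle i + s
        rows.append(row)
    return rows

for row in pipeline_diagram(6):
    print(" ".join(f"{cell:<3}" for cell in row))
```

With six instructions the diagram spans ten clock cycles, and from the fifth cycle until the pipeline starts draining all five stages are busy at once.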

Graphical Pipeline Diagrams
[Figure: PC, +4, instruction memory, registers (rs, rt, rd), imm, ALU, data memory; stages 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back]
Use the datapath figure below to represent the pipeline: IF, ID, EX, Mem, WB

Graphical Pipeline Representation
(In Reg, right half highlight read, left half write)
[Figure: instruction order (Load, Add, Store, Sub, Or) against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg, starting one cycle after the previous instruction]

Pipeline Performance
Assume time for stages is
  100ps for register read or write
  200ps for other stages
What is pipelined clock rate?
Compare pipelined datapath with single-cycle datapath

Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200ps        100ps          200ps   200ps          100ps           800ps
sw        200ps        100ps          200ps   200ps                          700ps
R-format  200ps        100ps          200ps                  100ps           600ps
beq       200ps        100ps          200ps                                  500ps

Pipeline Performance
[Figure: instruction timing, single-cycle (Tc = 800ps) vs. pipelined (Tc = 200ps)]
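The two clock periods in the figure follow directly from the table. Here is a minimal sketch of that derivation; the stage abbreviations and the dictionary layout are choices made for this example, while the latencies are the ones in the table.

```python
# Stage latencies in picoseconds, from the table above.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Stages each instruction class uses (the non-empty cells of its table row).
USES = {
    "lw":       ["IF", "ID", "EX", "MEM", "WB"],
    "sw":       ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq":      ["IF", "ID", "EX"],
}

def total_time_ps(instr):
    """Sum of the stage latencies one instruction class needs."""
    return sum(STAGE_PS[stage] for stage in USES[instr])

# Single-cycle: the clock must fit the slowest instruction (lw).
single_cycle_tc = max(total_time_ps(instr) for instr in USES)
# Pipelined: the clock must fit the slowest stage.
pipelined_tc = max(STAGE_PS.values())
# 1 / (ps) expressed in GHz: 1000 / period_in_ps.
pipelined_clock_ghz = 1000 / pipelined_tc

print(single_cycle_tc, pipelined_tc, pipelined_clock_ghz)
```

The single-cycle period is set by the slowest instruction, while the pipelined period is set by the slowest stage, so the 100ps register stages are padded out to 200ps.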

Pipeline Speedup
If all stages are balanced, i.e., all take the same time:
  Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of stages
If not balanced, speedup is less
Speedup is due to increased throughput
  Latency (time for each instruction) does not decrease
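To see how the unbalanced stages limit the speedup, the sketch below compares total run time for n instructions. It assumes the 800ps single-cycle and 200ps pipelined clock periods from the performance slide, a 5-stage pipeline, and no stalls; the function names are illustrative only.

```python
def single_cycle_time_ps(n, tc=800):
    """n instructions, one long clock cycle each."""
    return n * tc

def pipelined_time_ps(n, stages=5, tc=200):
    """Assumes no stalls: (stages - 1) cycles to fill, then one instruction completes per cycle."""
    return (stages + n - 1) * tc

def speedup(n):
    return single_cycle_time_ps(n) / pipelined_time_ps(n)

for n in (1, 3, 1000, 1_000_000):
    print(n, round(speedup(n), 3))

# Latency of one instruction gets worse, not better: 5 * 200ps vs. 800ps.
print(pipelined_time_ps(1), single_cycle_time_ps(1))
```

For long instruction streams the speedup approaches 800/200 = 4 rather than the stage count of 5, because the register read and write stages only need 100ps but still occupy a full 200ps cycle.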

Hazards
Situations that prevent starting the next logical instruction in the next clock cycle
1. Structural hazards
   Required resource is busy (e.g., stasher is studying)
2. Data hazard
   Need to wait for previous instruction to complete its data read/write (e.g., pair of socks in different loads)
3. Control hazard
   Deciding on control action depends on previous instruction (e.g., how much detergent based on how clean prior load turns out)

1. Structural Hazards
Conflict for use of a resource
In MIPS pipeline with a single memory
  Load/Store requires memory access for data
  Instruction fetch would have to stall for that cycle
  Causes a pipeline "bubble"
Hence, pipelined datapaths require separate instruction/data memories
  In reality, provide separate L1 instruction cache and L1 data cache

1. Structural Hazard #1: Single Memory
[Figure: instruction order (Load, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg]
Read same memory twice in same clock cycle

1. Structural Hazard #2: Registers (1/2)
[Figure: instruction order (sw, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg]
Can we read and write to registers simultaneously?

1. Structural Hazard #2: Registers (2/2)
Two different solutions have been used:
1) RegFile access is VERY fast: takes less than half the time of ALU stage
   Write to Registers during first half of each clock cycle
   Read from Registers during second half of each clock cycle
2) Build RegFile with independent read and write ports
Result: can perform Read and Write during same clock cycle

2. Data Hazards
An instruction depends on completion of data access by a previous instruction
  add $s0, $t0, $t1
  sub $t2, $s0, $t3

Forwarding (aka Bypassing)
Use result when it is computed
  Don't wait for it to be stored in a register
  Requires extra connections in the datapath

Corrected Datapath for Forwarding?
[Figure: datapath]

Forwarding Paths
[Figure: forwarding paths, from Chapter 4, The Processor]

Load-Use Data Hazard
Can't always avoid stalls by forwarding
  If value not computed when needed
  Can't forward backward in time!
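The difference between an ALU result (forwardable in time) and a load result (one cycle too late) can be written down as a small check. This is a simplified sketch under stated assumptions: a 5-stage pipeline with full forwarding, only back-to-back instruction pairs considered, and an `Instr` tuple invented for this example. The `lw $s0, 20($t1)` instruction is a hypothetical example, not from the slides.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    op: str                    # mnemonic, e.g. "add" or "lw"
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read

def stalls_with_forwarding(prev: Instr, curr: Instr) -> int:
    """Bubbles needed between two adjacent instructions when forwarding paths exist.

    Simplification: any use of a load result by the very next instruction
    costs one bubble; everything else is covered by forwarding.
    """
    if prev.dest is not None and prev.dest in curr.srcs:
        # A load's data only exists after MEM, too late for the next instruction's EX.
        return 1 if prev.op == "lw" else 0
    return 0

add = Instr("add", "$s0", ("$t0", "$t1"))
sub = Instr("sub", "$t2", ("$s0", "$t3"))
lw = Instr("lw", "$s0", ("$t1",))       # hypothetical: lw $s0, 20($t1)

print(stalls_with_forwarding(add, sub))  # ALU result forwarded to the next EX
print(stalls_with_forwarding(lw, sub))   # load-use hazard
```

The add/sub pair from the data hazard slide needs no bubble once forwarding is in place, whereas the load followed by a dependent sub still needs one.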

Stall/Bubble in the Pipeline
[Figure: pipeline diagram with "Stall inserted here", from Chapter 4, The Processor]

Pipelining and ISA Design
MIPS Instruction Set designed for pipelining
All instructions are 32 bits
  Easier to fetch and decode in one cycle
  x86: 1- to 17-byte instructions (x86 HW actually translates to internal RISC instructions!)
Few and regular instruction formats, 2 source register fields always in same place
  Can decode and read registers in one step
Memory operands only in Loads and Stores
  Can calculate address in 3rd stage, access memory in 4th stage
Alignment of memory operands
  Memory access takes only one cycle
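Why fixed field positions help can be shown with a few shifts and masks. The sketch below extracts every field of a 32-bit MIPS word without first looking at the opcode, which is what lets the hardware read the source registers while decoding. The function name and dictionary layout are assumptions for this example; the bit positions are the standard MIPS R- and I-format layouts.

```python
def decode_fields(word):
    """Slice a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31-26
        "rs":    (word >> 21) & 0x1F,   # bits 25-21, first source in both formats
        "rt":    (word >> 16) & 0x1F,   # bits 20-16, second source (R) or destination (I)
        "rd":    (word >> 11) & 0x1F,   # bits 15-11, destination (R-format only)
        "shamt": (word >> 6) & 0x1F,    # bits 10-6 (R-format only)
        "funct": word & 0x3F,           # bits 5-0 (R-format only)
        "imm16": word & 0xFFFF,         # bits 15-0 (I-format only)
    }

add_word = 0x01098020   # add $s0, $t0, $t1  (rs=$t0=8, rt=$t1=9, rd=$s0=16)
lw_word = 0x8C480000    # lw $t0, 0($2)      (rs=$2, rt=$t0=8, imm=0)

print(decode_fields(add_word))
print(decode_fields(lw_word))
```

Both words yield their source register numbers from the same bit ranges, so the register file read can start before the control logic knows which format it is looking at.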

Why Isn't the Destination Register Always in the Same Field in MIPS ISA?
R-format (bit boundaries 31, 26, 21, 16, 11, 6, 0):
  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format (bit boundaries 31, 26, 21, 16, 0):
  op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
Need to have 2-part immediate if 2 sources and 1 destination always in same place
  SPUR processor (a project Dave Patterson and Randy worked on together)

3. Control Hazards
Branch determines flow of control
  Fetching next instruction depends on branch outcome
  Pipeline can't always fetch correct instruction
    Still working on ID stage of branch
BEQ, BNE in MIPS pipeline
Simple solution, Option 1: Stall on every branch until have new PC value
  Would add 2 bubbles/clock cycles for every branch! (~20% of instructions executed)

Stall => 2 Bubbles/Clocks
[Figure: instruction order (beq, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg]
Where do we do the compare for the branch?

3. Control Hazard: Branching
Optimization #1:
  Insert special branch comparator in Stage 2
  As soon as instruction is decoded (opcode identifies it as a branch), immediately make a decision and set the new value of the PC
  Benefit: since branch is complete in Stage 2, only one unnecessary instruction is fetched, so only one no-op is needed
  Side note: means that branches are idle in Stages 3, 4 and 5
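The cost of the two stall schemes can be estimated with a one-line model. This is a rough sketch, assuming an ideal base CPI of 1, no other hazards, and the roughly 20% branch frequency quoted in the slides; the function name is illustrative.

```python
def cpi_with_branch_stalls(bubbles_per_branch, branch_fraction=0.20, base_cpi=1.0):
    """Average cycles per instruction when every branch costs a fixed number of bubbles."""
    return base_cpi + branch_fraction * bubbles_per_branch

stall_until_resolved = cpi_with_branch_stalls(2)   # Option 1: two bubbles per branch
compare_in_stage_2 = cpi_with_branch_stalls(1)     # Optimization #1: one bubble per branch

print(round(stall_until_resolved, 2), round(compare_in_stage_2, 2))
```

Under these assumptions, moving the comparator into Stage 2 cuts the branch penalty from about 40% extra cycles to about 20%.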

Corrected Datapath for BEQ/BNE?
[Figure: datapath]

One Clock Cycle Stall
[Figure: instruction order (beq, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each instruction passes through I$, Reg, ALU, D$, Reg]
Branch comparator moved to Decode stage.

3. Control Hazards
Option 2: Predict outcome of a branch, fix up if guess wrong
  Must cancel all instructions in pipeline that depended on guess that was wrong
Simplest hardware if we predict that all branches are NOT taken
  Why?

3. Control Hazard: Branching
Option #3: Redefine branches
  Old definition: if we take the branch, none of the instructions after the branch get executed by accident
  New definition: whether or not we take the branch, the single instruction immediately following the branch gets executed (the branch-delay slot)
Delayed Branch means we always execute the instruction after the branch
This optimization is used with MIPS

3. Control Hazard: Branching
Notes on Branch-Delay Slot
  Worst-case scenario: put a no-op in the branch-delay slot
  Better case: place some instruction preceding the branch in the branch-delay slot, as long as the change doesn't affect the logic of the program
    Re-ordering instructions is a common way to speed up programs
    Compiler usually finds such an instruction 50% of the time
    Jumps also have a delay slot
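How much a delay slot buys depends on how often the compiler can fill it. The sketch below counts the slots that end up holding a no-op, assuming the 50% fill rate from the notes above and a branch frequency of about 20% (a figure taken from the earlier control hazard slide); the function name and defaults are illustrative.

```python
def wasted_delay_slots(instructions, branch_fraction=0.20, fill_rate=0.50):
    """Expected number of delay slots that hold a no-op instead of useful work."""
    branches = instructions * branch_fraction
    return branches * (1.0 - fill_rate)

per_thousand = wasted_delay_slots(1000)
print(round(per_thousand))             # no-ops per 1000 useful instructions
print(round(per_thousand / 1000, 2))   # extra cycles per instruction
```

Under these assumptions roughly one in ten instruction slots is still a no-op, half the cost of always stalling one cycle per branch.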

Example: Nondelayed vs. Delayed Branch

Nondelayed Branch        Delayed Branch
or  $8, $9, $10          add $1, $2, $3
add $1, $2, $3           sub $4, $5, $6
sub $4, $5, $6           beq $1, $4, Exit
beq $1, $4, Exit         or  $8, $9, $10
xor $10, $1, $11         xor $10, $1, $11
Exit:                    Exit:

Delayed Branch/Jump and MIPS ISA?
Why does JAL put PC+8 in register 31?

Code Scheduling to Avoid Stalls
Reorder code to avoid use of load result in the next instruction
C code for A = B + E; C = B + F;

Original order (13 cycles):
        lw  $t1, 0($t0)
        lw  $t2, 4($t0)
stall   add $t3, $t1, $t2
        sw  $t3, 12($t0)
        lw  $t4, 8($t0)
stall   add $t5, $t1, $t4
        sw  $t5, 16($t0)

Reordered (11 cycles):
        lw  $t1, 0($t0)
        lw  $t2, 4($t0)
        lw  $t4, 8($t0)
        add $t3, $t1, $t2
        sw  $t3, 12($t0)
        add $t5, $t1, $t4
        sw  $t5, 16($t0)

Peer Instruction
I. Thanks to pipelining, I have reduced the time it took me to wash my one shirt.
II. Longer pipelines are always a win (since less work per stage & a faster clock).
A) (orange) I is True and II is True
B) (green) I is False and II is True
C) (pink) I is True and II is False
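The 13-cycle and 11-cycle counts in the code scheduling example can be checked with a small counter. This is a sketch under simple assumptions: a 5-stage pipeline with forwarding, one bubble whenever an instruction uses the result of the load immediately before it, and no other stalls. The tuple encoding of instructions is invented for this example.

```python
# Each instruction is (op, destination register or None, source registers).
original = [
    ("lw",  "$t1", ("$t0",)),
    ("lw",  "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),   # uses $t2 right after its load: stall
    ("sw",  None,  ("$t3", "$t0")),
    ("lw",  "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),   # uses $t4 right after its load: stall
    ("sw",  None,  ("$t5", "$t0")),
]

# Same instructions with the third load hoisted above the first add.
reordered = [original[0], original[1], original[4],
             original[2], original[3], original[5], original[6]]

def count_cycles(program, stages=5):
    """Cycles to run the program: fill time + one per instruction + load-use bubbles."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_srcs = curr
        if prev_op == "lw" and prev_dest in curr_srcs:
            stalls += 1
    return len(program) + (stages - 1) + stalls

print(count_cycles(original), count_cycles(reordered))
```

Seven instructions need 7 + 4 = 11 cycles in a stall-free 5-stage pipeline; the original order adds two load-use bubbles, and hoisting the third load removes both.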

And, in Conclusion, ...
Pipelining improves performance by increasing instruction throughput: exploits ILP
  Executes multiple instructions in parallel
  Each instruction has the same latency
Key enabler is placing registers between pipeline stages
Subject to hazards
  Structure, data, control
Stalls reduce performance
  But are required to get correct results
Compiler can arrange code to avoid hazards and stalls
  Requires knowledge of the pipeline structure