RISC Design: Pipelining


1 RISC Design: Pipelining
Virendra Singh, Associate Professor
Computer Architecture and Dependable Systems Lab, Department of Electrical Engineering
Indian Institute of Technology Bombay
CP-226: Computer Architecture, Lecture 10 (20 Feb 2013)

2 Combined Datapaths
[Figure: combined datapath with PC, instruction memory, register file, sign extension, shift-left-2 units, adders, data memory, and the CONTROL unit driving Jump, Branch, RegDst, MemtoReg, MemRead and MemWrite.]
20 Feb 2013 Computer Architecture@MNIT

3 Pipelining in a Computer
- Divide the datapath into nearly equal tasks, to be performed serially and requiring non-overlapping resources.
- Insert registers at task boundaries in the datapath; registers pass the output data from one task as input data to the next task.
- Synchronize tasks with a clock having a cycle time that just exceeds the time required by the longest task.
- Break each instruction down into a fixed number of tasks so that instructions can be executed in a staggered fashion.
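The clock-period rule above can be sketched in a few lines of Python. The per-stage latencies below are assumptions chosen for illustration (they add up to the 8 ns single-cycle period and give the 2 ns pipeline clock quoted later in this lecture); pipeline-register overhead is ignored, and the function names are hypothetical.

```python
# Assumed per-stage latencies in ns; illustrative values, not taken from the slides.
STAGE_NS = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}


def single_cycle_period(stage_ns):
    """Single-cycle clock: one period must cover every task in sequence."""
    return sum(stage_ns.values())


def pipelined_period(stage_ns):
    """Pipelined clock: the period is set by the longest task (register overhead ignored)."""
    return max(stage_ns.values())


print("single-cycle period (ns):", single_cycle_period(STAGE_NS))
print("pipelined period (ns):", pipelined_period(STAGE_NS))
```

With these assumed latencies the shorter tasks (ID and WB) sit idle for part of each pipelined cycle, which is why the tasks should be made as equal as possible.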

4 Single-Cycle Datapath
Table columns: Instruction class; Instr. fetch (IF); Instr. decode, also reg. file read (ID); Execution (operation) (EX); Data access (MEM); Write back, reg. file write (WB); Total time
lw: 1 ns, 1 ns; total 8 ns
sw: 1 ns; total 8 ns
R-format (add, sub, and, or, slt): 1 ns, 1 ns; total 8 ns
B-format (beq): 1 ns; total 8 ns
No operation on data; idle time equalizes instruction length to a fixed clock period.

5 Execution Time: Single-Cycle
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
[Timing diagram: each lw passes through IF, ID, EX, MEM, WB before the next one starts; time axis in ns.]
Clock cycle time = 8 ns
Total time for executing three lw instructions = 24 ns

6 Pipelined Datapath
Table columns: Instruction class; Instr. fetch (IF); Instr. decode, also reg. file read (ID); Execution (operation) (EX); Data access (MEM); Write back, reg. file write (WB); Total time
lw: 1 ns, 1 ns; total 10 ns
sw: 1 ns, 1 ns; total 10 ns
R-format (add, sub, and, or, slt): 1 ns, 1 ns; total 10 ns
B-format (beq): 1 ns, 1 ns; total 10 ns
No operation on data; idle time inserted to equalize instruction lengths.

7 Execution Time: Pipeline
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
[Timing diagram: the three lw instructions overlap, each starting one clock cycle after the previous one and passing through IF, ID, EX, MEM, RW; time axis in ns.]
Clock cycle time = 2 ns, four times faster than single-cycle clock
Total time for executing three lw instructions = 14 ns
Performance ratio = Single-cycle time / Pipeline time = 24 / 14 = 1.7

8 Pipeline Performance
Clock cycle time = 2 ns
1,003 lw instructions:
Total time for executing 1,003 lw instructions = 2,014 ns
Performance ratio = Single-cycle time / Pipeline time = 8,024 / 2,014 = 3.98
10,003 lw instructions:
Performance ratio = 80,024 / 20,014 = 3.998, close to the clock cycle ratio (4)
Pipeline performance approaches the clock-cycle ratio for long programs.
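The totals on these two slides follow from a simple count: with k stages and n instructions, the pipeline needs k + n - 1 clock cycles, while the single-cycle design needs n long cycles. A minimal sketch, assuming an ideal pipeline with no stalls (the function names are hypothetical; the 8 ns and 2 ns periods are the ones used in the slides):

```python
def single_cycle_time(n, period_ns=8):
    """Total time when each instruction takes one long clock cycle."""
    return n * period_ns


def pipeline_time(n, stages=5, period_ns=2):
    """Total time for an ideal pipeline: fill (stages - 1 cycles) plus one cycle per instruction."""
    return (stages + n - 1) * period_ns


def performance_ratio(n):
    return single_cycle_time(n) / pipeline_time(n)


for n in (3, 1003, 10003):
    print(n, single_cycle_time(n), pipeline_time(n), round(performance_ratio(n), 3))
```

For 3, 1,003 and 10,003 instructions this reproduces the 24/14, 8,024/2,014 and 80,024/20,014 figures above, and the ratio climbs toward 4 as n grows.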

9 Single-Cycle Datapath
IF: Instr. fetch
ID: Instr. decode, reg. file read
EX: Execute, address calc.
MEM: mem. access
WB: write back
[Figure: single-cycle datapath divided into the five stages above, with PC, instruction memory, register file, sign extension, shift left 2, data memory, and CONTROL signals RegDst, RegWrite, Branch, MemWrite, MemRead, MemtoReg.]

10 Pipelining of RISC Instructions
Steps: Fetch Instruction, Examine Opcode, Fetch Operands, Perform Operation, Store Result
IF: Instruction fetch
ID: Instruction decode and fetch operands
EX: Execute
MEM: Memory operation
WB: Write back to reg. file
Although an instruction takes five clock cycles, one instruction is completed every cycle.

11 Pipeline Registers
[Figure: the single-cycle datapath with pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB inserted between stages.]
This requires a CONTROL not too different from single-cycle.

12 Pipeline Register Functions
Four pipeline registers are added:
Register name    Data held
IF/ID            PC+4, Instruction word (IW)
ID/EX            PC+4, R1, R2, IW(0-15) sign ext., IW(11-15)
EX/MEM           PC+4, zero, Result, R2, IW(11-15) or IW(16-20)
MEM/WB           M[Result], Result, IW(11-15) or IW(16-20)
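The register contents listed above can be modelled as plain records. The sketch below is illustrative only: the class and field names are hypothetical, R1 and R2 are assumed to be the two values read from the register file, and the decode step shows just how ID/EX could be filled from IF/ID using the usual MIPS field positions.

```python
from dataclasses import dataclass


@dataclass
class IFID:
    pc_plus_4: int = 0
    iw: int = 0            # instruction word


@dataclass
class IDEX:
    pc_plus_4: int = 0
    r1: int = 0            # assumed: first register-file read value
    r2: int = 0            # assumed: second register-file read value
    imm_sign_ext: int = 0  # IW(0-15), sign extended
    iw_11_15: int = 0      # destination field carried forward


@dataclass
class EXMEM:
    pc_plus_4: int = 0
    zero: bool = False
    result: int = 0
    r2: int = 0
    dest: int = 0          # IW(11-15) or IW(16-20)


@dataclass
class MEMWB:
    mem_data: int = 0      # M[Result]
    result: int = 0
    dest: int = 0          # IW(11-15) or IW(16-20)


def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(if_id, regs):
    """Fill ID/EX from IF/ID and a register file (a list of 32 ints)."""
    iw = if_id.iw
    return IDEX(
        pc_plus_4=if_id.pc_plus_4,
        r1=regs[(iw >> 21) & 0x1F],
        r2=regs[(iw >> 16) & 0x1F],
        imm_sign_ext=sign_extend16(iw),
        iw_11_15=(iw >> 11) & 0x1F,
    )


regs = [0] * 32
regs[17], regs[18] = 5, 7                       # sample values for $s1 and $s2
id_ex = decode(IFID(pc_plus_4=4, iw=0x02324020), regs)  # add $t0, $s1, $s2
print(id_ex)
```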

13 Pipelined Datapath
[Figure: pipelined datapath with PC, instruction memory, register file, sign extension, shift left 2, data memory, and pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; the destination-register field is selected separately for R-type and for I-type lw.]

14 Five-Cycle Pipeline
[Figure: one instruction crossing the pipeline over clock cycles CC1 to CC5: IM, IF/ID, ID and REG. READ, ID/EX, EX/MEM, DM, MEM/WB, REG. WRITE.]

15 Add Instruction
add $t0, $s1, $s2
Machine instruction word: opcode, $s1, $s2, $t0, function
[Figure: the instruction crossing IM, IF/ID, ID and REG. READ, ID/EX, EX/MEM, DM, MEM/WB, REG. WRITE over CC1 to CC5.]
IF
ID: read $s1, read $s2
EX: add, $s1+$s2
MEM
WB: write $t0
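The field order of the machine instruction word above (opcode, two source registers, destination register, function) can be made concrete by packing the bits. This is a minimal sketch assuming the standard MIPS R-type layout and register numbering; the helper name `encode_r` and the small register table are illustrative.

```python
# Standard MIPS register numbers for the registers used on this slide.
REG = {"$t0": 8, "$s1": 17, "$s2": 18}

ADD_FUNCT = 0x20  # function code for add; the R-type opcode is 0


def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack an R-type word: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  rs = $s1, rt = $s2, rd = $t0
word = encode_r(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, ADD_FUNCT)
print(f"{word:#010x}")  # 0x02324020
```

Bits 11-15 of this word hold the destination register, which is the IW(11-15) field carried through the pipeline registers for R-type instructions.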

16 Pipelined Datapath Executing add
[Figure: pipelined datapath executing add: s1 and s2 select the register-file reads, $s1 and $s2 go to the execute stage, and t0 is carried through the pipeline registers as the destination for write back.]

17 Load Instruction
lw $t0, 1200 ($t1)
Machine instruction word: opcode $t1 $t
[Figure: the instruction crossing IM, IF/ID, ID and REG. READ, ID/EX, EX/MEM, DM, MEM/WB, REG. WRITE over CC1 to CC5.]
IF
ID: read $t1, sign ext
EX: add
MEM: read M[addr]
WB: write $t0

18 Pipelined Datapath Executing lw
[Figure: pipelined datapath executing lw: t1 selects the register-file read, $t1 and the sign-extended offset form addr, data memory returns data, and t0 is carried through the pipeline registers as the destination for write back.]

19 Store Instruction
sw $t0, 1200 ($t1)
Machine instruction word: opcode $t1 $t
[Figure: the instruction crossing IM, IF/ID, ID and REG. READ, ID/EX, EX/MEM, DM, MEM/WB, REG. WRITE over CC1 to CC5.]
IF
ID: read $t1, sign ext 1200
EX: add (addr)
MEM: write $t0 to M[addr]
WB
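For the lw and sw instructions above, the execute stage adds the base register to the sign-extended 16-bit offset to form the memory address. The sketch below assumes the standard MIPS I-type layout and opcodes; the helper names and the sample base value are hypothetical.

```python
LW_OPCODE = 0x23  # standard MIPS opcode for lw
SW_OPCODE = 0x2B  # standard MIPS opcode for sw
T0, T1 = 8, 9     # standard MIPS register numbers for $t0 and $t1


def encode_i(opcode, rs, rt, imm):
    """Pack an I-type word: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, word):
    """EX-stage address calculation: base register plus sign-extended offset."""
    return base + sign_extend16(word)


lw_word = encode_i(LW_OPCODE, T1, T0, 1200)  # lw $t0, 1200($t1)
sw_word = encode_i(SW_OPCODE, T1, T0, 1200)  # sw $t0, 1200($t1)
print(f"{lw_word:#010x} {sw_word:#010x}")

base = 0x1000  # hypothetical contents of $t1
print(effective_address(base, lw_word))
```

Both instructions share the same address calculation; they differ only in the MEM stage (read versus write) and in whether a register write back follows.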

20 Pipelined Datapath Executing sw
[Figure: pipelined datapath executing sw: t0 and t1 select the register-file reads, $t1 and the sign-extended offset form addr, and $t0 is passed to data memory as the data to be written.]

21 Executing a Program
Consider a five-instruction segment:
lw $10, 20($1)
sub $11, $2, $3
add $12, $3, $4
lw $13, 24($1)
add $14, $5, $6

22 Program Execution
[Figure: program instructions against time (CC1 to CC5 shown). lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; lw $13, 24($1); add $14, $5, $6 enter the pipeline on successive clock cycles, each passing through IM, IF/ID, ID and REG. READ, ID/EX, EX/MEM, DM, MEM/WB, REG. WRITE.]

23 CC5
IF: add $14, $5, $6
ID: lw $13, 24($1)
EX: add $12, $3, $4
MEM: sub $11, $2, $3
WB: lw $10, 20($1)
[Figure: pipelined datapath with all five stages occupied during CC5.]
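The CC5 snapshot above can be reproduced by noting that instruction i (counting from 0) is in stage number c - 1 - i during clock cycle c. A minimal sketch, assuming an ideal pipeline with no stalls or hazards; the function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

PROGRAM = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
    "add $14, $5, $6",
]


def stage_in_cycle(index, cycle):
    """Stage occupied by instruction `index` (0-based) in clock cycle `cycle` (1-based), or None."""
    position = cycle - 1 - index
    return STAGES[position] if 0 <= position < len(STAGES) else None


def snapshot(program, cycle):
    """Map each occupied stage to the instruction it holds during `cycle`."""
    result = {}
    for index, instr in enumerate(program):
        stage = stage_in_cycle(index, cycle)
        if stage is not None:
            result[stage] = instr
    return result


def total_cycles(program):
    """Cycles until the last instruction leaves WB."""
    return len(STAGES) + len(program) - 1


for stage in STAGES:
    print(f"{stage}: {snapshot(PROGRAM, 5).get(stage)}")
print("total cycles:", total_cycles(PROGRAM))
```

In this model CC5 is the first cycle in which all five stages are busy, and the five-instruction segment drains after nine cycles.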

24 Advantages of Pipeline
- After the fifth cycle (CC5), one instruction is completed each cycle; CPI ≈ 1, neglecting the initial pipeline latency of 5 cycles.
- Pipeline latency is defined as the number of stages in the pipeline, or the number of clock cycles after which the first instruction is completed.
- The clock cycle time is about one quarter of that of the single-cycle datapath and about the same as that of the multicycle datapath.
- For multicycle datapath, CPI = 3.. So, pipelined execution is faster, but

25 Thank You
