EE382V-ICS: System-on-a-Chip (SoC) Design


Hardware Synthesis and Architectures

Source: D. Gajski, S. Abdi, A. Gerstlauer, G. Schirner, Embedded System Design: Modeling, Synthesis, Verification, Chapter 6: Hardware Synthesis, Springer, 2009.

Andreas Gerstlauer
Electrical and Computer Engineering
University of Texas at Austin

Outline
- Design flow
- RTL architecture
- Input specification
- Specification profiling
- RTL synthesis
- Variable merging (storage sharing)
- Operation merging (FU sharing)
- Connection merging (bus sharing)
- Chaining and multi-cycling
- Data and control pipelining
- Scheduling
- Component interfacing
- Conclusions

EE382V-ICS: SoC Design, 2009, D. Gajski, S. Abdi, A. Gerstlauer, G. Schirner

Hardware Synthesis Design Flow
- Compilation
- Estimation
- HLS
- Model generation
- RTL synthesis
- Logic synthesis
- Layout

[Figure: design-flow diagram with the blocks Specification, Compilation, Tool Model, Estimation, HLS (Allocation, Binding, Scheduling), Model Generation, RTL Model, RTL Tools, and RTL Component Library.]

RTL Processor Architecture
- Controller
  - FSM controller
  - Programmable controller
- Datapath components
  - Storage components
  - Functional units
  - Connection components
- Pipelining
  - Functional unit
  - Datapath
  - Control
- Structure
  - Chaining
  - Multicycling
  - Forwarding
  - Branch prediction
  - Caching

RTL Processor with FSM Controller
- Simple architecture
- Small number of states

[Figure: FSM controller (input logic, output logic) and datapath (RF, ALU, MUL, memory, buses B1, B2, B3), with control inputs and outputs, data inputs and outputs, control signals, and status signals.]

RTL with Programmable Controller
- Complex architecture
- Control and datapath pipelining
- Advanced structural features
- Large number of states (CW or IS)

[Figure: programmable controller (PC, CMem or PMem, IR or CWR, AG with offset, status and address inputs, SR) and datapath (RF, ALU, MUL, memory, buses B1, B2, B3), with control and data inputs and outputs.]

Input Specification
- Programming language (C/C++, ...): programming semantics requires pre-synthesis optimization
- System description language (SystemC, ...): simulation semantics requires pre-synthesis optimization
- Control/data flow graph (CDFG): CDFG generation requires dependence analysis
- Finite state machine with data (FSMD): state interpretation requires some kind of scheduling
- RTL netlist: RTL design that requires only input and output logic synthesis
- Hardware description language (Verilog / VHDL): HDL description requires RTL library and logic synthesis

C Code for Ones Counter
- Programming language semantics: sequential execution; coding style to minimize coding
- HW design: parallel execution; communication through signals

Function-based C code:

    int OnesCounter(int Data) {
        int Ocount = 0;
        int Temp, Mask = 1;
        while (Data > 0) {
            Temp = Data & Mask;       /* isolate the least significant bit */
            Ocount = Ocount + Temp;   /* accumulate the count of ones */
            Data >>= 1;
        }
        return Ocount;
    }

RTL-based C code (Start, Done, Input, and Output model ports):

    while (1) {
        while (Start == 0);           /* wait for the start signal */
        Done = 0;
        Data = Input;
        Ocount = 0;
        Mask = 1;
        while (Data > 0) {
            Temp = Data & Mask;
            Ocount = Ocount + Temp;
            Data >>= 1;
        }
        Output = Ocount;
        Done = 1;
    }

CDFG for Ones Counter
- Control/data flow graph
  - Resembles programming language
  - Loops, ifs, basic blocks (BBs)
- Explicit dependencies
  - Control dependences between BBs
  - Data dependences inside BBs
- Missing dependencies between BBs

[Figure: CDFG of the ones counter with basic blocks operating on Start, Input, Data, Mask, Ocount, Done, and Output; operations & and >>; loop condition Data > 0.]

FSMD for Ones Counter
- FSMD is more detailed than CDFG
  - States may represent clock cycles
  - Conditionals and statements executed concurrently
  - All statements in each state executed concurrently
  - Control signal and variable assignments executed concurrently
- FSMD includes scheduling
- FSMD doesn't specify binding or connectivity
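To make the FSMD view concrete, here is a minimal C sketch that steps a ones-counter FSMD one state per clock cycle. The state names, the grouping of assignments into states, and the helper functions `fsmd_run` and `fsmd_cycles` are assumptions made for illustration; the slides do not spell out the state encoding. Unsigned data is used so that the right shift is well defined.

```c
#include <assert.h>

/* Hypothetical state names; not taken from the slides. */
typedef enum { S_WAIT, S_LOAD, S_TEST, S_AND, S_ADD, S_SHIFT, S_DONE } State;

typedef struct {
    State state;
    unsigned data, mask, ocount, temp;  /* datapath variables */
    unsigned output;                    /* output port */
    int done;                           /* Done signal */
} Fsmd;

/* Advance the FSMD by exactly one clock cycle. */
void fsmd_step(Fsmd *m, int start, unsigned input)
{
    assert(m != 0);
    switch (m->state) {
    case S_WAIT:                        /* idle until Start is raised */
        if (start) m->state = S_LOAD;
        break;
    case S_LOAD:                        /* all assignments happen in the same cycle */
        m->done = 0; m->data = input; m->ocount = 0; m->mask = 1;
        m->state = S_TEST;
        break;
    case S_TEST:                        /* loop condition Data > 0 */
        m->state = (m->data > 0) ? S_AND : S_DONE;
        break;
    case S_AND:
        m->temp = m->data & m->mask; m->state = S_ADD;
        break;
    case S_ADD:
        m->ocount += m->temp; m->state = S_SHIFT;
        break;
    case S_SHIFT:
        m->data >>= 1; m->state = S_TEST;
        break;
    case S_DONE:
        m->output = m->ocount; m->done = 1; m->state = S_WAIT;
        break;
    }
}

/* Run with Start held high until Done is raised; optionally report cycles. */
unsigned fsmd_run(unsigned input, unsigned *cycles)
{
    Fsmd m = { S_WAIT, 0, 0, 0, 0, 0, 0 };
    unsigned n = 0;
    while (!m.done) { fsmd_step(&m, 1, input); n++; }
    if (cycles) *cycles = n;
    return m.output;
}

/* Convenience wrapper returning only the cycle count. */
unsigned fsmd_cycles(unsigned input)
{
    unsigned n = 0;
    fsmd_run(input, &n);
    return n;
}
```

In this sketch every loop iteration costs four states, so the cycle count grows with the position of the highest set bit. Merging compatible assignments into fewer states is exactly the kind of decision that scheduling makes.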

CDFG and FSMD for Ones Counter

[Figure: CDFG and FSMD of the ones counter side by side.]

RTL Specification for Ones Counter
- RTL specification: controller and datapath netlist
- Input and output tables for logic synthesis
- RTL library needed for netlist

[Figure: FSM controller and datapath netlist, with an input logic table (present state, inputs Start and Data = 0, next state) and an output logic table (state, RF read port A, RF read port B, ALU, shifter, RF selector, RF write, Outport, Done).]

Register file assignment: RF[0] = Data, RF[1] = Mask, RF[2] = Ocount, RF[3] = Temp

HDL Description of Ones Counter
- HDL description: same as RTL description
- Several levels of abstraction
  - Variable binding to storage
  - Operation binding to FUs
  - Transfer binding to connections
- Netlist must be synthesized
- Partial HLS may be needed

Verilog excerpt (state labels and elided parts are shown as "..."):

    // ...
    always @(posedge clk)
    begin : output_logic
      case (state)
        // ...
        ...: begin
          B1 = RF[0];                 // Data
          B2 = RF[1];                 // Mask
          B3 = alu(B1, B2, l_and);    // Temp = Data & Mask
          RF[3] = B3;
          next_state = ...;
        end
        // ...
        ...: begin
          B1 = RF[2];                 // Ocount
          Outport <= B1;
          done <= 1;
          next_state = S0;
        end
      endcase
    end
    endmodule

Profiling and Estimation
- Pre-synthesis optimization
- Preliminary scheduling
  - Simple scheduling algorithm
- Profiling
  - Operation usage
  - Variable life-times
  - Connection usage
- Estimation
  - Performance
  - Cost
  - Power

Square-Root Algorithm (SRA)

SQR = max((0.875x + 0.5y), x), where x = max(|a|, |b|) and y = min(|a|, |b|)

FSMD (one line per state):

    S0: a = In1, b = In2 (wait for Start)
    S1: t1 = |a|, t2 = |b|
    S2: x = max(t1, t2), y = min(t1, t2)
    S3: t3 = x >> 3, t4 = y >> 1
    S4: t5 = x - t3
    S5: t6 = t4 + t5
    S6: t7 = max(t6, x)
    S7: Done = 1, Out = t7
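The register transfers of the SRA translate almost line by line into C. The sketch below keeps the temporaries t1 to t7 from the FSMD so that each statement corresponds to one operation that synthesis later binds to a functional unit. The function name `sra` and the helpers `max_int` and `min_int` are assumptions for illustration, and inputs are assumed not to be `INT_MIN` so that `abs` is well defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>   /* abs */

static int max_int(int p, int q) { return p > q ? p : q; }
static int min_int(int p, int q) { return p < q ? p : q; }

/* Approximates sqrt(a*a + b*b) as max(0.875*x + 0.5*y, x),
 * with x = max(|a|, |b|) and y = min(|a|, |b|). */
int sra(int a, int b)
{
    assert(a != INT_MIN && b != INT_MIN);   /* abs(INT_MIN) is undefined */
    int t1 = abs(a);
    int t2 = abs(b);
    int x  = max_int(t1, t2);
    int y  = min_int(t1, t2);
    int t3 = x >> 3;              /* 0.125 * x */
    int t4 = y >> 1;              /* 0.5 * y */
    int t5 = x - t3;              /* 0.875 * x */
    int t6 = t4 + t5;             /* 0.875 * x + 0.5 * y */
    int t7 = max_int(t6, x);
    return t7;
}
```

For example, `sra(3, 4)` evaluates to 5, which matches the exact magnitude. Other inputs can deviate from the exact value, since the shifts truncate and the formula is only an approximation.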

Variable and Operation Usage

[Table: variable usage per state for a, b, t1, t2, x, y, t3, t4, t5, t6, t7, with the number of live variables in each state (peak of 3).]

[Table: operation usage per state for abs, max, min, >>, -, +, with the maximum number of units per operation type and the number of operations in each state.]

Connectivity Usage

[Table: connectivity usage between the functional units (abs, max, min, >>3, >>1, -, +) and the variables a, b, t1, t2, x, y, t3, t4, t5, t6, t7.]
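The live-variable row of such a usage table can be derived mechanically from the FSMD. The sketch below assumes the SRA states are numbered 0 to 7 in order, and that a variable written in state i and last read in state j occupies storage in states i+1 through j. Both the numbering and this lifetime convention are assumptions for illustration, as are the names `Lifetime`, `count_live`, `sra_max_live`, and `sra_live_in_state`.

```c
#include <assert.h>

#define NUM_STATES 8    /* assumed numbering: states 0..7 */
#define NUM_VARS   11

/* State in which a variable is written and the state of its last read. */
typedef struct { const char *name; int written; int last_read; } Lifetime;

static const Lifetime sra_vars[NUM_VARS] = {
    {"a",  0, 1}, {"b",  0, 1}, {"t1", 1, 2}, {"t2", 1, 2},
    {"x",  2, 6}, {"y",  2, 3}, {"t3", 3, 4}, {"t4", 3, 5},
    {"t5", 4, 5}, {"t6", 5, 6}, {"t7", 6, 7},
};

/* Fills live[s] with the number of variables held during state s
 * and returns the peak, a lower bound on the number of registers. */
int count_live(const Lifetime *v, int n, int live[NUM_STATES])
{
    int max_live = 0;
    for (int s = 0; s < NUM_STATES; s++) live[s] = 0;
    for (int i = 0; i < n; i++) {
        assert(v[i].written < v[i].last_read && v[i].last_read < NUM_STATES);
        for (int s = v[i].written + 1; s <= v[i].last_read; s++) live[s]++;
    }
    for (int s = 0; s < NUM_STATES; s++)
        if (live[s] > max_live) max_live = live[s];
    return max_live;
}

/* Helpers for the SRA example. */
int sra_max_live(void)
{
    int live[NUM_STATES];
    return count_live(sra_vars, NUM_VARS, live);
}

int sra_live_in_state(int s)
{
    int live[NUM_STATES];
    assert(s >= 0 && s < NUM_STATES);
    count_live(sra_vars, NUM_VARS, live);
    return live[s];
}
```

Under these assumptions the peak is three live variables, which is consistent with the three registers of the minimal SRA architecture.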

Datapath Synthesis
- Variable merging (storage sharing)
- Operation merging (FU sharing)
- Connection merging (bus sharing)
- Register merging (RF sharing)
- Chaining and multi-cycling
- Data and control pipelining

Register Sharing
- Register sharing: grouping variables with non-overlapping lifetimes
- Sharing reduces connectivity cost

[Figure: partial FSMD; datapath without register sharing (separate registers for a, b, c, d, x, y); datapath with register sharing (shared registers for a and c, b and d, x and y).]

General Partitioning Algorithm
- Compatibility graph
  - Compatible: non-overlapping in time; not using the same resource
  - Non-compatible: overlapping in time; using the same resource
- Priority
  - Critical path
  - Same source, same destination

Flowchart:
1. Create the compatibility graph.
2. Merge the highest-priority nodes.
3. Update the compatibility graph.
4. If all nodes are incompatible, stop; otherwise return to step 2.
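As a simplified illustration of merging variables with non-overlapping lifetimes, the sketch below uses a greedy left-edge style assignment rather than the priority-driven compatibility-graph partitioning described on the slide. The lifetimes are an assumption: they are derived from the SRA FSMD with states numbered 0 to 7, and each variable occupies storage from the state after its write through the state of its last read. All names (`Interval`, `merge_variables`, `sra_register_count`, `sra_register_of`) are hypothetical.

```c
#include <assert.h>

#define NUM_VARS 11
#define MAX_REGS NUM_VARS

/* First and last state in which a variable occupies storage. */
typedef struct { const char *name; int first_live; int last_live; } Interval;

/* Assumed SRA lifetimes, sorted by first_live. */
static const Interval sra_vars[NUM_VARS] = {
    {"a",  1, 1}, {"b",  1, 1}, {"t1", 2, 2}, {"t2", 2, 2},
    {"x",  3, 6}, {"y",  3, 3}, {"t3", 4, 4}, {"t4", 4, 5},
    {"t5", 5, 5}, {"t6", 6, 6}, {"t7", 7, 7},
};

/* Greedy merging: each variable goes into the first register that is
 * free again when the variable becomes live. Returns the register count. */
int merge_variables(const Interval *v, int n, int reg_of[])
{
    int busy_until[MAX_REGS];
    int num_regs = 0;
    assert(n <= MAX_REGS);
    for (int i = 0; i < n; i++) {
        /* the greedy scan relies on the input being sorted by first_live */
        assert(i == 0 || v[i - 1].first_live <= v[i].first_live);
        int r = 0;
        while (r < num_regs && busy_until[r] >= v[i].first_live) r++;
        if (r == num_regs) num_regs++;      /* open a new register */
        busy_until[r] = v[i].last_live;
        reg_of[i] = r;
    }
    return num_regs;
}

/* Helpers for the SRA example. */
int sra_register_count(void)
{
    int reg_of[NUM_VARS];
    return merge_variables(sra_vars, NUM_VARS, reg_of);
}

int sra_register_of(int var_index)
{
    int reg_of[NUM_VARS];
    assert(var_index >= 0 && var_index < NUM_VARS);
    merge_variables(sra_vars, NUM_VARS, reg_of);
    return reg_of[var_index];
}
```

With these lifetimes the greedy pass needs three registers and groups a, t1, x, t7 together, then b, t2, y, t3, t5, t6, and leaves t4 on its own. A priority-based merge may produce a different grouping with the same register count, because priorities such as "same source, same destination" also aim to reduce connectivity.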

Variable Merging for SRA

[Figure panels: (a) initial compatibility graph; (b) compatibility graph after merging t3, t5, and t6; (c) compatibility graph after merging t1, x, and t7; (d) compatibility graph after merging t2 and y; (e) final compatibility graph; (f) final register assignments.]

Final register assignments:
- R1 = [ a, t1, x, t7 ]
- R2 = [ b, t2, y, t3, t5, t6 ]
- R3 = [ t4 ]

Datapath with Shared Registers
- Variables combined into registers
- One functional unit for each operation

[Figure: datapath with registers R1, R2, R3 and one functional unit per operation.]

Functional Unit Sharing
- Functional unit sharing
  - Smaller number of FUs
  - Larger connectivity cost

[Figure: partial FSMD (state Si: x = a + b; state Sj: y = c - d); non-shared design with separate units; shared design with one +/- unit.]

Operation Merging for SRA

[Figure panels: initial compatibility graph; compatibility graph after merging of + and -; compatibility graph after merging of min, +, and -; final graph partitions.]

Datapath with Shared Registers and FUs
- Variables combined into registers
- Operations combined into functional units

[Figure: datapath with registers R1, R2, R3 and functional units abs/max, abs/min/+/-, >>1, >>3.]

Connection Usage for SRA
- Find compatible connections for merging into buses

[Table: connection usage table listing connections A to N against the FSMD states.]

Connection Merging for SRA
- Combine connections not used at the same time
- Priority to same source, same destination
- Priority to maximum groups

[Figure: compatibility graph for input buses; compatibility graph for output buses; bus assignment table.]

Datapath with Shared Registers, FUs, Buses
- Minimal SRA architecture
  - 3 registers
  - 4 (2) functional units
  - 4 buses

Register Merging into RFs
- Register merging: port sharing
- Merge registers with non-overlapping access times
- Number of ports is equal to the number of simultaneous read/write accesses

[Figure: register assignment for the SRA FSMD; compatibility graph; register access table.]

Datapath with Shared RF
- RF minimizes connectivity cost by sharing ports

[Figure: datapath with inputs In1 and In2, registers R1, R2, R3 with a shared register file, Bus1 to Bus4, functional units abs/max, abs/min/+/-, >>3, >>1, and output Out.]

Datapath with Chaining
- Chaining connects two or more FUs
- Allows execution of two or more operations in a single clock cycle
- Improves performance at no cost

FSMD with chaining (one line per state):

    a = In1, b = In2 (wait for Start)
    t1 = |a|, t2 = |b|
    x = max(t1, t2), t3 = max(t1, t2) >> 3, t4 = min(t1, t2) >> 1
    t5 = x - t3
    t6 = t4 + t5
    t7 = max(t6, x)
    Done = 1, Out = t7

Datapath with Chaining and Multi-Cycling
- Multi-cycling: operations that take more than one cycle
- Allows use of slower FUs
- Allows faster clock cycle

FSMD with chaining and multi-cycling (brackets mark the operation that spans two states):

    a = In1, b = In2 (wait for Start)
    t1 = |a|, t2 = |b|
    x = max(t1, t2), t3 = max(t1, t2) >> 3, [t4] = min(t1, t2) >> 1
    t5 = x - t3, t4 = [min(t1, t2) >> 1]
    t6 = t4 + t5
    t7 = max(t6, x)
    Done = 1, Out = t7

[Figure: datapath with inputs In1 and In2, registers R1, R2, R3, Bus1 to Bus4, functional units abs/max, abs/+/-, min, >>3, >>1, and output Out.]

Pipelining
- Functional unit pipelining: two or more operations executing at the same time
- Datapath pipelining: two or more register transfers executing at the same time
- Control pipelining: two or more instructions generated at the same time

Functional Unit Pipelining (1)
- Operation delay cut in half
- Shorter clock cycle
- Dependencies may delay some states
- Extra NO states reduce performance gain

[Figure: SRA FSMD for the datapath with pipelined functional units.]

Functional Unit Pipelining (2)

[Figure: SRA FSMD and timing diagram with 4 additional NO states; rows for Read R1, Read R2, Read R3, ALU stage 1, ALU stage 2, Shifters, Write R1, Write R2, Write R3, and Write Out.]

Datapath Pipelining (1)
- Register-to-register delay cut in equal parts
- Much shorter clock cycle
- Dependencies may delay some states
- Extra NO states reduce performance gain

[Figure: pipelined datapath with inputs In1 and In2, registers R1, R2, R3, Bus1 to Bus4, ALU, shifters >>3 and >>1, and output Out.]

Datapath Pipelining (2)

[Figure: pipelined datapath, SRA FSMD, and timing diagram with additional NO clock cycles; rows for Read R1, Read R2, Read R3, ALUIn(L), ALUIn(R), ALUOut, Shifters, Write R1, Write R2, Write R3, and Write Out.]

Datapath and Control Pipelining (1)
- Fetch delay cut into several parts
- Shorter clock cycle
- Conditionals may delay some states
- Extra NO states reduce performance gain

[Figure: pipelined controller (PC, CMem, CWR, AG, offset, SR, status signals) and datapath (RF, Mem, ALU, Bus1 to Bus3, pipeline registers), with an example FSMD that tests a > b and computes x and y.]

Data and Control Pipelining (2)
- 3 NOP cycles for the branch
- 2 NOP cycles for data dependence

[Figure: pipelined controller and datapath with a timing diagram with additional NO clock cycles; rows for Read PC, Read CWR, Read RF(L), Read RF(R), Write ALUIn(L), Write ALUIn(R), Write ALUOut, Write RF, Write SR, and Write PC.]

Scheduling
- Scheduling assigns clock cycles to register transfers
- Non-constrained scheduling
  - ASAP scheduling
  - ALAP scheduling
- Constrained scheduling
  - Resource constrained (RC) scheduling: given resources, minimize metrics (time, power, ...)
  - Time constrained (TC) scheduling: given time, minimize resources (FUs, storage, connections)

C and CDFG for SRA Algorithm

[Figure: C flowchart of the SRA statements (a = In1, b = In2 through Done = 1, Out = t7) next to the corresponding CDFG with operations abs, abs, max, min, >>3, >>1, -, +, max.]
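ASAP and ALAP schedules, and the mobility derived from them, can be computed directly from the data dependences of the SRA. The sketch below assumes unit-delay operations, control steps numbered from 1, and an ALAP latency equal to the ASAP schedule length. The operation names and the `preds` table are illustrative encodings of the SRA data flow, not taken from the slides.

```c
#include <assert.h>

#define NUM_OPS 9
/* SRA operations in topological order (hypothetical names). */
enum { ABS_A, ABS_B, MAX1, MIN, SHR3, SHR1, SUB, ADD, MAX2 };

/* Up to two predecessors per operation; -1 means none. */
static const int preds[NUM_OPS][2] = {
    {-1, -1},       /* ABS_A: t1 = |a|         */
    {-1, -1},       /* ABS_B: t2 = |b|         */
    {ABS_A, ABS_B}, /* MAX1:  x  = max(t1, t2) */
    {ABS_A, ABS_B}, /* MIN:   y  = min(t1, t2) */
    {MAX1, -1},     /* SHR3:  t3 = x >> 3      */
    {MIN, -1},      /* SHR1:  t4 = y >> 1      */
    {MAX1, SHR3},   /* SUB:   t5 = x - t3      */
    {SHR1, SUB},    /* ADD:   t6 = t4 + t5     */
    {ADD, MAX1},    /* MAX2:  t7 = max(t6, x)  */
};

/* ASAP: earliest step is one past the latest predecessor. Returns the length. */
int asap_schedule(int asap[NUM_OPS])
{
    int length = 0;
    for (int i = 0; i < NUM_OPS; i++) {
        asap[i] = 1;
        for (int k = 0; k < 2; k++) {
            int p = preds[i][k];
            if (p >= 0) {
                assert(p < i);   /* relies on topological order */
                if (asap[p] + 1 > asap[i]) asap[i] = asap[p] + 1;
            }
        }
        if (asap[i] > length) length = asap[i];
    }
    return length;
}

/* ALAP: latest step is one before the earliest successor. */
void alap_schedule(int length, int alap[NUM_OPS])
{
    for (int i = 0; i < NUM_OPS; i++) alap[i] = length;
    for (int i = NUM_OPS - 1; i >= 0; i--)
        for (int k = 0; k < 2; k++) {
            int p = preds[i][k];
            if (p >= 0 && alap[i] - 1 < alap[p]) alap[p] = alap[i] - 1;
        }
}

/* Helpers for the SRA example. */
int sra_critical_path(void)
{
    int asap[NUM_OPS];
    return asap_schedule(asap);
}

int sra_mobility(int op)   /* mobility = ALAP - ASAP */
{
    int asap[NUM_OPS], alap[NUM_OPS];
    assert(op >= 0 && op < NUM_OPS);
    alap_schedule(asap_schedule(asap), alap);
    return alap[op] - asap[op];
}
```

Under these assumptions the critical path is six steps, and only min and the shift by one have a mobility of one step. All other operations lie on the critical path.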

RC Scheduling

[Figure panels: ASAP schedule; ALAP schedule; ready list with mobilities (ALAP - ASAP); RC schedule (for single FU and shifters).]

Scheduling Algorithms

[Figure: flowcharts of the RC algorithm and the TC algorithm.]
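The ready-list idea behind resource-constrained scheduling can be sketched as a simple list scheduler. In each control step the ready operations are served in order of increasing mobility until the available units of each kind are used up. The resource classes (one arithmetic unit class and one shifter class), the static mobilities, and every identifier below are assumptions for illustration; this is not a transcription of the RC algorithm flowchart.

```c
#include <assert.h>

#define NUM_OPS 9
enum { ABS_A, ABS_B, MAX1, MIN, SHR3, SHR1, SUB, ADD, MAX2 };
enum { AU, SHIFTER, NUM_KINDS };

/* Up to two predecessors per operation; -1 means none. */
static const int preds[NUM_OPS][2] = {
    {-1, -1}, {-1, -1}, {ABS_A, ABS_B}, {ABS_A, ABS_B}, {MAX1, -1},
    {MIN, -1}, {MAX1, SHR3}, {SHR1, SUB}, {ADD, MAX1},
};
/* Unit kind needed by each operation. */
static const int kind[NUM_OPS] = { AU, AU, AU, AU, SHIFTER, SHIFTER, AU, AU, AU };
/* Assumed mobilities (ALAP - ASAP) for unit-delay operations. */
static const int mobility[NUM_OPS] = { 0, 0, 0, 1, 0, 1, 0, 0, 0 };

/* List scheduling: returns the number of control steps used and
 * stores the step of each operation in step_of (steps start at 1). */
int list_schedule(const int units[NUM_KINDS], int step_of[NUM_OPS])
{
    int done = 0, step = 0;
    assert(units[AU] > 0 && units[SHIFTER] > 0);
    for (int i = 0; i < NUM_OPS; i++) step_of[i] = 0;   /* 0 = unscheduled */
    while (done < NUM_OPS) {
        int free_units[NUM_KINDS];
        step++;
        for (int k = 0; k < NUM_KINDS; k++) free_units[k] = units[k];
        for (;;) {
            int best = -1;
            for (int i = 0; i < NUM_OPS; i++) {
                if (step_of[i] != 0 || free_units[kind[i]] == 0) continue;
                int ready = 1;   /* ready if all predecessors finished earlier */
                for (int k = 0; k < 2; k++) {
                    int p = preds[i][k];
                    if (p >= 0 && (step_of[p] == 0 || step_of[p] >= step)) ready = 0;
                }
                if (ready && (best < 0 || mobility[i] < mobility[best])) best = i;
            }
            if (best < 0) break;            /* nothing more fits in this step */
            step_of[best] = step;
            free_units[kind[best]]--;
            done++;
        }
    }
    return step;
}

/* Helper: schedule length for a given number of units of each kind. */
int sra_rc_length(int num_au, int num_shifters)
{
    int units[NUM_KINDS], step_of[NUM_OPS];
    units[AU] = num_au;
    units[SHIFTER] = num_shifters;
    return list_schedule(units, step_of);
}
```

With a single arithmetic unit and two shifters this sketch needs seven control steps, one more than the unconstrained critical path, because the two abs operations can no longer share a step. The slide's RC schedule may differ in detail, since priorities can be recomputed as operations are placed.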

TC Scheduling

[Figure panels: ASAP schedule; ALAP schedule; TC schedule.]

Distribution Graphs for TC Scheduling

[Figure: distribution graphs showing the probability sum per state for AU units and shift units: the initial probability distribution graph, and the graph after several operations were scheduled.]

Distribution Graphs for TC Scheduling

[Figure: distribution graphs showing the probability sum per state for AU units and shift units: an intermediate graph after further operations, including -, min, >>3, and >>1, were scheduled, and the distribution graph for the final schedule.]
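A distribution graph of this kind can be computed by spreading each operation uniformly over its ASAP-to-ALAP range and summing the resulting probabilities per state and unit type. The sketch below assumes a time constraint of six steps and the ASAP/ALAP values that follow from unit-delay SRA operations. These numbers and all identifiers are assumptions for illustration and are not read from the slide's graphs.

```c
#include <assert.h>

#define NUM_OPS   9
#define NUM_STEPS 6     /* assumed time constraint */
enum { AU, SHIFTER, NUM_KINDS };

typedef struct { const char *name; int kind; int asap; int alap; } Op;

/* Assumed ASAP/ALAP steps for unit-delay SRA operations. */
static const Op sra_ops[NUM_OPS] = {
    {"|a|", AU, 1, 1}, {"|b|", AU, 1, 1}, {"max", AU, 2, 2},
    {"min", AU, 2, 3}, {">>3", SHIFTER, 3, 3}, {">>1", SHIFTER, 3, 4},
    {"-",   AU, 4, 4}, {"+",   AU, 5, 5}, {"max", AU, 6, 6},
};

/* dg[k][s] = expected number of units of kind k busy in step s (1-based). */
void distribution_graph(const Op *ops, int n, double dg[NUM_KINDS][NUM_STEPS + 1])
{
    for (int k = 0; k < NUM_KINDS; k++)
        for (int s = 0; s <= NUM_STEPS; s++) dg[k][s] = 0.0;
    for (int i = 0; i < n; i++) {
        assert(1 <= ops[i].asap && ops[i].asap <= ops[i].alap
               && ops[i].alap <= NUM_STEPS);
        /* uniform probability over the operation's mobility range */
        double p = 1.0 / (ops[i].alap - ops[i].asap + 1);
        for (int s = ops[i].asap; s <= ops[i].alap; s++) dg[ops[i].kind][s] += p;
    }
}

/* Helper for the SRA example. */
double sra_expected_usage(int unit_kind, int step)
{
    double dg[NUM_KINDS][NUM_STEPS + 1];
    assert(unit_kind >= 0 && unit_kind < NUM_KINDS);
    assert(step >= 1 && step <= NUM_STEPS);
    distribution_graph(sra_ops, NUM_OPS, dg);
    return dg[unit_kind][step];
}
```

A time-constrained scheduler would then place operations so that the peaks of this graph are flattened, for instance by moving min out of the step it shares with max, which lowers the number of units needed.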

Interface Synthesis
- Combine process and channel codes
- HW and protocol clock cycles may differ
- Insert a bus-interface component
- Communication in three parts:
  - Freely schedulable code: scheduled with process code
  - Schedule constrained code: MAC driver for selected bus interface
  - Bus interface: implemented by bus interface component from library

Bus Interface Controller (1)

[Figure: processor controller and datapath connected through ready, ack, OutCntrl, OutAddr, OutData, and InData to a bus interface controller (input logic, output logic, write queue, read queue, MAC driver) that drives the bus signals REQUEST, GRANT, CONTROL, ADDRESS, and DATA.]

Bus Interface Controller (2)

[Figure: the same structure as in part (1), with an FSMD fragment for a bus write that waits on ready, assigns OutAddr = BusAddr, OutData = BusData, and OutCntrl = WRITE_WORD, and handshakes with ack before the MAC driver runs the bus protocol.]

Transducer/Bridge
- Translates one protocol into another
- Controller1 receives data with protocol1 and writes into queue
- Controller2 reads from queue and sends data with protocol2

[Figure: PE1 on Bus1 and PE2 on Bus2 connected by a transducer; each side has a processor, memory, and controller in its own clock domain with interrupt, ready, ack, and data signals, and the two controllers share a queue clocked separately.]
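The queue at the heart of a transducer can be modeled as a bounded FIFO: one controller writes words as they arrive under the first protocol, and the other drains them at its own pace under the second. The sketch below is a software model under simplifying assumptions: a single shared loop stands in for the separate clocks, the reader runs at half the writer's rate, and the depth and all names are chosen for illustration.

```c
#include <assert.h>

#define QUEUE_DEPTH 8   /* assumed depth */

typedef struct {
    unsigned data[QUEUE_DEPTH];
    unsigned head, tail, count;
} Queue;

/* Controller 1 side: returns 0 (caller must stall) if the queue is full. */
int queue_write(Queue *q, unsigned word)
{
    assert(q != 0);
    if (q->count == QUEUE_DEPTH) return 0;
    q->data[q->tail] = word;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return 1;
}

/* Controller 2 side: returns 0 if there is nothing to send yet. */
int queue_read(Queue *q, unsigned *word)
{
    assert(q != 0 && word != 0);
    if (q->count == 0) return 0;
    *word = q->data[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return 1;
}

/* Pushes n words through the queue with a fast writer and a reader that
 * is active every other cycle; returns 1 if all words arrive in order. */
int transducer_preserves_order(unsigned n)
{
    Queue q = { {0}, 0, 0, 0 };
    unsigned sent = 0, received = 0, cycle = 0, word = 0;
    while (received < n) {
        if (sent < n && queue_write(&q, sent)) sent++;   /* stalls when full */
        if (cycle % 2 == 1 && queue_read(&q, &word)) {
            if (word != received) return 0;
            received++;
        }
        cycle++;
    }
    return 1;
}
```

The full and empty return codes play the role that ready and ack signals play in hardware: they stop the faster side from overrunning the slower one.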

Conclusions
- Synthesis techniques
  - Variable merging (storage sharing)
  - Operation merging (FU sharing)
  - Connection merging (bus sharing)
- Architecture techniques
  - Chaining and multi-cycling
  - Data and control pipelining
  - Forwarding and caching
- Scheduling
  - Metric constrained scheduling
- Interfacing
  - Part of HW component
  - Bus interface unit
- If too complex, use partial order

EE382V: Embedded System Design and Modeling

EE382V: Embedded System Design and Modeling EE382V: Embedded System Design and - Introduction Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu : Outline Introduction Embedded systems System-level

More information

Digital Systems Design

Digital Systems Design Digital Systems Design Digital Systems Design and Test Dr. D. J. Jackson Lecture 1-1 Introduction Traditional digital design Manual process of designing and capturing circuits Schematic entry System-level

More information

RISC Central Processing Unit

RISC Central Processing Unit RISC Central Processing Unit Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Spring, 2014 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/

More information

EE382V: Embedded System Design and Modeling

EE382V: Embedded System Design and Modeling EE382V: Embedded System Design and System-Level Design Tools Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu : Outline Overview System-level design

More information

Lecture Topics. Announcements. Today: Pipelined Processors (P&H ) Next: continued. Milestone #4 (due 2/23) Milestone #5 (due 3/2)

Lecture Topics. Announcements. Today: Pipelined Processors (P&H ) Next: continued. Milestone #4 (due 2/23) Milestone #5 (due 3/2) Lecture Topics Today: Pipelined Processors (P&H 4.5-4.10) Next: continued 1 Announcements Milestone #4 (due 2/23) Milestone #5 (due 3/2) 2 1 ISA Implementations Three different strategies: single-cycle

More information

Hardware Implementation of Automatic Control Systems using FPGAs

Hardware Implementation of Automatic Control Systems using FPGAs Hardware Implementation of Automatic Control Systems using FPGAs Lecturer PhD Eng. Ionel BOSTAN Lecturer PhD Eng. Florin-Marian BÎRLEANU Romania Disclaimer: This presentation tries to show the current

More information

REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND.

REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND. December 3-6, 2018 Santa Clara Convention Center CA, USA REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND. https://tmt.knect365.com/risc-v-summit @risc_v ACCELERATING INFERENCING ON THE EDGE WITH RISC-V

More information

Policy-Based RTL Design

Policy-Based RTL Design Policy-Based RTL Design Bhanu Kapoor and Bernard Murphy bkapoor@atrenta.com Atrenta, Inc., 2001 Gateway Pl. 440W San Jose, CA 95110 Abstract achieving the desired goals. We present a new methodology to

More information

Evolution of DSP Processors. Kartik Kariya EE, IIT Bombay

Evolution of DSP Processors. Kartik Kariya EE, IIT Bombay Evolution of DSP Processors Kartik Kariya EE, IIT Bombay Agenda Expected features of DSPs Brief overview of early DSPs Multi-issue DSPs Case Study: VLIW based Processor (SPXK5) for Mobile Applications

More information

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor Disseny físic Disseny en Standard Cells Enric Pastor Rosa M. Badia Ramon Canal DM Tardor 2005 DM, Tardor 2005 1 Design domains (Gajski) Structural Processor, memory ALU, registers Cell Device, gate Transistor

More information

RECONFIGURABLE RADIO DESIGN AND VERIFICATION

RECONFIGURABLE RADIO DESIGN AND VERIFICATION RECONFIGURABLE RADIO DESIGN AND VERIFICATION September, 10, 2015 Vladimir Ivanov, LG Electronics Markus Mueck, Intel Corporation Seungwon Choi, Hanyang University DVCON 2015 Bangalore, India OUTLINE Reconfigurable

More information

Architecture and Synthesis for Multi-Cycle On-Chip Communication

Architecture and Synthesis for Multi-Cycle On-Chip Communication Architecture and Synthesis for MultiCycle OnChip Communication Jason Cong VLSI CAD Lab Computer Science Department University of California, Los Angeles cong@cs cs.ucla.edu http://cadlab cadlab.cs.ucla.edu

More information

Computer Architecture

Computer Architecture Computer Architecture An Introduction Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/

More information

7/11/2012. Single Cycle (Review) CSE 2021: Computer Organization. Multi-Cycle Implementation. Single Cycle with Jump. Pipelining Analogy

7/11/2012. Single Cycle (Review) CSE 2021: Computer Organization. Multi-Cycle Implementation. Single Cycle with Jump. Pipelining Analogy CSE 2021: Computer Organization Single Cycle (Review) Lecture-10 CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan CSE-2021 July-12-2012 2 Single Cycle with Jump Multi-Cycle Implementation

More information

Lecture 4&5 CMOS Circuits

Lecture 4&5 CMOS Circuits Lecture 4&5 CMOS Circuits Xuan Silvia Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese566/ Worst-Case V OL 2 3 Outline Combinational Logic (Delay Analysis) Sequential Circuits

More information

Computer Aided Design of Electronics

Computer Aided Design of Electronics Computer Aided Design of Electronics [Datorstödd Elektronikkonstruktion] Zebo Peng, Petru Eles, and Nima Aghaee Embedded Systems Laboratory IDA, Linköping University www.ida.liu.se/~tdts01 Electronic Systems

More information

Pipelining A B C D. Readings: Example: Doing the laundry. Ann, Brian, Cathy, & Dave. each have one load of clothes to wash, dry, and fold

Pipelining A B C D. Readings: Example: Doing the laundry. Ann, Brian, Cathy, & Dave. each have one load of clothes to wash, dry, and fold Pipelining Readings: 4.5-4.8 Example: Doing the laundry Ann, Brian, Cathy, & Dave A B C D each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes

More information

Low Power Design Methods: Design Flows and Kits

Low Power Design Methods: Design Flows and Kits JOINT ADVANCED STUDENT SCHOOL 2011, Moscow Low Power Design Methods: Design Flows and Kits Reported by Shushanik Karapetyan Synopsys Armenia Educational Department State Engineering University of Armenia

More information

Chapter 4. Pipelining Analogy. The Processor. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop:

Chapter 4. Pipelining Analogy. The Processor. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Chapter 4 The Processor Part II Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup p = 2n/(0.5n + 1.5) 4 =

More information

Datorstödd Elektronikkonstruktion

Datorstödd Elektronikkonstruktion Datorstödd Elektronikkonstruktion [Computer Aided Design of Electronics] Zebo Peng, Petru Eles and Gert Jervan Embedded Systems Laboratory IDA, Linköping University http://www.ida.liu.se/~tdts80/~tdts80

More information

CS 110 Computer Architecture Lecture 11: Pipelining

CS 110 Computer Architecture Lecture 11: Pipelining CS 110 Computer Architecture Lecture 11: Pipelining Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on

More information

(VE2: Verilog HDL) Software Development & Education Center

(VE2: Verilog HDL) Software Development & Education Center Software Development & Education Center (VE2: Verilog HDL) VLSI Designing & Integration Introduction VLSI: With the hardware market booming with the rise demand in chip driven products in consumer electronics,

More information

A B C D. Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold. Time

A B C D. Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold. Time Pipelining Readings: 4.5-4.8 Example: Doing the laundry A B C D Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes

More information

VLSI Design Verification and Test Delay Faults II CMPE 646

VLSI Design Verification and Test Delay Faults II CMPE 646 Path Counting The number of paths can be an exponential function of the # of gates. Parallel multipliers are notorious for having huge numbers of paths. It is possible to efficiently count paths in spite

More information

Run-Length Based Huffman Coding

Run-Length Based Huffman Coding Chapter 5 Run-Length Based Huffman Coding This chapter presents a multistage encoding technique to reduce the test data volume and test power in scan-based test applications. We have proposed a statistical

More information

IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps

IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps CSE 30321 Computer Architecture I Fall 2010 Homework 06 Pipelined Processors 85 points Assigned: November 2, 2010 Due: November 9, 2010 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (25 points)

More information

Lecture 1. Tinoosh Mohsenin

Lecture 1. Tinoosh Mohsenin Lecture 1 Tinoosh Mohsenin Today Administrative items Syllabus and course overview Digital systems and optimization overview 2 Course Communication Email Urgent announcements Web page http://www.csee.umbc.edu/~tinoosh/cmpe650/

More information

2002 IEEE International Solid-State Circuits Conference 2002 IEEE

2002 IEEE International Solid-State Circuits Conference 2002 IEEE Outline 802.11a Overview Medium Access Control Design Baseband Transmitter Design Baseband Receiver Design Chip Details What is 802.11a? IEEE standard approved in September, 1999 12 20MHz channels at 5.15-5.35

More information

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION Sinan Yalcin and Ilker Hamzaoglu Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla,

More information

IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps

IF ID EX MEM WB 400 ps 225 ps 350 ps 450 ps 300 ps CSE 30321 Computer Architecture I Fall 2011 Homework 06 Pipelined Processors 75 points Assigned: November 1, 2011 Due: November 8, 2011 PLEASE DO THE ASSIGNMENT ON THIS HANDOUT!!! Problem 1: (15 points)

More information

VLSI System Testing. Outline

ECE 538 VLSI System Testing, Krish Chakrabarty. System-on-Chip (SOC) Testing. Outline: Motivation for modular testing of SOCs. Wrapper design. IEEE 1500 Standard. Optimization Test

CSE502: Computer Architecture CSE 502: Computer Architecture

Out-of-Order Schedulers. Data-Capture Scheduler. Dispatch: read available operands from ARF/ROB, store in scheduler. Commit: Missing operands filled in from bypass. Issue: When

Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen

GIGA seminar 11.1.2010. janne.janhunen@ee.oulu.fi. Outline: Introduction Benefits and Challenges

Pipelined Processor Design

COE 38 Computer Architecture. Prof. Muhamed Mudawar, Computer Engineering Department, King Fahd University of Petroleum and Minerals. Presentation Outline: Pipelining versus Serial

FPGA & Pulse Width Modulation. Digital Logic. Programing the FPGA 7/23/2015. Time Allotment During the First 14 Weeks of Our Advanced Lab Course

[Plot residue removed; recoverable labels: DAC, Vin] 7/23/2015. Sigma Delta Pulse Width Modulated

Tomasulo's Algorithm

Fall 2007, Prof. Thomas Wenisch, http://www.eecs.umich.edu/courses/eecs470 [Figure residue removed; recoverable labels: Floating Point Buffers (FLB), Storage Bus, Control, Result]

Low Power Design Part I Introduction and VHDL design. Ricardo Santos LSCAD/FACOM/UFMS

Ricardo Santos, ricardo@facom.ufms.br, LSCAD/FACOM/UFMS. Motivation for Low Power Design: Low power design is important for three different reasons. Device

Hardware-Software Co-Design Cosynthesis and Partitioning

EE8205: Embedded Computer Systems, http://www.ee.ryerson.ca/~courses/ee8205/ Dr. Gul N. Khan, http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

ESE535: Electronic Design Automation. Previously. Today. Precedence. Conclude. Precedence Constrained

Day 5: January, 2013. Scheduling Variants and Approaches. Penn ESE535 Spring 2013 -- DeHon. Previously: Resources aren't free. Share to reduce costs. Schedule operations

An Efficent Real Time Analysis of Carry Select Adder

Geetika Gesu, Department of Electronics Engineering, Abha Gaikwad-Patil College of Engineering, Nagpur, Maharashtra, India. E-mail: geetikagesu@gmail.com

EE 434 ASIC and Digital Systems. Prof. Dae Hyun Kim School of Electrical Engineering and Computer Science Washington State University.

Preliminaries. VLSI Design: System Specification, Functional Design, RTL

Lecture 8: Peak Power Reduction. CSCE 6730 Advanced VLSI Systems. Instructor: Saraju P. Mohanty, Ph. D.

NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors

CSE 30321 Lecture 12: Introduction to Pipelining

Suggested Readings: H&P, Chapter 4.5-4.7 (over the next 3-4 lectures). Example: We have to build x cars... Each car takes 6 steps to build...

Data Word Length Reduction for Low-Power DSP Software

EE382C: LITERATURE SURVEY, APRIL 2, 2004. Kyungtae Han. Abstract: The increasing demand for portable computing accelerates the study of minimizing power

ECE473 Computer Architecture and Organization. Pipeline: Introduction

Lecturer: Prof. Yifeng Zhu. Fall, 2015. Portions of these slides are derived from: Dave Patterson UCB Lec 11.1. The Laundry Analogy: Student A,

Advanced FPGA Design. Tinoosh Mohsenin CMPE 491/691 Spring 2012

Today: Administrative items. Syllabus and course overview. Digital signal processing overview. Course Communication: Email, Urgent announcements

Out-of-Order Execution. Register Renaming. Nima Honarmand

Out-of-Order (OOO) Execution (1): Essence of OOO execution is Dynamic Scheduling. Dynamic scheduling: processor hardware determines instruction execution

Advanced Digital Logic Design

Advanced Digital Logic Design Using VHDL, State Machines, and Synthesis for FPGAs. Sunggu Lee. Cengage Learning. Australia, Canada, Mexico, Singapore, Spain, United Kingdom, United States

CMOS VLSI IC Design. A decent understanding of all tasks required to design and fabricate a chip takes years of experience

Commonly used keywords. INTEGRATED CIRCUIT (IC): many transistors on one chip. VERY

Chapter 16 - Instruction-Level Parallelism and Superscalar Processors

Luis Tarrataca, luis.tarrataca@gmail.com, CEFET-RJ. Table of Contents: 1 Overview

Mapping Multiplexers onto Hard Multipliers in FPGAs

Peter Jamieson and Jonathan Rose, The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto. Modern FPGAs Consist

Simulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka

Abstract: Virtual prototyping is becoming increasingly important to embedded software developers, engineers, managers

EECE 321: Computer Organization

Mohammad M. Mansour, Dept. of Electrical and Computer Engineering, American University of Beirut. Lecture 21: Pipelining. Processor Pipelining: Same principles can be applied to

Lecture 4: Introduction to Pipelining

Pipelining: Laundry Example. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold. Washer takes 30 minutes. Dryer takes 40 minutes. Folder

Pipelined Architecture (2A) Young Won Lim 4/10/18

Copyright (c) 2014-2018 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2

Pipelined Architecture (2A) Young Won Lim 4/7/18

Copyright (c) 2014-2018 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2

Stratix II Filtering Lab

October 2004, ver. 1.0. Application Note 362. Introduction: The filtering reference design provided in the DSP Development Kit, Stratix II Edition, shows you how to use the Altera DSP Builder for system design,

EECS 470. Tomasulo's Algorithm. Lecture 4 Winter 2018

Slides developed in part by Profs. Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Martin, Roth, Shen, Smith, Sohi, Tyson, Vijaykumar, and Wenisch of Carnegie Mellon University,

Low-Power CMOS VLSI Design

范倫達, Ph. D., Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall, 2017. ldvan@cs.nctu.edu.tw http://www.cs.nctu.tw/~ldvan/ Outline: Introduction

Computer Science 246. Advanced Computer Architecture. Spring 2010 Harvard University. Instructor: Prof. David Brooks

dbrooks@eecs.harvard.edu. Lecture Outline: Instruction-Level Parallelism. Scoreboarding (A.8). Instruction Level Parallelism

D16550 IP Core. Configurable UART with FIFO v. 2.25

2017. COMPANY OVERVIEW: Digital Core Design is a leading IP Core provider and a System-on-Chip design house. The company was founded in 1999

EECS150 - Digital Design Lecture 28 Course Wrap Up. Recap 1

Dec. 5, 2013. Prof. Ronald Fearing, Electrical Engineering and Computer Sciences, University of California, Berkeley (slides courtesy of Prof. John Wawrzynek)

Digital Integrated Circuit Design

Lecture 13: Building Blocks (Multipliers). Register, Adder, Shift Register. Adib Abrishamifar, EE Department, IUST. Acknowledgement: This lecture note has been summarized and categorized

CHAPTER 4 FIELD PROGRAMMABLE GATE ARRAY IMPLEMENTATION OF FIVE LEVEL CASCADED MULTILEVEL INVERTER

4.1 INTRODUCTION: The Field Programmable Gate Array (FPGA) is a high performance data processing general

Course Outcome of M.Tech (VLSI Design)

PVL108: Device Physics and Technology. The students are able to: 1. Understand the basic physics of semiconductor devices and the basic theory of the PN junction. 2.

Stratix Filtering Reference Design

December 2004, ver. 3.0. Application Note 245. Introduction: The filtering reference designs provided in the DSP Development Kit, Stratix Edition, and in the DSP Development

Chapter 3 Describing Logic Circuits Dr. Xu

Chapter 3 Objectives. Selected areas covered in this chapter: Operation of truth tables for AND, NAND, OR, and NOR gates, and the NOT (INVERTER) circuit. Boolean

ECOM 4311 Digital System Design using VHDL. Chapter 9 Sequential Circuit Design: Practice

Outline: 1. Poor design practice and remedy 2. More counters 3. Register as fast temporary storage 4. Pipelined circuit

Compiler Optimisation

6 Instruction Scheduling. Hugh Leather, IF 1.18a, hleather@inf.ed.ac.uk, Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh, 2018. Introduction: This

Three-Stage Coil Gun

Final Project Report, December 8, 2006. E155. Dan Pivonka and Michael Pugh. Abstract: A coil gun is an electronic gun that fires a projectile by means of the magnetic field generated when

Method We follow- How to Get Entry Pass in SEMICODUCTOR Industries for 2 nd year engineering students

FIG-2 Winter/Summer Training Level 1 (Basic & Mandatory) & Level 1.1 continues. Winter/Summer Training

Understanding Engineers #2

The graduate with a Science degree asks, "Why does it work?" The graduate with an Engineering degree asks, "How does it work?" The graduate with an Accounting degree asks,

AN EFFICIENT APPROACH TO MINIMIZE POWER AND AREA IN CARRY SELECT ADDER USING BINARY TO EXCESS ONE CONVERTER

K. RAMAMOORTHY 1, T. CHELLADURAI 2, V. MANIKANDAN 3. 1 Department of Electronics and Communication

SW simulation and Performance Analysis

In Multi-Processing Embedded Systems. Eugenio Villar, University of Cantabria. Context: HW/SW Embedded Systems Design Flow, HW/SW Simulation, Performance Analysis, Design

Chapter 1 Introduction

1.1 Introduction: There are many possible facts because of which the power efficiency is becoming important consideration. The most portable systems used in recent era, which are

CZ3001 ADVANCED COMPUTER ARCHITECTURE

Lab 3 Report. Abstract: Pipelining is a process in which successive steps of an instruction sequence are executed in turn by a sequence of modules able to operate concurrently,

Digital Systems Design

Clock Networks and Phase Lock Loops on Altera Cyclone V Devices. Dr. D. J. Jackson. Lecture 9-1. Global Clock Network & Phase-Locked Loops: Clock management is important within digital

Microprocessor & Interfacing Lecture Programmable Interval Timer

Lecture 30: 8254 Programmable Interval Timer. PARUL BANSAL, ASST PROFESSOR, ECS DEPARTMENT, DRONACHARYA COLLEGE OF ENGINEE

Dual-K Versus Dual-T Technique for Gate Leakage Reduction: A Comparative Perspective

S. P. Mohanty, R. Velagapudi and E. Kougianos, Dept of Computer Science and Engineering, University of North Texas

A FFT/IFFT Soft IP Generator for OFDM Communication System

Tsung-Han Tsai, Chen-Chi Peng and Tung-Mao Chen, Department of Electrical Engineering, National Central University, Chung-Li, Taiwan. Abstract: -

Clock-Powered CMOS: A Hybrid Adiabatic Logic Style for Energy-Efficient Computing

Nestoras Tzartzanis and Bill Athas, nestoras@isi.edu, athas@isi.edu, http://www.isi.edu/acmos, Information Sciences Institute

L9: Analog Building Blocks (OpAmps, A/D, D/A)

Courtesy of Dave Wentzloff. Used with permission. Introduction to Operational Amplifiers [Figure residue removed; recoverable labels: DC Model, LM741 Pinout, 10 to 15V] Typically very

Coverage Metrics. UC Berkeley EECS 219C. Wenchao Li

Wenchao Li, EECS 219C, UC Berkeley. Outline of the lecture: Why do we need coverage metrics? Criteria for a good coverage metric. Different approaches to define coverage metrics. Different

L9: Analog Building Blocks (OpAmps, A/D, D/A)

Acknowledgement: Dave Wentzloff. Introduction to Operational Amplifiers. DC Model: Typically very high input resistance ~ 300KΩ. High DC gain

Digital Signal Processing for an Integrated Power-Meter

49. Internationales Wissenschaftliches Kolloquium, Technische Universität Ilmenau, 27.-30. September 2004. Borisav Jovanović / Milunka Damnjanović / Predrag Petković

Dynamic Scheduling I

Basic pipeline started with single, in-order issue, single-cycle operations. Have extended this basic pipeline with multi-cycle operations, multiple issue (superscalar). Now: dynamic scheduling (out-of-order

L15: VLSI Integration and Performance Transformations

[Figure: average cost of one transistor, in dollars. Acknowledgement: Gordon Moore, Keynote Presentation at ISSCC 2003]

CS250 VLSI Systems Design. Lecture 3: Physical Realities: Beneath the Digital Abstraction, Part 1: Timing

Fall 2010. Krste Asanovic, John Wawrzynek, with John Lazzaro and Yunsup Lee (TA). What do Computer

ENHANCING SPEED AND REDUCING POWER OF SHIFT AND ADD MULTIPLIER

ZUBER M. PATEL, S V National Institute of Technology, Surat, Gujarat, India. E-mail: zuber_patel@rediffmail.com Abstract- This paper presents

Introduction to Simulation of Verilog Designs. 1 Introduction. For Quartus II 11.1

1 Introduction: An effective way of determining the correctness of a logic circuit is to simulate its behavior. This tutorial provides an

EE241 - Spring 2004 Advanced Digital Integrated Circuits. Announcements. Borivoje Nikolic. Lecture 15 Low-Power Design: Supply Voltage Scaling

Announcements: Homework #2 due today. Midterm project reports due next Thursday

EE19D Digital Electronics. Lecture 1: General Introduction

What are we going to discuss? Some Definitions. Digital and Analog Quantities. Binary Digits, Logic Levels and Digital Waveforms. Introduction to

Technology Timeline. Transistors ICs (General) SRAMs & DRAMs Microprocessors SPLDs CPLDs ASICs. FPGAs. The Design Warrior s Guide to.

CMPE 415. [Figure: technology timeline, 1945 to 2000, covering Transistors, ICs (General), SRAMs & DRAMs, Microprocessors, SPLDs, CPLDs, ASICs, FPGAs]

Exploiting Regularity for Low-Power Design

Reprint from Proceedings of the International Conference on Computer-Aided Design, 1996. Renu Mehra and Jan Rabaey, Department of Electrical Engineering and Computer

Controller Implementation--Part I. Cascading Edge-triggered Flip-Flops

Alternative controller FSM implementation approaches based on: Classical Moore and Mealy machines. Time state: Divide and Counter. Jump counters. Microprogramming (ROM) based

Mohit Arora. The Art of Hardware Architecture. Design Methods and Techniques. for Digital Circuits. Springer

Contents: 1 The World of Metastability. 1.1 Introduction. 1.2 Theory of Metastability. 1.3 Metastability

EECS 470 Lecture 5. Intro to Dynamic Scheduling (Scoreboarding) Fall 2018 Jon Beaumont

http://www.eecs.umich.edu/courses/eecs470 Many thanks to Prof. Martin and Roth of University of Pennsylvania for most of these slides.

Computer Architecture and Organization:

L03: Register transfer and System Bus. By: A. H. Abdul Hafez, Abdul.hafez@hku.edu.tr, ah.abdulhafez@gmail.com, CE Dept. HKU. Outlines

Switch/ Jumper Table 1-1: Factory Settings Factory Settings (Jumpers Installed) Function Controlled Activates pull-up/ pull-down resistors on Port 0 digital P7 I/O lines Activates pull-up/ pull-down resistors

L15: VLSI Integration and Performance Transformations

Acknowledgement: Materials in this lecture are courtesy of the following sources and are used with permission. Curt Schurgers, J. Rabaey, A. Chandrakasan,