CSE241 VLSI Digital Circuits Winter Lecture 06: Timing


1 CSE241 VLSI Digital Circuits Winter 2003 Lecture 06: Timing CSE241 L3 ASICs.1 Kahng & Cichy, UCSD 2003

2 This Class + Logistics. Timing topics: flip-flop timing, clock distribution, clock tree synthesis. Reading: white papers on static timing analysis, papers on clock tree synthesis. Lab #2 due date: Monday, January 27th. Slide courtesy of S. P. Levitan, U. Pittsburgh.

3 Review. Static timing analysis (Lecture 4): pin-based timing graph; directed acyclic graph (DAG) of timing arcs; the longest path in a DAG is found in time linear in the number of arcs (edges); slack = required arrival time - actual arrival time (long path analysis). Logic synthesis (Lecture 5). Slide courtesy of S. P. Levitan, U. Pittsburgh.

4 Static Analysis vs. Dynamic Analysis. Why static analysis when dynamic simulation is more accurate? Drawbacks of simulation: it requires input vectors (stimuli for the circuit) and has long runtimes. Example: calculate the worst-case rising delay from a to z; there is an exponential explosion with the number of possible design input states. [Figure: circuit with inputs a, b, c and output z; the a-z delay depends on the side inputs: b=0, c=0 gives delay1; b=0, c=1 gives delay2; b=1, c=0 gives delay3; b=1, c=1 gives delay4.]

5 STA Terminology. (Actual) arrival time (AAT, or AT) = time at which a pin switches state; usually the 50% point on the voltage curve, i.e., $AT = t_{50}$. Slew time = time over which the signal switches; usually the difference between the 10% and 90% points on the voltage curve, i.e., $t_{slew} = t_{90} - t_{10}$. Required arrival time (RAT) = time at which a signal must arrive in order to avoid a chip fail. Slack = RAT - AAT; positive slack is good (= margin), negative slack is bad. [Figure: voltage (up to Vdd) versus time.]
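
The late-mode definitions above (longest-path arrival times, slack = RAT - AAT) can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular timing tool: the function name `late_mode_sta`, the arc list, and the numbers are all hypothetical, and primary inputs without a given arrival time are assumed to arrive at time 0.

```python
from collections import defaultdict, deque

def late_mode_sta(arcs, pi_at, rat):
    """Longest-path arrival times on a timing DAG, then slack = RAT - AAT.

    arcs  : list of (src_pin, dst_pin, delay) timing arcs (hypothetical data)
    pi_at : arrival times at primary inputs (missing sources default to 0)
    rat   : required arrival times at the pins of interest
    """
    fanout = defaultdict(list)
    indeg = defaultdict(int)
    pins = set(pi_at)
    for src, dst, delay in arcs:
        fanout[src].append((dst, delay))
        indeg[dst] += 1
        pins.update((src, dst))

    # Start from pins with no incoming arcs and visit each arc exactly once,
    # which is why the run time is linear in the number of arcs.
    at = {p: pi_at.get(p, 0) for p in pins if indeg[p] == 0}
    ready = deque(at)
    while ready:
        pin = ready.popleft()
        for dst, delay in fanout[pin]:
            at[dst] = max(at.get(dst, float("-inf")), at[pin] + delay)
            indeg[dst] -= 1
            if indeg[dst] == 0:
                ready.append(dst)

    slack = {p: rat[p] - at[p] for p in rat}
    return at, slack

arcs = [("a", "n1", 1), ("b", "n1", 2), ("n1", "z", 3), ("c", "z", 5)]
at, slack = late_mode_sta(arcs, {"a": 0, "b": 0, "c": 0}, {"z": 4})
print(at["z"], slack["z"])
```

With these made-up delays the output pin z has a negative slack, i.e., the signal arrives later than required.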

6 Example: What is slack at PO? [Figure: timing graph annotated with gate delays (d) and arrival times (at), starting from at=0 at the primary inputs; the arrival time at the primary output is at=11 against rat=10.] Slack = -1.

7 Example: Incremental Timing Analysis. [Figure: the same timing graph after a delay change; the arrival time at the primary output becomes at=10 (previously at=11) against rat=10.] Slack = 0. The amount of work is bounded by the sizes of the fanin and fanout cones of logic.
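
The bound on incremental work can be illustrated with a small worklist sketch. It covers only the arrival-time half of the update (the fanout cone); required times would be re-propagated backward through the fanin cone in the same way. The function name `incremental_update`, the data layout, and the numbers are assumptions made for this example.

```python
from collections import deque

def incremental_update(at, fanin, fanout, start_pins):
    """Recompute arrival times only where they can actually change.

    at         : current arrival times, updated in place
    fanin[pin] : list of (src_pin, delay) arcs into pin
    fanout[pin]: list of pins driven by pin
    start_pins : pins whose incoming arc delays were modified
    Returns the set of pins whose arrival time changed.
    """
    changed = set()
    work = deque(start_pins)
    while work:
        pin = work.popleft()
        new_at = max(at[src] + delay for src, delay in fanin[pin])
        if new_at != at[pin]:
            at[pin] = new_at
            changed.add(pin)
            # Only the fanout of a pin that really changed is revisited.
            work.extend(fanout.get(pin, []))
    return changed

fanin = {"n1": [("a", 1), ("b", 2)], "z": [("n1", 3), ("c", 5)]}
fanout = {"n1": ["z"]}
at = {"a": 0, "b": 0, "c": 0, "n1": 2, "z": 5}

fanin["n1"] = [("a", 1), ("b", 1)]   # the arc b -> n1 gets faster
print(incremental_update(at, fanin, fanout, ["n1"]), at["z"])
```

Here the change at n1 is re-evaluated at z, but z is dominated by the path from c, so propagation stops there.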

8 Early-Mode Analysis. Definitions change as follows: RAT = lower bound on arrival time; propagate shortest possible instead of longest possible delays; Slack = Arrival - Required. Example: negative slack because $AT_c$ is too small (early). [Figure: small circuit with pins a, b, c, x, y annotated with arrival times and slacks, including $AT_c = 0$ with $SL_c = 0 - 1 = -1$, and $RAT_x = 2$, $AT_x = 1$ with $SL_x = 1 - 2 = -1$.]
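
A minimal sketch of the early-mode variant, assuming the pins are already listed in topological order: the only changes from late mode are `min` instead of `max` and the reversed sign of the slack. The function name, the tiny circuit, and its delays are hypothetical and are not the circuit from the slide's figure.

```python
def early_mode_slack(topo_order, fanin, pi_at, rat):
    """Shortest-path arrival times and early-mode slack = AT - RAT.

    topo_order : pins in topological order (assumed to be given)
    fanin[pin] : list of (src_pin, delay) arcs into pin
    pi_at      : arrival times at primary inputs
    rat        : lower bounds on arrival time at the pins of interest
    """
    at = dict(pi_at)
    for pin in topo_order:
        if pin in fanin:
            # Early mode keeps the earliest (smallest) arrival.
            at[pin] = min(at[src] + delay for src, delay in fanin[pin])
    slack = {pin: at[pin] - rat[pin] for pin in rat}
    return at, slack

fanin = {"x": [("a", 1), ("b", 3)]}
at, slack = early_mode_slack(["a", "b", "x"], fanin, {"a": 0, "b": 0}, {"x": 2})
print(at["x"], slack["x"])
```

In this example the short path through a makes the signal at x arrive one time unit too early, so the slack is negative.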

9 Enhancements of STA. Incremental timing analysis. Nanometer-scale process effects: variation (motivating probabilistic timing analysis). Interference: crosstalk. Multiple inputs switching. Conservatism of delay propagation. HW #8: Suppose you change the size of one (combinational) gate in your design, thus invalidating the previous timing analysis. How much work must be done to regain a correct timing analysis? Courtesy K. Keutzer et al., UCB.

10 Timing Correction. Driven by STA: incremental performance analysis backplane. Fix electrical violations: resize cells, buffer nets, copy (clone) cells. Fix timing problems: local transforms (bag of tricks), path-based transforms. DAC-2002, Physical Chip Implementation.

11 Local Synthesis Transforms. Resize cells; buffer or clone to reduce load on critical nets; decompose large cells; swap connections on commutative pins or among equivalent nets; move critical signals forward; pad early paths; area recovery. DAC-2002, Physical Chip Implementation.

12 Transform Example: Double Inverter Removal. [Figure: before removal, delay = 4; after removal, delay = 2.] DAC-2002, Physical Chip Implementation.

13 Resizing. [Figure: a gate with inputs a, b drives fanouts d, e, f; plot of delay d versus load for cell sizes A, B, C.] DAC-2002, Physical Chip Implementation.

14 Cloning. [Figure: a gate with inputs a, b drives fanouts d, e, f, g, h; the gate is cloned into copies A and B that share the fanouts; plot of delay d versus load for cell sizes A, B, C.] DAC-2002, Physical Chip Implementation.

15 Buffering. [Figure: a gate with inputs a, b drives fanouts d, e, f, g, h with loads of 0.2 each; a buffer B with input load 0.1 is inserted to drive part of the fanout; plot of delay d versus load for cell sizes A, B, C.] DAC-2002, Physical Chip Implementation.

16 Redesign Fan-in Tree. Input arrival times: Arr(a)=4, Arr(b)=3, Arr(c)=1, Arr(d)=0. [Figure: the original tree over a, b, c, d gives Arr(e)=6; the redesigned tree of unit-delay gates combines c and d first, then b, then a, and gives Arr(e)=5.] DAC-2002, Physical Chip Implementation.
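
The numbers on this slide can be reproduced with a Huffman-style greedy that always combines the two earliest-arriving signals. This is a sketch under the assumption of 2-input gates with a uniform delay of 1; the slide does not say which algorithm the authors' tool uses, and the function names are hypothetical.

```python
import heapq

def fanin_tree_arrival(arrivals, gate_delay=1):
    """Output arrival time of a 2-input gate tree built greedily.

    Repeatedly merges the two earliest signals, so late-arriving
    inputs end up close to the root and see few gate delays.
    """
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        first = heapq.heappop(heap)
        second = heapq.heappop(heap)
        heapq.heappush(heap, max(first, second) + gate_delay)
    return heap[0]

def balanced_tree_arrival(a, b, c, d, gate_delay=1):
    """Output arrival time of the balanced tree (a, b) and (c, d)."""
    left = max(a, b) + gate_delay
    right = max(c, d) + gate_delay
    return max(left, right) + gate_delay

print(balanced_tree_arrival(4, 3, 1, 0))   # balanced tree
print(fanin_tree_arrival([4, 3, 1, 0]))    # skewed tree
```

With the slide's arrival times the balanced tree yields 6 and the greedy tree yields 5, matching Arr(e)=6 and Arr(e)=5.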

17 Redesign Fan-out Tree. [Figure: before redesign, longest path = 5; after redesign, longest path = 4.] Note the slowdown of the buffer due to load. DAC-2002, Physical Chip Implementation.

18 Decomposition. DAC-2002, Physical Chip Implementation.

19 Swap Commutative Pins. [Figure: gate with commutative input pins a, b, c annotated with arrival times and pin delays.] Simple sorting on arrival times and delays works. DAC-2002, Physical Chip Implementation.
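
The sorting rule can be sketched as follows: give the latest-arriving signal the fastest pin. The function name, pin names, and numbers are hypothetical and are not taken from the slide's figure.

```python
def assign_pins(signal_arrivals, pin_delays):
    """Pair the latest-arriving signals with the fastest commutative pins.

    signal_arrivals : {signal: arrival time}
    pin_delays      : {pin: pin-to-output delay}
    Returns (assignment, output arrival time).
    """
    signals = sorted(signal_arrivals, key=signal_arrivals.get, reverse=True)
    pins = sorted(pin_delays, key=pin_delays.get)
    assignment = dict(zip(signals, pins))
    out = max(signal_arrivals[s] + pin_delays[p] for s, p in assignment.items())
    return assignment, out

arrivals = {"a": 0, "b": 1, "c": 2}
delays = {"p0": 2, "p1": 1, "p2": 0}
print(assign_pins(arrivals, delays))
```

In this example the sorted assignment gives an output arrival time of 2, whereas connecting a, b, c to p2, p1, p0 in that order would give 4.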

20 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

21 Why Clocks? Clocks provide the means to synchronize. By allowing events to happen at known timing boundaries, we can sequence these events. This greatly simplifies the building of state machines: there is no need to worry about variable delay through combinational logic (CL), since all signals are delayed until the clock edge (the clock imposes the worst-case delay). [Figure: FSM (combinational logic with a feedback register) and dataflow (register, combinational logic, register).]

22 Clock Cycle Time. Cycle time is determined by the delay through the CL. The signal must arrive before the latching edge; if it is too late, it waits until the next cycle, and synchronization and sequential order become incorrect. $t_{cycle} > t_{prop\_delay} + t_{overhead}$. The circuit architecture can be changed to obtain a smaller $t_{cycle}$: pipelining, parallelism.

23 Pipelining. For dataflow: instead of a long critical path, split the critical path into chunks and insert registers to store intermediate results. This allows 2 waves of data to coexist within the CL. Can we extend this ad infinitum? Overhead eventually limits the pipelining (e.g., 1.5 to 2 gate delays for a latch or FF), and granularity limits it as well (minimum time quantum: the delay of a gate). Unpipelined: $t_{cycle} > t_{pd} + t_{overhead}$. Pipelined: $t_{cycle} > \max(t_{pd1}, t_{pd2}) + t_{overhead}$. [Figure: register, CL computing A+B with delay $t_{pd}$, register; pipelined version with CL A (delay $t_{pd1}$) and CL B (delay $t_{pd2}$) separated by a register.]
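
The two cycle-time bounds, and the way overhead limits deeper pipelining, can be checked with a short calculation. The function names and the delay values (in arbitrary gate-delay units) are made up for illustration; the functions return the lower bound that the cycle time must exceed.

```python
def cycle_time_bound(stage_delays, t_overhead):
    """Lower bound on t_cycle: slowest stage plus register overhead."""
    return max(stage_delays) + t_overhead

def equal_split_bound(t_pd, n_stages, t_overhead):
    """Same bound when the logic splits into n equal stages (an idealization)."""
    return t_pd / n_stages + t_overhead

print(cycle_time_bound([10], 2))       # unpipelined
print(cycle_time_bound([5, 5], 2))     # balanced two-stage split
print(cycle_time_bound([6, 4], 2))     # unbalanced split: the slower stage sets the bound
print(equal_split_bound(10, 10, 2))    # ten stages: overhead is a large share of the cycle
```

Going from one stage to two cuts the bound from 12 to 7, but going to ten stages only reaches 3, of which 2 units are pure overhead.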

24 Parallelism. For FSMs: the same functionality and performance can be achieved at half the clock rate. However, the input and output signals must be doubled (to account for the outputs of each original cycle). Instead of doubling the delay, the delay of the optimized logic is often logarithmically related to the degree of parallelism. $t_{cycle1} > t_{pd} + t_{ov}$; $t_{cycle2} > N t_{pd} + t_{ov}$; $t_{cycle3} > \log(N t_{pd}) + t_{ov}$. [Figure: original M-bit CL with delay $t_{pd}$ and a register; two cascaded M-bit CL blocks; optimized 2*M-bit CL with a register.]

25 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

26 Storage Elements. Latches: level-sensitive; transparent when the clock is high (H), hold when it is low (L). Flip-flops: edge-triggered; data is sampled at the clock edge. [Figure: latch and flip-flop symbols with waveforms of d, ck, q; the flip-flop has internal node p_q and complementary clock ckb.]

27 Latch and Flip-Flop Gates. [Figure: transistor-level schematics of an active-high latch and a rising-edge flip-flop, with inputs D and clock and outputs Q and QN, built from clocked stages with enable, in, and out terminals.] Latch and flip-flop schematics from TSMC 0.13um LV Artisan Sage-X Standard Cell Library.

28 Latch and Flip-Flop Behavior. [Figure: signal paths from D to Q and QN in the active-high latch and the rising-edge flip-flop, when the clock is high and when the clock is low.] Latch: $t_{DQ}$, 2 inverter delays. Flip-flop: $t_{CQ}$, 4 inverter delays.

29 Clock Skew and Jitter. [Figure: clock distributed to points A and B, with waveforms at A and B showing (a) clock skew $t_{sk,AB}$, (b) duty cycle jitter $t_{duty}$ on the high time $T_{high}$, and (c) cycle-to-cycle edge jitter $t_j$ (marked as $t_j/2$ on either side) on the period $T$.]

30 Flip-Flop Timing Characteristics. [Figure: rising-edge flip-flops A and B separated by combinational logic, driven by a non-ideal clock. Setup time constraint: waveform marking $t_{CQ,max}$, $t_{comb,max}$, $t_{su}$, and $t_{sk} + t_j$ within the period $T$. Hold time constraint: waveform marking $t_{CQ,min}$, $t_{comb,min}$, $t_{sk}$, and $t_h$.]

31 Latch Setup Time and Transparency. [Figure: active-high latches A and B separated by combinational logic, driven by a non-ideal clock. Setup time constraint: waveform marking $t_{CQ}$, $t_{comb,max}$, $t_{su}$, $t_{duty}$, and $t_{sk} + t_j$; in transparent operation the path is marked $t_{DQ}$, $t_{comb}$, $t_{DQ}$.] No penalty to clock period for setup time constraint!

32 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

33 Setup Time. Important characteristics of storage elements: setup time, hold time, clock-to-q delay. Setup time, $t_{su}$: the time before the clock edge by which the data must arrive in order for the new data to be stored. The setup time for a F/F occurs before the latching edge; the setup time for a latch occurs before the transition from transparent to hold. [Figure: waveforms of d, ck, q with $t_{setup}$ marked.]

34 Hold Time. A second important characteristic is the hold time, $t_h$: the time after the clock edge that the data must remain stable in order for the data to be properly held. Note that hold time (and setup time) can be negative. Why isn't hold time just the negative of setup time? Storage elements typically have some data dependence: capacitances and devices may be faster for one data value than for another. Specify the worst case over process technology and operating condition variations. [Figure: waveforms of d, ck, q with $t_{hold}$ marked.]

35 Clocking Overhead. Inherent delay in any storage element. The delay is measured from the clock transition to the output data transition, $t_{c2q}$, or from the input data transition to the output data transition, $t_{d2q}$. A flip-flop is edge-triggered: the overhead is $t_{c2q} + t_{su}$. A latch is level-sensitive: the overhead is $t_{d2q}$. [Figure: waveforms of d, ck, q with $t_{c2q}$ and $t_{d2q}$ marked.]

36 Clock Skew. Most high-profile of clock network metrics. Maximum difference in arrival times of the clock signal at any 2 latches/FFs fed by the network: $\text{Skew} = \max |t_1 - t_2|$. [Figure: clock source (e.g., PLL) driving CLK1 and CLK2; waveforms versus time showing the latency and the skew between arrival times $t_1$ and $t_2$.] Fig. from Zarkesh-Ha. Sylvester / Shepard.

37 Clock Skew Causes. Designed (unavoidable) variations: mismatch in buffer load sizes, interconnect lengths. Process variation: process spread across the die yields different $L_{eff}$, $T_{ox}$, etc. values. Temperature gradients: change MOSFET performance across the die. IR voltage drop in the power supply: changes MOSFET performance across the die. Note: the delay from the clock generator to the fan-out points (clock latency) is not important by itself, BUT increased latency leads to larger skew for the same amount of relative variation. Sylvester / Shepard.

38 Clock Jitter. Clock network delay uncertainty: from one clock cycle to the next, the period is not exactly the same each time. The maximum difference in phase of the clock between any two periods is the jitter. It must be considered in max path (setup) timing; typically O(50ps) for high-end designs. Sylvester / Shepard.

39 Clock Jitter Causes. PLL oscillation frequency. Various noise sources affecting clock generation and distribution; e.g., power supply noise dynamically alters the drive strength of intermediate buffer stages. Jitter is reduced by minimizing IR and L*(di/dt) noise. Courtesy Cypress Semi. Sylvester / Shepard.

40 Clocking Methodology (Edge-Triggered). [Figure: flip-flop, combinational logic, flip-flop, with clock period $t_{per}$.] $\max(t_{pd}) < t_{per} - t_{su} - t_{c2q} - t_{skew}$; otherwise the delay is too long for the data to be captured. $\min(t_{pd}) > t_h - t_{c2q} + t_{skew}$; otherwise the delay is too short and data can race through, skipping a state.
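
The two inequalities translate directly into a setup and hold check. The sketch below returns the margin on each side, where a positive value means the constraint is met; the function name and all timing values (in ps) are hypothetical.

```python
def check_edge_triggered(t_pd_max, t_pd_min, t_per, t_su, t_h, t_c2q, t_skew):
    """Margins for the edge-triggered constraints on this slide.

    setup margin = (t_per - t_su - t_c2q - t_skew) - max(t_pd)
    hold margin  = min(t_pd) - (t_h - t_c2q + t_skew)
    """
    setup_margin = (t_per - t_su - t_c2q - t_skew) - t_pd_max
    hold_margin = t_pd_min - (t_h - t_c2q + t_skew)
    return setup_margin, hold_margin

# Illustrative numbers in picoseconds.
setup, hold = check_edge_triggered(
    t_pd_max=750, t_pd_min=20, t_per=1000, t_su=50, t_h=30, t_c2q=100, t_skew=50
)
print(setup, hold)
```

Note that the period appears only in the setup margin, so a hold failure cannot be repaired by slowing the clock.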

41 Example of $t_{pdmax}$ Violation. Suppose there is skew between the registers in a dataflow (rega after regb). i gets its input values from rega at the transition in Ck. The CL output o arrives after the Ck transition due to skew. To correct this problem, the cycle time can be increased. [Figure: rega, combinational logic, regb, with waveforms of Ck, i, o; o arrives $t_{pdmax}$ after i, past the clock edge shifted by $t_{skew}$: too late!]

42 Example of $t_{pdmin}$ Violation: Race Through. Suppose clock skew causes rega to be clocked before regb. i passes through the CL with little delay ($t_{pdmin}$). o arrives before the rising Ck causes the data to be latched. This problem cannot be fixed by changing the frequency: you have a rock instead of a chip. [Figure: rega, combinational logic, regb, with waveforms of Ck, i, o; o arrives $t_{pdmin}$ after i, before the clock edge shifted by $t_{skew}$: too early!]

43 Time Borrowing (Cycle Stealing). Cycle stealing with flip-flops using delayed clocks: a stage can have $t_{pd} > t_{per}$; $t_{pd}$ is safely $> t_{pdmin}$; intentional delay = skew. [Figure: flip-flop, combinational logic, flip-flop, with a delayed Ck on the capturing flip-flop.] Time borrowing with latches: $t_{pd} < t_{per} + t_w$; give it back in later stages. [Figure: latch, combinational logic, latch, combinational logic, clocked by Ck.]

44 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

45 Clock Distribution. General goal of clock distribution: deliver the clock to all memory elements with acceptable skew, and deliver clock edges with acceptable sharpness. Clocking network design is one of the greatest challenges in the design of a large chip. Clocks are generally distributed via wiring trees (and meshes): low-resistance interconnect to minimize delay; multiple drivers to distribute driver requirements; use optimal sizing principles to design buffers; clock lines can create significant crosstalk.

46 Clock Distribution Problem Statement. Objective: minimum skew (performance and hold time issues); minimum cell area and metal use; (sometimes) minimal latency; (sometimes) particular latency; (sometimes) intermixed gating for power reduction; (sometimes) hold to a particular duty cycle, e.g., 50 percent. Subject to: process variation from lot to lot; process variation across the die; radically different loading (FF density) around the die; metal variation across the die; power variation across the die (both static IR and dynamic); coupling (same and other layers).

47 Issues in Clock Distribution Network Design. Skew: process, voltage, and temperature; data dependence; noise coupling; load balancing. Power: $CV^2 f$ (no ½ or α); clock gating. Flexibility/tunability. Compactness: fit into existing layout/design. Reliability: electromigration.

48 Skew: Clock Delay Varies With Position.

49 Clock Distribution Methods. RC-tree: less capacitance, more accuracy, flexible wiring. Grids: reliable, less data dependency, tunable (late in design). Shown here for final-stage drivers driving F/F loads.

50 RC-Trees. H-tree, X-tree, binary tree. Asymmetric trees can be and are used due to uneven sink distribution, hard macros in the floorplan (hence hierarchical clock distribution), etc.; the basic goal is to have even RC delays.

51 Grids. Gridded clock distribution was common on earlier DEC Alpha microprocessors. Advantages: skew is determined by grid density and is not too sensitive to load position; clock signals are available everywhere; tolerant to process variations; usually yields extremely low skew values. [Figure: predrivers feeding a global grid.] Disadvantages: huge amount of wiring and power; to minimize such penalties, the grid pitch must be made coarser, which loses the grid advantage. Sylvester / Shepard.

52 Trees. H-tree (Bakoglu): one large central driver, recursive structure to match wirelengths; halve the wire width at branching points to reduce reflections. Disadvantages: slew degradation along long RC paths; unrealistically large central driver (clock drivers can create large temperature gradients, e.g., Alpha ~30 °C); non-uniform load distribution; inherently non-scalable (wire R growth). Partial solution: intermediate buffers at branching points. Courtesy of P. Zarkesh-Ha. Sylvester / Shepard.

53 Buffered Tree. [Figure: PLL driving global buffers WGBuf, NGBuf, SGBuf, EGBuf, followed by L2 and L3 buffer levels; the final level drives all clock loads within its region, and other branches feed other regions of the chip.] Sylvester / Shepard.

54 Buffered H-tree. Advantages: ideally zero-skew; can be low power (depending on skew requirements); low area (silicon and wiring); CAD tool friendly (regular). Disadvantages: sensitive to process variations; local clocking loads are inherently non-uniform. Sylvester / Shepard.

55 Tree Balancing. Some techniques: a) introduce dummy loads; b) snaking of wirelength to match delays. Con: routing area is often more valuable than silicon. Sylvester / Shepard.

56 Examples of Distribution. H-tree, asymmetric RC-tree (IBM). Grids: DEC [Alphas]. Serpentines: Intel x86 [Young ISSCC97].

57 Examples From Processor Chips. [Figures: DEC-Alpha clock spines; DEC-Alpha RC delays; DEC-Alpha RC delays for global distribution (spine + grid); DEC-Alpha RC local delays.]

58 ReShape Clocks Example. Balanced, shielded H-tree for pre-clock distribution. Mesh for block-level distribution.

59 Pre-clock 2-Level H-tree. All routes 5-6u M6/5, shielded with 1u grounds. ~10 buffers per node. The output mesh must hit every subblock. [Figure label: output mesh.]

60 Block Level Mesh (.18u). Clumps of 1-6 clock buffers, surrounded by capacitor pads. Shielded input and output m6 shorting straps. Pre-clock connects to input shorting straps. 1u m5 ribs every u (4 to 6 rows). Max 600u stride.

61 Problems with Meshes. Burn more power at low frequencies. Block more routing resources (solution: integrated power distribution with ribs can provide shielding for free). Difficult for spare clock domains that will not tolerate regioning. Post-placement (and routing) tuning required. No beneficial skew (shudder) possible.

62 Problems with Meshes (#2). Clock gating is only easy at the root. Fighting tools to do analysis: clumped buffers are a problem in static timing analysis tools; large shorted meshes are a problem for STA tools; full extractions and SPICE-like simulation (e.g., Avant! Star-Sim) are needed to determine skew.

63 Benefits of Meshes (#3). Deterministic, since shielded all the way down to rib distribution. No ECO placement required: all buffers are preplaced before block placement. Low latency, since it uses shorted drivers, and therefore lower skew. ECO placements of FFs later do not require rebalancing of the tree. Idealized clocking environment for concurrent RTL design and the timing convergence dance.

64 Mesh Example. ~100k flops, 6 blocks.

65 Clock Skew Thermal Map. Pre-tuning.

66 Clock Skew Thermal Map #2. 50ps block / 100ps global skew, post tuning.

67 Alternative Clock Network Strategy. Globally: tree. Power requirements are reduced relative to a global grid. Smaller routing requirements, which frees up global tracks. Trees are balanced easily at the global level. Keeps global skew low (with minimal process variation). Sylvester / Shepard.

68 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

69 Skew Reduction Using Package. Most clock network latency occurs at the global level (largest distances spanned). Latency leads to skew. With reverse scaling, routing low-RC signals at the global level becomes more difficult and area-consuming. Sylvester / Shepard.

70 Skew Reduction Using Package. [Figure: µP/ASIC flip-chip mounted through solder bumps on a substrate that carries the system clock.] Incorporate global clock distribution into the package. Flip-chip packaging allows for high-density, low-parasitic access from the substrate to the IC. The RC of package-level wiring is up to 4 orders of magnitude smaller than that of on-chip wiring. Global skew is reduced. Lower capacitance gives lower power. Opens up global routing tracks. Results not yet conclusive. Sylvester / Shepard.

71 Useful Skew (= cycle-stealing). [Figure: three flip-flops with a fast path followed by a slow path, and the hold and setup timing slacks of each path, under zero skew and under useful skew.] Zero skew: global skew constraint; all skew is bad. Useful skew: local skew constraints; shift slack to critical paths. W. Dai, UC Santa Cruz.

72 Skew = Local Constraint. Timing is correct as long as the signal arrives in the permissible skew range, where D is the longest path delay and d is the shortest path delay between the two FFs: $-d + t_{hold} < \text{Skew} < T_{period} - D - t_{setup}$. Below the lower bound there is a race condition; inside the permissible range the design is safe; above the upper bound there is a cycle time violation. W. Dai, UC Santa Cruz.

73 Skew Scheduling for Design Robustness. The design will be more robust if the clock signal arrival time is in the middle of the permissible skew range, rather than on the edge. [Figure: three FFs with path delays of 2 ns and 6 ns and T = 6 ns; one skew schedule is at the verge of violation, the other has more safety margin.] W. Dai, UC Santa Cruz.
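
The permissible range and the "middle of the range" choice can be sketched for a single pair of flip-flops. The function names and the numbers (in ns) are hypothetical; a real skew scheduler would solve these constraints jointly for all flip-flop pairs, which this sketch does not attempt.

```python
def permissible_skew_range(d_min, d_max, t_hold, t_setup, t_period):
    """Open interval (lower, upper) of skew values that keep timing correct.

    lower = -d + t_hold              (below this: race condition)
    upper = T_period - D - t_setup   (above this: cycle time violation)
    """
    lower = -d_min + t_hold
    upper = t_period - d_max - t_setup
    if lower >= upper:
        raise ValueError("no permissible skew exists for this path pair")
    return lower, upper

def centered_skew(lower, upper):
    """Skew target in the middle of the range, for equal margin on both sides."""
    return (lower + upper) / 2

low, high = permissible_skew_range(d_min=1.0, d_max=4.0, t_hold=0.5, t_setup=0.5, t_period=6.0)
print(low, high, centered_skew(low, high))
```

For these values the range is (-0.5, 1.5) ns, so zero skew is legal but sits closer to the race-condition edge than the centered target of 0.5 ns.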

74 Potential Advantages of Useful Skew. Reduce peak current consumption by distributing the FF switch points within the range of permissible skew. [Figure: CLK under 0-skew and under U-skew.] Can exploit the extra margin to increase clock frequency or reduce sizing (= power). W. Dai, UC Santa Cruz.

75 Conventional Zero-Skew Flow. Synthesis; placement; 0-skew clock synthesis; clock routing; signal routing; extraction & delay calculation; static timing analysis. W. Dai, UC Santa Cruz.

76 Useful-Skew Flow. Existing placement; U-skew clock synthesis; permissible range generation; initial skew scheduling; clock tree topology synthesis; clock net routing; clock routing; clock timing verification; signal routing; extraction & delay calculation; static timing analysis. W. Dai, UC Santa Cruz.

77 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

78 Clock Power. Power consumption in clocks is due to: clock drivers; long interconnections; large clock loads (all clocked elements, i.e., latches and FFs, are driven). Different components dominate depending on the type of clock network used; e.g., for a grid, huge pre-drivers and wire cap. drown out load cap. Sylvester / Shepard.

79 Clock Power Is LARGE. $P = \alpha C V_{dd}^2 f$. Not only is the clock capacitance large, it switches every cycle! Sylvester / Shepard.
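
A quick numeric illustration of the formula, assuming an activity factor of 1 for the clock since it switches every cycle. The capacitance, supply voltage, and frequency below are made-up round numbers, not data for any real chip, and the function name is hypothetical.

```python
def dynamic_power(c_farads, vdd_volts, f_hz, alpha=1.0):
    """Dynamic power P = alpha * C * Vdd^2 * f, in watts."""
    return alpha * c_farads * vdd_volts ** 2 * f_hz

clock_power = dynamic_power(c_farads=1e-9, vdd_volts=1.5, f_hz=1e9)
# The same capacitance on a data net with an assumed 10% activity factor.
data_power = dynamic_power(c_farads=1e-9, vdd_volts=1.5, f_hz=1e9, alpha=0.1)
print(f"clock: {clock_power:.2f} W, data: {data_power:.3f} W")
```

With 1 nF of clock capacitance at 1.5 V and 1 GHz the clock dissipates about 2.25 W, ten times what the same capacitance would dissipate at 10% activity.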

80 Low-Power Clocking. Gated clocks: prevent switching in areas of the chip not being used; easier in static designs. Edge-triggered flops in ARM rather than transparent latches in Alpha: reduced load on the clock for each latch/flop; eliminated spurious power-consuming transitions during latch flow-through. Sylvester / Shepard.

81 Clock Area. Clock networks consume silicon area (clock drivers, PLL, etc.) and routing area. Routing area is most vital. Top-level metals are used to reduce RC delays; these levels are precious resources (unscaled), shared by power routing, clock routing, and key global signals. Reducing area also reduces wiring capacitance and power. Typical numbers: Intel Itanium, 4% of M4/5 used in clock routing. Sylvester / Shepard.

82 Clock Slew Rates. To maintain signal integrity and latch performance, minimum slew rates are required. Too slow: the clock is more susceptible to noise, latches are slowed down, setup times eat into the timing budget [T setup = * T slew (ps)], and there is more short-circuit power for large clock drivers. Too fast: burns too much power, overdesigned network, enhanced ground bounce. Rule of thumb: $T_{rise}$ and $T_{fall}$ of the clock are each between 10% and 20% of the clock period (10% is an aggressive target). 1 GHz clock; T rise = T fall = ps. Sylvester / Shepard.
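
Applying the 10% to 20% rule of thumb to a given clock frequency is simple arithmetic; the helper below is a hypothetical sketch that works it out for the 1 GHz case mentioned on the slide.

```python
def slew_targets(f_hz, low_fraction=0.10, high_fraction=0.20):
    """Rise/fall time window as a fraction of the clock period, in seconds."""
    period = 1.0 / f_hz
    return low_fraction * period, high_fraction * period

aggressive, relaxed = slew_targets(1e9)
print(f"{aggressive * 1e12:.0f} ps to {relaxed * 1e12:.0f} ps")
```

For a 1 GHz clock (1 ns period) the rule gives a window of roughly 100 ps to 200 ps for each of the rise and fall times.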

83 Example: Alpha. Grid + H-tree approach. Power = 32% of total. Wire usage = 3% of metals 3 & 4. 4 major clock quadrants, each with a large driver connected to local grid structures. Sylvester / Shepard.

84 Alpha Skew Map. Ref: Compaq, ASP-DAC00. Sylvester / Shepard.

85 Power vs. Skew. Fundamental design decision. Meeting skew requirements is easy with an unlimited power budget. Wide wires reduce the RC product but increase total C. Driver upsizing reduces latency (and so reduces skew as well) but increases buffer cap. SOC context: plastic package power limit is 2-3 W. Sylvester / Shepard.

86 Clock Distribution Trends. Timing: clock period dropping fast, skew must follow; slew rates must also scale with cycle time. Jitter: PLLs get better with CMOS scaling, but other sources of noise increase (power supply noise more important; switching-dependent temperature gradients). Materials: Cu reduces RC slew degradation and potential skew; low-k decreases power and improves latency, skew, and slews. Power: complexity, dynamic logic, and pipelining mean more clock sinks; larger chips mean bigger clock networks. Sylvester / Shepard.

87 Outline: clocking; storage elements; clocking metrics and methodology; clock distribution; package and useful-skew degrees of freedom; clock power issues; gate timing models.

88 Gate Timing Characterization. [Figure: gate with inputs A, B, D, output F, and load capacitance $C_L$.] Extract exact transistor characteristics from layout: transistor width, length, junction area and perimeter; local wire length and inter-wire distance. Compute all transistor and wire capacitances.

89 Cell Timing Characterization. Delay tables are generated using a detailed transistor-level circuit simulator, SPICE (a differential-equations solver). For a number of different input slews and load capacitances, simulate the circuit of the cell and measure: propagation time (50% Vdd at input to 50% Vdd at output); output slew (10% Vdd at output to 90% Vdd at output). [Figure: voltage (up to Vdd) versus time, marking $t_{slew}$ and $t_{pd}$.]

90 Non-linear effects reflected in tables. $D_G = f(C_L, S_{in})$ and $S_{out} = f(C_L, S_{in})$ are non-linear. Interpolate between table entries; the interpolation error is usually below 10% of SPICE. [Figure: two tables indexed by input slew and output capacitance, one giving intrinsic delay (delay at the gate) and one giving output slew (resulting waveform).]
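
Interpolating between table entries is commonly done bilinearly over the two axes. The sketch below assumes that scheme; the slide does not specify the interpolation method, and the function name, table values, and units are invented for illustration. Points outside the table are linearly extrapolated from the nearest cell.

```python
from bisect import bisect_right

def interp_table(slews, loads, table, s_in, c_load):
    """Bilinear interpolation in a delay (or output slew) table.

    slews, loads : ascending axis values (input slew, load capacitance)
    table[i][j]  : characterized value at slews[i], loads[j]
    """
    def bracket(axis, x):
        # Index pair (lo, hi) of the cell containing x, and the fraction inside it.
        hi = min(max(bisect_right(axis, x), 1), len(axis) - 1)
        lo = hi - 1
        return lo, hi, (x - axis[lo]) / (axis[hi] - axis[lo])

    i0, i1, u = bracket(slews, s_in)
    j0, j1, v = bracket(loads, c_load)
    row0 = table[i0][j0] * (1 - v) + table[i0][j1] * v
    row1 = table[i1][j0] * (1 - v) + table[i1][j1] * v
    return row0 * (1 - u) + row1 * u

slews = [0.0, 1.0]            # hypothetical input slew axis
loads = [0.0, 2.0]            # hypothetical load capacitance axis
delay = [[10, 20], [30, 40]]  # hypothetical characterized delays
print(interp_table(slews, loads, delay, 0.5, 1.0))
```

At the center of this single-cell table the result is the average of the four corner entries, 25.0.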

91 Conservatism of Gate Delay Modeling. True gate delay depends on input arrival time patterns. STA will assume that only 1 input is switching and will use the worst slope among several inputs. [Figure: gate with inputs A, B, D, output F, and load $C_L$; voltage waveforms (up to Vdd) versus time comparing $t_{pd}$ from the inputs to F when A and B both switch and when only A switches.]


More information

DIGITAL INTEGRATED CIRCUITS A DESIGN PERSPECTIVE 2 N D E D I T I O N

DIGITAL INTEGRATED CIRCUITS A DESIGN PERSPECTIVE 2 N D E D I T I O N DIGITAL INTEGRATED CIRCUITS A DESIGN PERSPECTIVE 2 N D E D I T I O N Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolic CONTENTS PART I: THE FABRICS Chapter 1: Introduction (32 pages) 1.1 A Historical

More information

High-speed Serial Interface

High-speed Serial Interface High-speed Serial Interface Lect. 9 Noises 1 Block diagram Where are we today? Serializer Tx Driver Channel Rx Equalizer Sampler Deserializer PLL Clock Recovery Tx Rx 2 Sampling in Rx Interface applications

More information

Low-Power Digital CMOS Design: A Survey

Low-Power Digital CMOS Design: A Survey Low-Power Digital CMOS Design: A Survey Krister Landernäs June 4, 2005 Department of Computer Science and Electronics, Mälardalen University Abstract The aim of this document is to provide the reader with

More information

Managing Cross-talk Noise

Managing Cross-talk Noise Managing Cross-talk Noise Rajendran Panda Motorola Inc., Austin, TX Advanced Tools Organization Central in-house CAD tool development and support organization catering to the needs of all design teams

More information

Lecture 10. Circuit Pitfalls

Lecture 10. Circuit Pitfalls Lecture 10 Circuit Pitfalls Intel Corporation jstinson@stanford.edu 1 Overview Reading Lev Signal and Power Network Integrity Chandrakasen Chapter 7 (Logic Families) and Chapter 8 (Dynamic logic) Gronowski

More information

Preface to Third Edition Deep Submicron Digital IC Design p. 1 Introduction p. 1 Brief History of IC Industry p. 3 Review of Digital Logic Gate

Preface to Third Edition Deep Submicron Digital IC Design p. 1 Introduction p. 1 Brief History of IC Industry p. 3 Review of Digital Logic Gate Preface to Third Edition p. xiii Deep Submicron Digital IC Design p. 1 Introduction p. 1 Brief History of IC Industry p. 3 Review of Digital Logic Gate Design p. 6 Basic Logic Functions p. 6 Implementation

More information

Signal Integrity Management in an SoC Physical Design Flow

Signal Integrity Management in an SoC Physical Design Flow Signal Integrity Management in an SoC Physical Design Flow Murat Becer Ravi Vaidyanathan Chanhee Oh Rajendran Panda Motorola, Inc., Austin, TX Presenter: Rajendran Panda Talk Outline Functional and Delay

More information

DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers

DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers Muhammad Nummer and Manoj Sachdev University of Waterloo, Ontario, Canada mnummer@vlsi.uwaterloo.ca, msachdev@ece.uwaterloo.ca

More information

EE241 - Spring 2004 Advanced Digital Integrated Circuits. Announcements. Borivoje Nikolic. Lecture 15 Low-Power Design: Supply Voltage Scaling

EE241 - Spring 2004 Advanced Digital Integrated Circuits. Announcements. Borivoje Nikolic. Lecture 15 Low-Power Design: Supply Voltage Scaling EE241 - Spring 2004 Advanced Digital Integrated Circuits Borivoje Nikolic Lecture 15 Low-Power Design: Supply Voltage Scaling Announcements Homework #2 due today Midterm project reports due next Thursday

More information

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012 ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2012 Lecture 5: Termination, TX Driver, & Multiplexer Circuits Sam Palermo Analog & Mixed-Signal Center Texas A&M University Announcements

More information

Timing Issues in FPGA Synchronous Circuit Design

Timing Issues in FPGA Synchronous Circuit Design ECE 428 Programmable ASIC Design Timing Issues in FPGA Synchronous Circuit Design Haibo Wang ECE Department Southern Illinois University Carbondale, IL 62901 1-1 FPGA Design Flow Schematic capture HDL

More information

EECS 427 Lecture 13: Leakage Power Reduction Readings: 6.4.2, CBF Ch.3. EECS 427 F09 Lecture Reminders

EECS 427 Lecture 13: Leakage Power Reduction Readings: 6.4.2, CBF Ch.3. EECS 427 F09 Lecture Reminders EECS 427 Lecture 13: Leakage Power Reduction Readings: 6.4.2, CBF Ch.3 [Partly adapted from Irwin and Narayanan, and Nikolic] 1 Reminders CAD assignments Please submit CAD5 by tomorrow noon CAD6 is due

More information

EE241 - Spring 2013 Advanced Digital Integrated Circuits. Announcements. Lecture 13: Timing revisited

EE241 - Spring 2013 Advanced Digital Integrated Circuits. Announcements. Lecture 13: Timing revisited EE241 - Spring 2013 Advanced Digital Integrated Circuits Lecture 13: Timing revisited Announcements Homework 2 due today Quiz #2 on Monday Midterm project report due next Wednesday 2 1 Outline Last lecture

More information

Interconnect/Via CONCORDIA VLSI DESIGN LAB

Interconnect/Via CONCORDIA VLSI DESIGN LAB Interconnect/Via 1 Delay of Devices and Interconnect 2 Reduction of the feature size Increase in the influence of the interconnect delay on system performance Skew The difference in the arrival times of

More information

Analysis and Reduction of On-Chip Inductance Effects in Power Supply Grids

Analysis and Reduction of On-Chip Inductance Effects in Power Supply Grids Analysis and Reduction of On-Chip Inductance Effects in Power Supply Grids Woo Hyung Lee Sanjay Pant David Blaauw Department of Electrical Engineering and Computer Science {leewh, spant, blaauw}@umich.edu

More information

L15: VLSI Integration and Performance Transformations

L15: VLSI Integration and Performance Transformations L15: VLSI Integration and Performance Transformations Acknowledgement: Materials in this lecture are courtesy of the following sources and are used with permission. Curt Schurgers J. Rabaey, A. Chandrakasan,

More information

CPE/EE 427, CPE 527 VLSI Design I: Homeworks 3 & 4

CPE/EE 427, CPE 527 VLSI Design I: Homeworks 3 & 4 CPE/EE 427, CPE 527 VLSI Design I: Homeworks 3 & 4 1 2 3 4 5 6 7 8 9 10 Sum 30 10 25 10 30 40 10 15 15 15 200 1. (30 points) Misc, Short questions (a) (2 points) Postponing the introduction of signals

More information

EECS150 - Digital Design Lecture 19 CMOS Implementation Technologies. Recap and Outline

EECS150 - Digital Design Lecture 19 CMOS Implementation Technologies. Recap and Outline EECS150 - Digital Design Lecture 19 CMOS Implementation Technologies Oct. 31, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy

More information

Contents CONTRIBUTING FACTORS. Preface. List of trademarks 1. WHY ARE CUSTOM CIRCUITS SO MUCH FASTER?

Contents CONTRIBUTING FACTORS. Preface. List of trademarks 1. WHY ARE CUSTOM CIRCUITS SO MUCH FASTER? Contents Preface List of trademarks xi xv Introduction and Overview of the Book WHY ARE CUSTOM CIRCUITS SO MUCH FASTER? WHO SHOULD CARE? DEFINITIONS: ASIC, CUSTOM, ETC. THE 35,000 FOOT VIEW: WHY IS CUSTOM

More information

CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC

CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC 94 CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC 6.1 INTRODUCTION The semiconductor digital circuits began with the Resistor Diode Logic (RDL) which was smaller in size, faster

More information

On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital VLSI

On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital VLSI ELEN 689 606 Techniques for Layout Synthesis and Simulation in EDA Project Report On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital

More information

Towards PVT-Tolerant Glitch-Free Operation in FPGAs

Towards PVT-Tolerant Glitch-Free Operation in FPGAs Towards PVT-Tolerant Glitch-Free Operation in FPGAs Safeen Huda and Jason H. Anderson ECE Department, University of Toronto, Canada 24 th ACM/SIGDA International Symposium on FPGAs February 22, 2016 Motivation

More information

The Digital Abstraction

The Digital Abstraction The Digital Abstraction 1. Making bits concrete 2. What makes a good bit 3. Getting bits under contract Handouts: Lecture Slides L02 - Digital Abstraction 1 Concrete encoding of information To this point

More information

ECEN 720 High-Speed Links Circuits and Systems

ECEN 720 High-Speed Links Circuits and Systems 1 ECEN 720 High-Speed Links Circuits and Systems Lab4 Receiver Circuits Objective To learn fundamentals of receiver circuits. Introduction Receivers are used to recover the data stream transmitted by transmitters.

More information

ICCAD 2014 Contest Incremental Timing-driven Placement: Timing Modeling and File Formats v1.1 April 14 th, 2014

ICCAD 2014 Contest Incremental Timing-driven Placement: Timing Modeling and File Formats v1.1 April 14 th, 2014 ICCAD 2014 Contest Incremental Timing-driven Placement: Timing Modeling and File Formats v1.1 April 14 th, 2014 http://cad contest.ee.ncu.edu.tw/cad-contest-at-iccad2014/problem b/ 1 Introduction This

More information

Power Spring /7/05 L11 Power 1

Power Spring /7/05 L11 Power 1 Power 6.884 Spring 2005 3/7/05 L11 Power 1 Lab 2 Results Pareto-Optimal Points 6.884 Spring 2005 3/7/05 L11 Power 2 Standard Projects Two basic design projects Processor variants (based on lab1&2 testrigs)

More information

Welcome to 6.111! Introductory Digital Systems Laboratory

Welcome to 6.111! Introductory Digital Systems Laboratory Welcome to 6.111! Introductory Digital Systems Laboratory Handouts: Info form (yellow) Course Calendar Safety Memo Kit Checkout Form Lecture slides Lectures: Chris Terman TAs: Karthik Balakrishnan HuangBin

More information

Incorporating Variability into Design

Incorporating Variability into Design Incorporating Variability into Design Jim Farrell, AMD Designing Robust Digital Circuits Workshop UC Berkeley 28 July 2006 Outline Motivation Hierarchy of Design tradeoffs Design Infrastructure for variability

More information

UNIT-II LOW POWER VLSI DESIGN APPROACHES

UNIT-II LOW POWER VLSI DESIGN APPROACHES UNIT-II LOW POWER VLSI DESIGN APPROACHES Low power Design through Voltage Scaling: The switching power dissipation in CMOS digital integrated circuits is a strong function of the power supply voltage.

More information

EECS150 - Digital Design Lecture 15 - CMOS Implementation Technologies. Overview of Physical Implementations

EECS150 - Digital Design Lecture 15 - CMOS Implementation Technologies. Overview of Physical Implementations EECS150 - Digital Design Lecture 15 - CMOS Implementation Technologies Mar 12, 2013 John Wawrzynek Spring 2013 EECS150 - Lec15-CMOS Page 1 Overview of Physical Implementations Integrated Circuits (ICs)

More information

EECS150 - Digital Design Lecture 9 - CMOS Implementation Technologies

EECS150 - Digital Design Lecture 9 - CMOS Implementation Technologies EECS150 - Digital Design Lecture 9 - CMOS Implementation Technologies Feb 14, 2012 John Wawrzynek Spring 2012 EECS150 - Lec09-CMOS Page 1 Overview of Physical Implementations Integrated Circuits (ICs)

More information

Power Optimization of FPGA Interconnect Via Circuit and CAD Techniques

Power Optimization of FPGA Interconnect Via Circuit and CAD Techniques Power Optimization of FPGA Interconnect Via Circuit and CAD Techniques Safeen Huda and Jason Anderson International Symposium on Physical Design Santa Rosa, CA, April 6, 2016 1 Motivation FPGA power increasingly

More information

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Lecture 01: the big picture Course objective Brief tour of IC physical design

More information

EE-382M-8 VLSI II. Early Design Planning: Back End. Mark McDermott. The University of Texas at Austin. EE 382M-8 VLSI-2 Page Foil # 1 1

EE-382M-8 VLSI II. Early Design Planning: Back End. Mark McDermott. The University of Texas at Austin. EE 382M-8 VLSI-2 Page Foil # 1 1 EE-382M-8 VLSI II Early Design Planning: Back End Mark McDermott EE 382M-8 VLSI-2 Page Foil # 1 1 Backend EDP Flow The project activities will include: Determining the standard cell and custom library

More information

The Digital Abstraction

The Digital Abstraction The Digital Abstraction 1. Making bits concrete 2. What makes a good bit 3. Getting bits under contract 1 1 0 1 1 0 0 0 0 0 1 Handouts: Lecture Slides, Problem Set #1 L02 - Digital Abstraction 1 Concrete

More information

INF3430 Clock and Synchronization

INF3430 Clock and Synchronization INF3430 Clock and Synchronization P.P.Chu Using VHDL Chapter 16.1-6 INF 3430 - H12 : Chapter 16.1-6 1 Outline 1. Why synchronous? 2. Clock distribution network and skew 3. Multiple-clock system 4. Meta-stability

More information

Propagation Delay, Circuit Timing & Adder Design. ECE 152A Winter 2012

Propagation Delay, Circuit Timing & Adder Design. ECE 152A Winter 2012 Propagation Delay, Circuit Timing & Adder Design ECE 152A Winter 2012 Reading Assignment Brown and Vranesic 2 Introduction to Logic Circuits 2.9 Introduction to CAD Tools 2.9.1 Design Entry 2.9.2 Synthesis

More information

Propagation Delay, Circuit Timing & Adder Design

Propagation Delay, Circuit Timing & Adder Design Propagation Delay, Circuit Timing & Adder Design ECE 152A Winter 2012 Reading Assignment Brown and Vranesic 2 Introduction to Logic Circuits 2.9 Introduction to CAD Tools 2.9.1 Design Entry 2.9.2 Synthesis

More information

6.004 Computation Structures Spring 2009

6.004 Computation Structures Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 6.004 Computation Structures Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. The Digital Abstraction

More information

EE E6930 Advanced Digital Integrated Circuits. Spring, 2002 Lecture 7. Clocked and self-resetting logic I

EE E6930 Advanced Digital Integrated Circuits. Spring, 2002 Lecture 7. Clocked and self-resetting logic I EE E6930 Advanced Digital Integrated Circuits Spring, 2002 Lecture 7. Clocked and self-resetting logic I References CBF, Chapter 8 DP, Section 4.3.3.1-4.3.3.4 Bernstein, High-speed CMOS design styles,

More information

Semiconductor Technology Academic Research Center An RTL-to-GDS2 Design Methodology for Advanced System LSI

Semiconductor Technology Academic Research Center An RTL-to-GDS2 Design Methodology for Advanced System LSI Semiconductor Technology Academic Research Center An RTL-to-GDS2 Design Methodology for Advanced System LSI Jan. 28. 2011 Nobuyuki Nishiguchi Semiconductor Technology Advanced Research Center (STARC) ASP-DAC

More information

Ruixing Yang

Ruixing Yang Design of the Power Switching Network Ruixing Yang 15.01.2009 Outline Power Gating implementation styles Sleep transistor power network synthesis Wakeup in-rush current control Wakeup and sleep latency

More information

Chapter 8: Timing Closure

Chapter 8: Timing Closure Chapter 8 Timing Closure Original Authors: Andrew B. Kahng, Jens, Igor L. Markov, Jin Hu 1 Chapter 8 Timing Closure 8.1 Introduction 8.2 Timing Analysis and Performance Constraints 8.2.1 Static Timing

More information

cq,reg clk,slew min,logic hold clk slew clk,uncertainty

cq,reg clk,slew min,logic hold clk slew clk,uncertainty Clock Network Design for Ultra-Low Power Applications Mingoo Seok, David Blaauw, Dennis Sylvester EECS, University of Michigan, Ann Arbor, MI, USA mgseok@umich.edu ABSTRACT Robust design is a critical

More information

UNIT-III POWER ESTIMATION AND ANALYSIS

UNIT-III POWER ESTIMATION AND ANALYSIS UNIT-III POWER ESTIMATION AND ANALYSIS In VLSI design implementation simulation software operating at various levels of design abstraction. In general simulation at a lower-level design abstraction offers

More information

ECEN 720 High-Speed Links: Circuits and Systems

ECEN 720 High-Speed Links: Circuits and Systems 1 ECEN 720 High-Speed Links: Circuits and Systems Lab4 Receiver Circuits Objective To learn fundamentals of receiver circuits. Introduction Receivers are used to recover the data stream transmitted by

More information

RECENT technology trends have lead to an increase in

RECENT technology trends have lead to an increase in IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 9, SEPTEMBER 2004 1581 Noise Analysis Methodology for Partially Depleted SOI Circuits Mini Nanua and David Blaauw Abstract In partially depleted silicon-on-insulator

More information

Low Power Design in VLSI

Low Power Design in VLSI Low Power Design in VLSI Evolution in Power Dissipation: Why worry about power? Heat Dissipation source : arpa-esto microprocessor power dissipation DEC 21164 Computers Defined by Watts not MIPS: µwatt

More information

BICMOS Technology and Fabrication

BICMOS Technology and Fabrication 12-1 BICMOS Technology and Fabrication 12-2 Combines Bipolar and CMOS transistors in a single integrated circuit By retaining benefits of bipolar and CMOS, BiCMOS is able to achieve VLSI circuits with

More information

Welcome to 6.111! Introductory Digital Systems Laboratory

Welcome to 6.111! Introductory Digital Systems Laboratory Welcome to 6.111! Introductory Digital Systems Laboratory Handouts: Info form (yellow) Course Calendar Lecture slides Lectures: Ike Chuang Chris Terman TAs: Javier Castro Eric Fellheimer Jae Lee Willie

More information

L15: VLSI Integration and Performance Transformations

L15: VLSI Integration and Performance Transformations L15: VLSI Integration and Performance Transformations Average Cost of one transistor Acknowledgement: 10 1 0.1 0.01 0.001 0.0001 0.00001 $ 0.000001 Gordon Moore, Keynote Presentation at ISSCC 2003 0.0000001

More information

! Is it feasible? ! How do we decompose the problem? ! Vdd. ! Topology. " Gate choice, logical optimization. " Fanin, fanout, Serial vs.

! Is it feasible? ! How do we decompose the problem? ! Vdd. ! Topology.  Gate choice, logical optimization.  Fanin, fanout, Serial vs. ESE 570: Digital Integrated Circuits and VLSI Fundamentals Design Space Exploration Lec 18: March 28, 2017 Design Space Exploration, Synchronous MOS Logic, Timing Hazards 3 Design Problem Problem Solvable!

More information

Lecture 3, Handouts Page 1. Introduction. EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Simulation Techniques.

Lecture 3, Handouts Page 1. Introduction. EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Simulation Techniques. Introduction EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Techniques Cristian Grecu grecuc@ece.ubc.ca Course web site: http://courses.ece.ubc.ca/353/ What have you learned so far?

More information

Announcements. Advanced Digital Integrated Circuits. Project proposals due today. Homework 1. Lecture 8: Gate delays,

Announcements. Advanced Digital Integrated Circuits. Project proposals due today. Homework 1. Lecture 8: Gate delays, EE4 - Spring 008 Advanced Digital Integrated Circuits Lecture 8: Gate delays, Variability Announcements Project proposals due today Title Team members ½ page ~5 references Post it on your EECS web page

More information

ASICs Concept to Product

ASICs Concept to Product ASICs Concept to Product Synopsis This course is aimed to provide an opportunity for the participant to acquire comprehensive technical and business insight into the ASIC world. As most of these aspects

More information

Technology Timeline. Transistors ICs (General) SRAMs & DRAMs Microprocessors SPLDs CPLDs ASICs. FPGAs. The Design Warrior s Guide to.

Technology Timeline. Transistors ICs (General) SRAMs & DRAMs Microprocessors SPLDs CPLDs ASICs. FPGAs. The Design Warrior s Guide to. FPGAs 1 CMPE 415 Technology Timeline 1945 1950 1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 Transistors ICs (General) SRAMs & DRAMs Microprocessors SPLDs CPLDs ASICs FPGAs The Design Warrior s Guide

More information

Module -18 Flip flops

Module -18 Flip flops 1 Module -18 Flip flops 1. Introduction 2. Comparison of latches and flip flops. 3. Clock the trigger signal 4. Flip flops 4.1. Level triggered flip flops SR, D and JK flip flops 4.2. Edge triggered flip

More information

2009 Spring CS211 Digital Systems & Lab 1 CHAPTER 3: TECHNOLOGY (PART 2)

2009 Spring CS211 Digital Systems & Lab 1 CHAPTER 3: TECHNOLOGY (PART 2) 1 CHAPTER 3: IMPLEMENTATION TECHNOLOGY (PART 2) Whatwillwelearninthischapter? we learn in this 2 How transistors operate and form simple switches CMOS logic gates IC technology FPGAs and other PLDs Basic

More information

Amber Path FX SPICE Accurate Statistical Timing for 40nm and Below Traditional Sign-Off Wastes 20% of the Timing Margin at 40nm

Amber Path FX SPICE Accurate Statistical Timing for 40nm and Below Traditional Sign-Off Wastes 20% of the Timing Margin at 40nm Amber Path FX SPICE Accurate Statistical Timing for 40nm and Below Amber Path FX is a trusted analysis solution for designers trying to close on power, performance, yield and area in 40 nanometer processes

More information

Lecture 13: Interconnects in CMOS Technology

Lecture 13: Interconnects in CMOS Technology Lecture 13: Interconnects in CMOS Technology Mark McDermott Electrical and Computer Engineering The University of Texas at Austin 10/18/18 VLSI-1 Class Notes Introduction Chips are mostly made of wires

More information

Accurate Timing and Power Characterization of Static Single-Track Full-Buffers

Accurate Timing and Power Characterization of Static Single-Track Full-Buffers Accurate Timing and Power Characterization of Static Single-Track Full-Buffers By Rahul Rithe Department of Electronics & Electrical Communication Engineering Indian Institute of Technology Kharagpur,

More information

A 0.9 V Low-power 16-bit DSP Based on a Top-down Design Methodology

A 0.9 V Low-power 16-bit DSP Based on a Top-down Design Methodology UDC 621.3.049.771.14:621.396.949 A 0.9 V Low-power 16-bit DSP Based on a Top-down Design Methodology VAtsushi Tsuchiya VTetsuyoshi Shiota VShoichiro Kawashima (Manuscript received December 8, 1999) A 0.9

More information

Power Supply Networks: Analysis and Synthesis. What is Power Supply Noise?

Power Supply Networks: Analysis and Synthesis. What is Power Supply Noise? Power Supply Networs: Analysis and Synthesis What is Power Supply Noise? Problem: Degraded voltage level at the delivery point of the power/ground grid causes performance and/or functional failure Lower

More information

LSI and Circuit Technologies for the SX-8 Supercomputer

LSI and Circuit Technologies for the SX-8 Supercomputer LSI and Circuit Technologies for the SX-8 Supercomputer By Jun INASAKA,* Toshio TANAHASHI,* Hideaki KOBAYASHI,* Toshihiro KATOH,* Mikihiro KAJITA* and Naoya NAKAYAMA This paper describes the LSI and circuit

More information

EECS 427 Lecture 21: Design for Test (DFT) Reminders

EECS 427 Lecture 21: Design for Test (DFT) Reminders EECS 427 Lecture 21: Design for Test (DFT) Readings: Insert H.3, CBF Ch 25 EECS 427 F09 Lecture 21 1 Reminders One more deadline Finish your project by Dec. 14 Schematic, layout, simulations, and final

More information

Design Challenges in Multi-GHz Microprocessors

Design Challenges in Multi-GHz Microprocessors Design Challenges in Multi-GHz Microprocessors Bill Herrick Director, Alpha Microprocessor Development www.compaq.com Introduction Moore s Law ( Law (the trend that the demand for IC functions and the

More information

! Review: Sequential MOS Logic. " SR Latch. " D-Latch. ! Timing Hazards. ! Dynamic Logic. " Domino Logic. ! Charge Sharing Setup.

! Review: Sequential MOS Logic.  SR Latch.  D-Latch. ! Timing Hazards. ! Dynamic Logic.  Domino Logic. ! Charge Sharing Setup. ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 9: March 29, 206 Timing Hazards and Dynamic Logic Lecture Outline! Review: Sequential MOS Logic " SR " D-! Timing Hazards! Dynamic Logic "

More information

UNIVERSITY OF BOLTON SCHOOL OF ENGINEERING BENG (HONS) ELECTRICAL & ELECTRONICS ENGINEERING SEMESTER TWO EXAMINATION 2017/2018

UNIVERSITY OF BOLTON SCHOOL OF ENGINEERING BENG (HONS) ELECTRICAL & ELECTRONICS ENGINEERING SEMESTER TWO EXAMINATION 2017/2018 UNIVERSITY OF BOLTON [EES04] SCHOOL OF ENGINEERING BENG (HONS) ELECTRICAL & ELECTRONICS ENGINEERING SEMESTER TWO EXAMINATION 2017/2018 INTERMEDIATE DIGITAL ELECTRONICS AND COMMUNICATIONS MODULE NO: EEE5002

More information

ECE 551: Digital System Design & Synthesis

ECE 551: Digital System Design & Synthesis ECE 551: Digital System Design & Synthesis Lecture Set 9 9.1: Constraints and Timing 9.2: Optimization (In separate file) 03/30/03 1 ECE 551 - Digital System Design & Synthesis Lecture 9.1 - Constraints

More information

ECE 484 VLSI Digital Circuits Fall Lecture 02: Design Metrics

ECE 484 VLSI Digital Circuits Fall Lecture 02: Design Metrics ECE 484 VLSI Digital Circuits Fall 2016 Lecture 02: Design Metrics Dr. George L. Engel Adapted from slides provided by Mary Jane Irwin (PSU) [Adapted from Rabaey s Digital Integrated Circuits, 2002, J.

More information

Fast Placement Optimization of Power Supply Pads

Fast Placement Optimization of Power Supply Pads Fast Placement Optimization of Power Supply Pads Yu Zhong Martin D. F. Wong Dept. of Electrical and Computer Engineering Dept. of Electrical and Computer Engineering Univ. of Illinois at Urbana-Champaign

More information

AN EFFICIENT APPROACH TO MINIMIZE POWER AND AREA IN CARRY SELECT ADDER USING BINARY TO EXCESS ONE CONVERTER

AN EFFICIENT APPROACH TO MINIMIZE POWER AND AREA IN CARRY SELECT ADDER USING BINARY TO EXCESS ONE CONVERTER AN EFFICIENT APPROACH TO MINIMIZE POWER AND AREA IN CARRY SELECT ADDER USING BINARY TO EXCESS ONE CONVERTER K. RAMAMOORTHY 1 T. CHELLADURAI 2 V. MANIKANDAN 3 1 Department of Electronics and Communication

More information

Digital Design and System Implementation. Overview of Physical Implementations

Digital Design and System Implementation. Overview of Physical Implementations Digital Design and System Implementation Overview of Physical Implementations CMOS devices CMOS transistor circuit functional behavior Basic logic gates Transmission gates Tri-state buffers Flip-flops

More information

Delay-Locked Loop Using 4 Cell Delay Line with Extended Inverters

Delay-Locked Loop Using 4 Cell Delay Line with Extended Inverters International Journal of Electronics and Electrical Engineering Vol. 2, No. 4, December, 2014 Delay-Locked Loop Using 4 Cell Delay Line with Extended Inverters Jefferson A. Hora, Vincent Alan Heramiz,

More information

DESIGNING powerful and versatile computing systems is

DESIGNING powerful and versatile computing systems is 560 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 5, MAY 2007 Variation-Aware Adaptive Voltage Scaling System Mohamed Elgebaly, Member, IEEE, and Manoj Sachdev, Senior

More information

In this lecture, we will first examine practical digital signals. Then we will discuss the timing constraints in digital systems.

In this lecture, we will first examine practical digital signals. Then we will discuss the timing constraints in digital systems. 1 In this lecture, we will first examine practical digital signals. Then we will discuss the timing constraints in digital systems. The important concepts are related to setup and hold times of registers

More information

DESIGN FOR LOW-POWER USING MULTI-PHASE AND MULTI- FREQUENCY CLOCKING

DESIGN FOR LOW-POWER USING MULTI-PHASE AND MULTI- FREQUENCY CLOCKING 3 rd Int. Conf. CiiT, Molika, Dec.12-15, 2002 31 DESIGN FOR LOW-POWER USING MULTI-PHASE AND MULTI- FREQUENCY CLOCKING M. Stojčev, G. Jovanović Faculty of Electronic Engineering, University of Niš Beogradska

More information

EECS150 - Digital Design Lecture 2 - CMOS

EECS150 - Digital Design Lecture 2 - CMOS EECS150 - Digital Design Lecture 2 - CMOS August 29, 2002 John Wawrzynek Fall 2002 EECS150 - Lec02-CMOS Page 1 Outline Overview of Physical Implementations CMOS devices Announcements/Break CMOS transistor

More information

Low-Power CMOS VLSI Design

Low-Power CMOS VLSI Design Low-Power CMOS VLSI Design ( 范倫達 ), Ph. D. Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall, 2017 ldvan@cs.nctu.edu.tw http://www.cs.nctu.tw/~ldvan/ Outline Introduction

More information

! Sequential Logic. ! Timing Hazards. ! Dynamic Logic. ! Add state elements (registers, latches) ! Compute. " From state elements

! Sequential Logic. ! Timing Hazards. ! Dynamic Logic. ! Add state elements (registers, latches) ! Compute.  From state elements ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 19: April 2, 2019 Sequential Logic, Timing Hazards and Dynamic Logic Lecture Outline! Sequential Logic! Timing Hazards! Dynamic Logic 4 Sequential

More information

Probabilistic and Variation- Tolerant Design: Key to Continued Moore's Law. Tanay Karnik, Shekhar Borkar, Vivek De Circuit Research, Intel Labs

Probabilistic and Variation- Tolerant Design: Key to Continued Moore's Law. Tanay Karnik, Shekhar Borkar, Vivek De Circuit Research, Intel Labs Probabilistic and Variation- Tolerant Design: Key to Continued Moore's Law Tanay Karnik, Shekhar Borkar, Vivek De Circuit Research, Intel Labs 1 Outline Variations Process, supply voltage, and temperature

More information

CMOS circuits and technology limits

CMOS circuits and technology limits Section I CMOS circuits and technology limits 1 Energy efficiency limits of digital circuits based on CMOS transistors Elad Alon 1.1 Overview Over the past several decades, CMOS (complementary metal oxide

More information