Static Timing Overview with intro to FPGAs. Prof. MacDonald

Static Timing
In the 1970s, timing was checked with SPICE simulation. In the 1980s, timing was included in Verilog simulation to determine whether the design was sufficiently fast.
Two problems with either approach (dynamic timing):
1) The analysis was only as good as the simulations: a problem was found only if the simulation exercised it.
2) Logic simulations were 5-10 times slower.
Static timing is more comprehensive: calculate the delay for every possible logical path in the design. The worst-case path determines the maximum frequency.
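As a rough illustration of "calculate the delay for every path and take the worst", the sketch below walks a small timing graph and reports the longest delay. The graph, node names, delay values, and the helper names `worst_delay` and `max_frequency_mhz` are all hypothetical. A real static timing tool also models clock arrival, rise and fall delays, and process corners.

```python
from functools import lru_cache

# Hypothetical timing graph: node -> list of (next node, delay in ns).
# Names and delays are illustrative only, not taken from the slides.
GRAPH = {
    "ff_a/Q": [("g1", 1.0), ("g2", 2.0)],
    "g1": [("g3", 2.0)],
    "g2": [("g3", 0.5)],
    "g3": [("ff_b/D", 1.0)],
    "ff_b/D": [],
}


def worst_delay(graph, start):
    """Longest accumulated delay (ns) from start to any endpoint of an acyclic graph."""

    @lru_cache(maxsize=None)
    def longest(node):
        edges = graph[node]
        if not edges:
            return 0.0
        return max(delay + longest(nxt) for nxt, delay in edges)

    return longest(start)


def max_frequency_mhz(path_ns, t_launch_ns, t_setup_ns):
    """Maximum clock frequency if the worst path must fit in one period (no skew)."""
    min_period_ns = t_launch_ns + path_ns + t_setup_ns
    return 1000.0 / min_period_ns


worst = worst_delay(GRAPH, "ff_a/Q")
print("worst path delay (ns):", worst)
print("max frequency (MHz):", round(max_frequency_mhz(worst, 1.0, 1.0), 1))
```

Unlike a simulation, the walk visits every path whether or not any test vector would exercise it, which is the point of the static approach.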

Setup Timing: flop to flop
Check that the signal arrives in time for the clock. A violation can be solved by:
1) simply slowing down the clock,
2) reducing the logic delay between flops,
3) using faster flip-flops.
http://www.xilinx.com/support/sw_manuals/2_1i/download/timing.pdf

Setup requirement calculation
The setup requirement is the time that data must be valid before the capture clock edge. Calculate the required arrival time (RAT) and the actual arrival time. The actual should be before the required; how much before is your slack.
Figure: actual == RAT gives zero slack, which passes, but barely.
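The RAT-versus-actual comparison can be written as a few lines of arithmetic. This is a sketch under stated assumptions: the function name and parameters are illustrative, clock arrival times are measured at the clock pins of the launch and capture flip-flops, and all values are in ns.

```python
def setup_slack(period, launch_clk, capture_clk, t_launch, t_logic, t_setup):
    """Setup slack (ns) = required arrival time - actual arrival time.

    launch_clk / capture_clk are the clock arrival times at the two
    flip-flops (an assumption of this sketch, used to express skew).
    """
    actual = launch_clk + t_launch + t_logic      # when data reaches the capture FF
    required = period + capture_clk - t_setup     # latest time data may arrive
    return required - actual


# No skew, data arrives exactly at the required time: zero slack.
print(setup_slack(period=10, launch_clk=0, capture_clk=0,
                  t_launch=1, t_logic=8, t_setup=1))

# Capture clock arrives 2 ns earlier than the launch clock: negative slack.
print(setup_slack(period=10, launch_clk=0, capture_clk=-2,
                  t_launch=1, t_logic=7, t_setup=1))
```

Positive slack means the path passes with margin, zero passes with none, and negative slack is a setup violation.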

Hold requirement calculation
The hold time requirement is the time after the clock edge that data must remain valid. Calculate the required arrival time (RAT) and the actual arrival time. The actual should be after the required; how much after is your positive slack.
- Violations only occur with clock skew.
- Hold has nothing to do with clock period or frequency.
- The logic delay is usually only the launch delay of the launch FF.
- The worst case is a shift register with no logic between flip-flops.
Figure: launch clock, capture clock, and launched data arriving early at the capture FF, annotated with actual, RAT, logic, skew, and Th.
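A matching sketch for the hold check is shown here. The function name and parameters are hypothetical, clock arrival times are measured at the flip-flop clock pins, and all values are in ns. Note that the clock period does not appear anywhere in the calculation.

```python
def hold_slack(launch_clk, capture_clk, t_launch, t_logic, t_hold):
    """Hold slack (ns) = actual arrival time - required arrival time.

    The new data must not reach the capture FF until t_hold after the
    capture clock edge that is sampling the previous data.
    """
    actual = launch_clk + t_launch + t_logic   # earliest arrival of new data
    required = capture_clk + t_hold            # data must stay stable until here
    return actual - required


# Shift register, no logic, no skew: the launch delay alone gives margin.
print(hold_slack(launch_clk=0, capture_clk=0, t_launch=1, t_logic=0, t_hold=0))

# Same path, but the capture clock arrives 3 ns late: hold violation.
print(hold_slack(launch_clk=0, capture_clk=3, t_launch=1, t_logic=0, t_hold=0))
```

With zero skew the launch delay normally covers the hold time, which is why violations are tied to skew and to paths with little or no logic.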

Setup Timing: flop to output
Check that the signal arrives in time for the clock. A violation can be solved by:
1) simply slowing down the clock,
2) reducing the logic delay,
3) reducing external requirements.

Setup Timing: input to flop
Check that the signal arrives in time for the clock. A violation can be solved by:
1) simply slowing down the clock,
2) reducing the logic delay,
3) improving the arrival time at the input pin.

Clock skew effect on setup time
Clock skew can hurt or help setup times. Negative clock skew reduces the full period of operation and therefore hurts setup times as well as the maximum frequency. Xilinx ignores positive skew for setup calculations.

Case 1, slow path (bad for setups):
Launch = 1 ns, Logic delay = 7 ns (slow), Setup = 1 ns, Hold = 0 ns (not used), Skew = 2 ns, Period = 10 ns
Setup slack = -1 ns
Case 2, fast path (good for holds):
Launch = 1 ns, Logic delay = 0 ns (fast), Setup = 1 ns (not used), Skew = +3 ns
Hold slack = +4 ns

Clock skew effect on hold time
Hold time violations are only possible due to positive clock skew. This is a sinister problem: a fabricated chip cannot be fixed by slowing down the clock. The worst cases are paths with low logic delay, such as shift registers. Hold violations are fixed prior to fabrication by balancing the clock tree or introducing buffer delay in the logic.

Clock skew effect on hold time
I've seen this done intentionally, but it can cause problems in the next layer of logic.
Good for setups: Launch = 1 ns, Logic = 9 ns, Setup = 1 ns, Skew = -2 ns, Period = 10 ns
Setup slack = +1 ns
Bad for holds: Launch = 1 ns, Setup = 1 ns (used?), Skew = -3 ns
Hold slack = -1 ns

Cycle Stealing to help Setups
Figure: three D flip-flops in series, clock period = 10 ns.
First stage: Tlaunch = 1 ns, logic delay = 9 ns, Tsu = 1 ns.
Second stage: Tlaunch = 1 ns, logic delay = 7.0 ns, Tsu = 1 ns.
- Original: all three clocks arrive at 0 ns. The first stage fails; the second stage passes.
- After theft: the clock to the middle FF is intentionally delayed by 1.5 ns. The first stage passes; the second stage fails.
- After optimization: the clock to the middle FF is delayed by 1.0 ns. Both stages pass.
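The three cases on this slide can be checked with simple arithmetic. The sketch below uses the slide's numbers (10 ns period, 1 ns launch, 1 ns setup, 9 ns and 7 ns of logic); the function names are hypothetical, and only the middle flip-flop's clock is assumed to move.

```python
def setup_slack(period, launch_clk, capture_clk, t_launch, t_logic, t_setup):
    """Setup slack (ns): required arrival time minus actual arrival time."""
    return (period + capture_clk - t_setup) - (launch_clk + t_launch + t_logic)


def stage_slacks(middle_clk_delay, period=10.0, t_launch=1.0, t_setup=1.0,
                 logic=(9.0, 7.0)):
    """Slack of FF1->FF2 and FF2->FF3 when only the middle FF clock is delayed."""
    first = setup_slack(period, 0.0, middle_clk_delay, t_launch, logic[0], t_setup)
    second = setup_slack(period, middle_clk_delay, 0.0, t_launch, logic[1], t_setup)
    return first, second


for label, delay in (("original", 0.0), ("after theft", 1.5), ("optimized", 1.0)):
    first, second = stage_slacks(delay)
    verdicts = ["pass" if s >= 0 else "fail" for s in (first, second)]
    print(f"{label:12s} delay={delay} ns  slacks=({first}, {second})  {verdicts}")
```

Delaying the middle clock gives time to the first stage and takes the same amount from the second, so the total slack across the two stages stays fixed; 1.0 ns is the only delay here that leaves both stages non-negative.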

Fixing hold violations
Figure: two D flip-flops back to back. Clock period = 10 ns (irrelevant). Tlaunch = 1 ns, logic delay = 0 ns, Tsu = 1 ns.
- Original: one clock arrives at 2 ns, the other at -1 ns. Fails.
- After buffer: add 4 ns of buffer delay in the path, with no change in logic. Passes.
If inverters are used as the delay elements, use them in pairs. I've seen odd numbers introduced, which changes the function.
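The effect of the buffer can be sketched numerically. Several things here are assumptions rather than facts from the slide: that the launch flip-flop is the one clocked at -1 ns and the capture flip-flop the one clocked at 2 ns, that the hold time is 0 ns (the slide does not give one), and the helper names themselves.

```python
def hold_slack(launch_clk, capture_clk, t_launch, t_logic, t_hold):
    """Hold slack (ns): earliest data arrival minus required stable time."""
    return (launch_clk + t_launch + t_logic) - (capture_clk + t_hold)


def min_buffer_delay(launch_clk, capture_clk, t_launch, t_logic, t_hold):
    """Smallest extra data-path delay (ns) that brings hold slack up to zero."""
    return max(0.0, -hold_slack(launch_clk, capture_clk, t_launch, t_logic, t_hold))


# Assumed assignment: launch FF clocked at -1 ns, capture FF at 2 ns, Th = 0 ns.
before = hold_slack(-1.0, 2.0, 1.0, 0.0, 0.0)
after = hold_slack(-1.0, 2.0, 1.0, 4.0, 0.0)   # 4 ns of buffer added to the path
print("before:", before, "after:", after)
print("minimum buffer needed:", min_buffer_delay(-1.0, 2.0, 1.0, 0.0, 0.0))
```

Under these assumptions the 4 ns buffer leaves margin beyond the minimum, and the clock period never enters the calculation.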

Clock Tree Design and Synthesis
- Clock fanout: one source drives millions of flip-flops, so a buffer tree is needed to reduce fanout, and it must be balanced.
- Clock delay: the time between clock introduction and arrival at the flip-flops; important for synchronizing to other chips.
- Clock skew: the difference in arrival time between any two flip-flops.
- Clock power: the clock is the fastest and most active signal and has a huge load; it easily consumes 20-30% of power.
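To get a feel for why a tree is needed, the sketch below counts how many driving stages a balanced tree requires for a given fanout limit. The function name and the fanout limits are hypothetical, and the model ignores wire load and buffer sizing, which dominate real clock tree synthesis.

```python
def driving_stages(num_flops, max_fanout):
    """Number of driving stages (the clock source counts as the first)
    needed so that no driver has more than max_fanout loads."""
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    stages = 0
    reach = 1  # flip-flops reachable with the current number of stages
    while reach < num_flops:
        reach *= max_fanout
        stages += 1
    return stages


for fanout in (4, 10, 16):
    print(f"fanout {fanout}: {driving_stages(1_000_000, fanout)} stages")
```

Every flip-flop sits behind the same number of stages in this idealized tree, which is what keeps the arrival times, and therefore the skew, balanced.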

Example of paths
Figure: D flip-flops with Tlaunch = 1 ns and Tsu = 1 ns, connected through gates with Tp = 2 ns, 1 ns, 2 ns, and 0.5 ns.
How fast can this logic run?
Note: separate Tp values actually exist for rising and falling transitions.

Setup violation: interpreting the report
Output report from a Cadence script (part of a full synthesis script):
define_clock -name vclk -period 50 clk
external_delay -input 0 -clock vclk [find / -port ports_in/*]
external_delay -output 0 -clock vclk [find / -port ports_out/*]
report timing > timing.rpt


Xilinx Spartan FPGA Logic Architecture
- Four-input look-up table (LUT) to provide the logic function. Example: Y = A*B + C*D
- Bypassable flip-flop to select sequential or combinatorial logic.

ABCD  Y
0000  0
0001  0
0010  0
0011  1
0100  0
0101  0
0110  0
0111  1
1000  0
1001  0
1010  0
1011  1
1100  1
1101  1
1110  1
1111  1
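A four-input LUT is just a 16-entry memory addressed by the inputs. The sketch below builds the table for the slide's example Y = A*B + C*D and reads it back; the function names are hypothetical, and the bit order (A as the most significant address bit) is chosen to match the ABCD column order of the truth table.

```python
def build_lut(func):
    """Return the 16-entry truth table of a 4-input function.

    The table index is the 4-bit number ABCD, with A as the most significant bit.
    """
    table = []
    for index in range(16):
        a = (index >> 3) & 1
        b = (index >> 2) & 1
        c = (index >> 1) & 1
        d = index & 1
        table.append(func(a, b, c, d))
    return table


def lut_read(table, a, b, c, d):
    """Evaluate the logic function by a memory lookup, as the LUT does."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


LUT = build_lut(lambda a, b, c, d: (a & b) | (c & d))

for index, y in enumerate(LUT):
    print(f"{index:04b}  {y}")
```

Programming a different function only changes the 16 stored bits; the lookup hardware and its delay stay the same.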

These two lines of the ucf file indicate how fast we need the design to operate. In this example, the period of clk is 10 ns, so we are targeting 100 MHz.

After synthesis, the static timing report identifies the one logical path that is the slowest. This dictates the maximum frequency of operation. In this example the max frequency is 165 MHz, which exceeds the requirement in the ucf file.

Example of failing timing: the clock is set to 1 GHz, but the design can run at only ~160 MHz. Negative slack is bad.
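The link between the constrained period, the reported maximum frequency, and the worst slack can be sketched as follows. The function names are hypothetical; the frequencies are the ones quoted in the two report examples, and the 160 MHz figure is approximate, so the resulting slack is too.

```python
def min_period_ns(fmax_mhz):
    """Shortest clock period (ns) the slowest path allows."""
    return 1000.0 / fmax_mhz


def worst_slack_ns(target_period_ns, fmax_mhz):
    """Slack of the slowest path against the constrained period."""
    return target_period_ns - min_period_ns(fmax_mhz)


# 100 MHz constraint (10 ns), design achieves 165 MHz: positive slack.
print(round(worst_slack_ns(10.0, 165.0), 3))

# 1 GHz constraint (1 ns), design achieves about 160 MHz: negative slack.
print(round(worst_slack_ns(1.0, 160.0), 3))
```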

Multi-cycle Paths
Figure: data_in[31:0] is registered and feeds a large, slow operation whose result is registered as data_out[31:0]. start_calc passes through a chain of flip-flops to produce valid.
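When a result is only sampled several cycles after the operation starts, the setup check can be relaxed to that many clock periods. The sketch below shows the arithmetic; the function name, the 25 ns logic delay, and the choice of three cycles are all hypothetical and are not taken from the slide, and skew is ignored.

```python
def multicycle_setup_slack(period, cycles, t_launch, t_logic, t_setup):
    """Setup slack (ns) when the capture edge is `cycles` periods after launch."""
    required = cycles * period - t_setup
    actual = t_launch + t_logic
    return required - actual


# A 25 ns operation cannot meet a single 10 ns cycle...
print(multicycle_setup_slack(period=10, cycles=1, t_launch=1, t_logic=25, t_setup=1))

# ...but passes if the result is only used three cycles after it starts.
print(multicycle_setup_slack(period=10, cycles=3, t_launch=1, t_logic=25, t_setup=1))
```

The relaxation is only safe if the surrounding control logic, such as a valid signal, guarantees the result is not used before the extra cycles have elapsed.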

False Paths
Many logic paths will never be exercised for a given functional application:
- Test pins and structures
- Mode pins that are used but never change
- Paths between two cores that never communicate
- Paths between two asynchronous clock domains
Constraint examples:
TIMESPEC "TS_false" = FROM "clockdomain1" TO "clockdomain2" TIG;
TIMESPEC tsid = FROM source_group TO destination_group time [unit]
NET net_name TIG

Histograms for Slack