Interconnect. Physical Entities

Interconnect
André DeHon <andre@cs.caltech.edu>
Thursday, June 20, 2002

Physical Entities
- Idea: computations take up space
- Bigger/smaller computations
- Size -> resources -> cost
- Size -> distance -> delay

Impact
- Consequence: properties of the physical world ultimately affect our computations
  - Delay = Distance / Speed
  - Scattering, mean free path
  - Thermodynamics (reversibility, kT, ...)

Interconnect
- Perhaps nowhere is this more present than in interconnect
  - Speed-of-light delay
  - Finite size of devices
    - Ultimate limits (Feynman's "Bottom")
    - What we can pattern and control today
    - How well we can localize phenomena (tunneling)
  - Area and geometry of wires
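As a rough illustration of Delay = Distance / Speed, the sketch below computes the minimum time a signal needs to cover a distance at the speed of light. The function name `min_delay_seconds` and the 15 mm example distance are assumptions made for illustration. Signals on real wires travel slower, so the result is only a lower bound.

```python
# Illustrative sketch: lower bound on signal delay from distance alone.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def min_delay_seconds(distance_m: float,
                      speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Delay = Distance / Speed. With the vacuum speed of light this is a
    lower bound; on-chip signals propagate more slowly."""
    if distance_m < 0 or speed_m_per_s <= 0:
        raise ValueError("distance must be >= 0 and speed must be > 0")
    return distance_m / speed_m_per_s


if __name__ == "__main__":
    # Hypothetical example: crossing a 15 mm span.
    print(f"{min_delay_seconds(0.015):.3e} s")
```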

Interconnect Today
- Wires and VLSI
- Dominance of interconnect
- Implications for physical computing systems

Physical Interconnect
- Anything that allows one physical component of the computer to communicate with another
- Wires that connect transistors or gates
- Traces on printed circuit boards that connect components
- Cables and backplanes that connect boards
- Ethernet and video cables that connect workstations, switches, and IO
- Fibers that connect our building routers

Interconnect
- Today, let's concentrate on gates and wires
- A modern component contains millions of gates (e.g. 2-input NOR gates)
- Each gate takes up finite space
- To work together, these gates need to communicate with each other
- Need wires for interconnect

Last Time
- We saw that modest-size programmable gates connected by programmable interconnect are more efficient than:
  - Tiny programmable gates
  - Large LUTs
- Even though the interconnect may take up most of the area!

Small Example
[Figure: physical layout]

Larger Example
- More typically, we have a very large number of gates that need to be connected.
[Figure: DES circuit]

Larger Example (DES), Routed
- Must find a place for all those wires.

Closeup (DES Routed)
- Wires can take up significant space.

Claim
- For sufficiently large computations (arbitrary designs, and many particular ones) with finite-size wires, the area associated with interconnect will dominate that required for gates.
- Natural consequence of physical geometry in two-dimensional space (or any finite number of dimensions)

Wires and VLSI
Simple VLSI model:
- Gates have fixed size ($A_{gate}$)
- Wires have finite spacing ($W_{wire}$)
- Have a small, finite number of wiring layers
  - E.g. one for horizontal wiring, one for vertical wiring
- Assume wires can run over gates

Visually: Wires and VLSI
[Figure: layout of gates nand2, or2, and2, inv, inv, xor2, nand2, or2, xnor2, nor2]

Important Consequence
- A set of wires crossing a line takes up space: $W = (N \times W_{wire}) / N_{layers}$
- In the figure: $W = 7\,W_{wire}$

Thompson's Argument
The minimum area of a VLSI component is bounded by the larger of:
- The area to hold all the gates: $A_{chip} \ge N \times A_{gate}$
- The area required by the wiring: $A_{chip} \ge (N_{horizontal} \times W_{wire}) \times (N_{vertical} \times W_{wire})$
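The two formulas above can be sketched in a few lines. This is a minimal illustration, not a calibrated area model: the function names `channel_width` and `thompson_area_bound` and all numeric inputs are hypothetical, and units are whatever the caller uses consistently.

```python
def channel_width(n_wires: int, wire_pitch: float, n_layers: int = 1) -> float:
    """Width taken by n_wires crossing a line: W = (N * W_wire) / N_layers."""
    if n_layers < 1:
        raise ValueError("need at least one wiring layer")
    return n_wires * wire_pitch / n_layers


def thompson_area_bound(n_gates: int, gate_area: float,
                        n_horizontal: int, n_vertical: int,
                        wire_pitch: float) -> float:
    """Lower bound on chip area: the larger of the gate area and the
    wiring area, as in Thompson's argument."""
    gate_bound = n_gates * gate_area
    wire_bound = (n_horizontal * wire_pitch) * (n_vertical * wire_pitch)
    return max(gate_bound, wire_bound)


if __name__ == "__main__":
    # Seven wires on a single layer occupy 7 wire pitches.
    print(channel_width(7, 1.0, 1))
    # Hypothetical design: 100 unit-area gates, 20 wires in each direction.
    print(thompson_area_bound(100, 1.0, 20, 20, 1.0))
```

In the second call the wiring bound (400) exceeds the gate bound (100), so the wires set the minimum area.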

How many wires?
We can get a lower bound on the total number of horizontal (vertical) wires by considering the bisection of the computational graph:
- Cut the graph of gates in half
- Minimize connections between halves
- Count number of connections in cut
- Gives a lower bound on number of wires

Bisection
[Figure: example graph with bisection width 3]
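For very small graphs, the cut-minimize-count procedure can be carried out by exhaustive search. The sketch below is an assumption-laden illustration: the function name `bisection_width` and the example graphs are hypothetical, and the search is exponential in the number of nodes, so it is not a practical partitioner.

```python
from itertools import combinations


def bisection_width(n_nodes: int, edges: list[tuple[int, int]]) -> int:
    """Minimum number of edges cut by any split of nodes 0..n_nodes-1 into
    two equal halves. Brute force: only usable for tiny graphs."""
    if n_nodes < 2 or n_nodes % 2 != 0:
        raise ValueError("need an even number of nodes, at least 2")
    best = None
    for half in combinations(range(n_nodes), n_nodes // 2):
        if 0 not in half:
            continue  # by symmetry, always keep node 0 in the left half
        left = set(half)
        # An edge is cut when its endpoints fall in different halves.
        cut = sum(1 for u, v in edges if (u in left) != (v in left))
        if best is None or cut < best:
            best = cut
    return best


if __name__ == "__main__":
    chain = [(i, i + 1) for i in range(7)]            # 8 nodes in a line
    complete4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
    print(bisection_width(8, chain))      # a chain can be cut through one edge
    print(bisection_width(4, complete4))  # every 2|2 split cuts 4 edges
```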

Next Question
In general, if we:
- Cut the design in half
- Minimize the cut wires
How many wires will be in the bisection?
[Figure: two halves of N/2 gates each, separated by a cut of size "cutsize"]

Arbitrary Graph
- Graph with N nodes
- Cut in half: N/2 gates on each side
- Worst case: every gate output on each side is used somewhere on the other side
- Cut contains N wires

Arbitrary Graph
- For a random graph, something proportional to this is likely
- That is, given a random graph with N nodes, the number of wires in the bisection is likely to be $c \times N$

Particular Computational Graphs
- Some important computations have exactly this property:
  - FFT (Fast Fourier Transform)
  - Sorting

FFT
[Figure]

FFT
- Can implement with N/2 nodes
  - Group row together
- Any bisection will cut N/2 wire bundles
  - True for any reordering

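Assembling what we know
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (N_{horizontal} \times W_{wire}) \times (N_{vertical} \times W_{wire})$
- $N_{horizontal} = c \times N$
- $N_{vertical} = c \times N$ [bound true recursively in graph]
- $A_{chip} \ge (cN \times W_{wire}) \times (cN \times W_{wire})$

Assembling
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (cN \times W_{wire}) \times (cN \times W_{wire})$
- $A_{chip} \ge (cN \times W_{wire})^2$
- $A_{chip} \ge N^2 \times c'$, where $c' = (c \times W_{wire})^2$ collects the constants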
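The two bounds can be compared numerically to see where wiring starts to set the chip area. This is a sketch under stated assumptions: the function names and the parameter values (c = 0.5, unit wire spacing, gate area 100) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def gate_area_bound(n: int, a_gate: float) -> float:
    """A_chip >= N * A_gate."""
    return n * a_gate


def wire_area_bound(n: int, c: float, w_wire: float) -> float:
    """A_chip >= (c * N * W_wire)^2 for a bisection of c * N wires."""
    return (c * n * w_wire) ** 2


def crossover_gates(a_gate: float, c: float, w_wire: float) -> float:
    """N at which the two bounds are equal: N * A_gate = (c * N * W_wire)^2,
    so N = A_gate / (c * W_wire)^2. Above this N the wire bound is larger."""
    return a_gate / (c * w_wire) ** 2


if __name__ == "__main__":
    a_gate, c, w_wire = 100.0, 0.5, 1.0   # hypothetical values
    n_star = crossover_gates(a_gate, c, w_wire)
    print("crossover N:", n_star)
    for n in (100, 400, 800, 1600):
        print(n, gate_area_bound(n, a_gate), wire_area_bound(n, c, w_wire))
```

Doubling N from 400 to 800 doubles the gate bound but quadruples the wire bound.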

Result
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge N^2 \times c'$ (with $c'$ a constant that absorbs $c$ and $W_{wire}$)
- Wire area grows faster than gate area
- Wire area grows with the square of gate area
- For sufficiently large N, wire area dominates gate area

Intuitive Version
- Consider a region of a chip
- Gate capacity in the region goes as area ($s^2$)
- Wiring capacity into the region goes as perimeter ($4s$)
- Perimeter grows more slowly than area
  - Wire capacity saturates before gate capacity
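The perimeter-versus-area intuition can be written as a ratio. This is a sketch assuming a square region of side $s$, with the proportionality constants left unspecified:

```latex
\[
\frac{\text{wiring capacity}}{\text{gate capacity}}
\;\propto\; \frac{4s}{s^{2}}
\;=\; \frac{4}{s}
\;\longrightarrow\; 0
\quad \text{as } s \to \infty
\]
```

The wiring available per gate shrinks as the region grows, which is why wire capacity runs out first.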

Result
- $A_{chip} \ge N^2 \times c'$ (with $c'$ a constant that absorbs $c$ and $W_{wire}$)
- Wire area grows with the square of gate area
- Troubling: to double the size of our computation, we must quadruple the size of our chip!

Interlude

Miles of Wire
- Consider FPGAs (Field-Programmable Gate Arrays)
- Today providing ~1 million gate capacity devices
- "What we really sell is miles of wiring." Clive McCarthy (Altera), circa 1998
- 15 mm die
- 15 mm / 0.5 µm wire spacing (450 m/layer)
- 5 layers: > 2 km

So what?
- What do we do with this observation?
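The wire-length figures follow from simple arithmetic, sketched below with the numbers from the slide (15 mm die, 0.5 µm wire spacing, 5 layers). The function names are hypothetical, and the model assumes every track runs the full width of a square die.

```python
def wire_length_per_layer_m(die_side_m: float, wire_pitch_m: float) -> float:
    """Tracks that fit across the die, times the length of each track."""
    tracks = die_side_m / wire_pitch_m      # 15 mm / 0.5 um = 30,000 tracks
    return tracks * die_side_m              # each track spans the die


def total_wire_length_m(die_side_m: float, wire_pitch_m: float,
                        n_layers: int) -> float:
    """Total wire length summed over all wiring layers."""
    return n_layers * wire_length_per_layer_m(die_side_m, wire_pitch_m)


if __name__ == "__main__":
    per_layer = wire_length_per_layer_m(15e-3, 0.5e-6)
    total = total_wire_length_m(15e-3, 0.5e-6, 5)
    print(f"{per_layer:.0f} m per layer, {total / 1000:.2f} km in total")
```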

First Observation
- Not all designs have this large of a bisection
- Architecture is about understanding structure
- What is typical?

Array Multiplier
[Figure: array multiplier]
- Bisection width: sqrt(N)

Shift Register
- Bisection width: 1
- Regardless of size

Bisection Width
- Trying to assess wiring or total area requirements on gates alone is short-sighted.
  - But most people try to do this
- Bisection width is an important, first-order property of a design.

Rent's Rule
- In the world of circuit design, an empirical relationship to capture: $IO = c N^p$, with $0 \le p \le 1$
- p characterizes interconnect richness
- Typical: $0.5 \le p \le 0.7$
- High-speed logic: p = 0.67

Empirical Characterization of Bisection
[Figure: log-log plot of IO versus N, with fit $IO = c N^p$, c = 7, p = 0.68]
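A fit of this form is a straight line on a log-log plot, so c and p can be estimated with ordinary least squares on the logarithms. The sketch below is an illustration only: `rent_io` and `fit_rent` are hypothetical names, and the sample points are synthetic, generated from the fitted values quoted above (c = 7, p = 0.68) rather than taken from any measurement.

```python
import math


def rent_io(n: float, c: float, p: float) -> float:
    """Rent's Rule: IO = c * N^p."""
    return c * n ** p


def fit_rent(samples: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of log(IO) = log(c) + p * log(N).
    samples holds (N, IO) pairs; returns (c, p)."""
    xs = [math.log(n) for n, _ in samples]
    ys = [math.log(io) for _, io in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope of the log-log line
    c = math.exp(mean_y - p * mean_x)   # intercept, mapped back from log space
    return c, p


if __name__ == "__main__":
    # Synthetic points on the curve IO = 7 * N^0.68 (not measured data).
    synthetic = [(n, rent_io(n, 7.0, 0.68)) for n in (16, 64, 256, 1024)]
    c_fit, p_fit = fit_rent(synthetic)
    print(f"c = {c_fit:.3f}, p = {p_fit:.3f}")
```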

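As a function of Bisection
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (N_{horizontal} \times W_{wire}) \times (N_{vertical} \times W_{wire})$
- $N_{horizontal} = N_{vertical} = IO = c N^p$
- $A_{chip} \ge (c N^p \times W_{wire})^2 \propto N^{2p}$
- If p < 0.5: $A_{chip} \propto N$
- If p > 0.5: $A_{chip} \propto N^{2p}$

In terms of Rent's Rule
- If p < 0.5, $A_{chip} \propto N$
- If p > 0.5, $A_{chip} \propto N^{2p}$
- Typical designs have p > 0.5: interconnect dominates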
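The case split on p can be sketched by taking the larger of the gate bound and the wiring bound. The function names and the default constants (c, gate area, and wire spacing all set to 1) are assumptions for illustration; the exponent function describes the behavior for large N only.

```python
def chip_area_bound(n: int, p: float, c: float = 1.0,
                    a_gate: float = 1.0, w_wire: float = 1.0) -> float:
    """Larger of the gate bound N * A_gate and the wiring bound
    (c * N^p * W_wire)^2, with bisection IO = c * N^p."""
    gate_bound = n * a_gate
    wire_bound = (c * n ** p * w_wire) ** 2
    return max(gate_bound, wire_bound)


def area_growth_exponent(p: float) -> float:
    """Exponent k in A_chip ~ N^k for large N: 1 if p < 0.5, else 2p."""
    return max(1.0, 2.0 * p)


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.67, 0.75):
        print(p, area_growth_exponent(p), chip_area_bound(1024, p))
```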

Programmable Machine Impact
Design of Multiprocessors, FPGAs

Impact on Programmables?
- What does this mean for our programmable devices?
  - Devices which may solve any problem?
  - E.g. multiprocessors, FPGAs
- Do we design for the worst case?
  - Put $N^2$ area into interconnect
  - And guarantee we can use all the gates?
- Or design to use the wires?
  - Wasting gates (processors) as necessary?

Interconnect: Experiment
- VLSI area model
- Mapping procedure
- Benchmark set: MCNC, 4-LUT mapped
- Parameterizable network: tree of meshes / fat tree
  - Bisection bandwidth = $C n^P$
- Details: FPGA '99

Effects of P on Area
[Figure: 1024-LUT area comparison, with panels labeled 0.25 (P=0.5), 0.37 (P=0.67), and 1.00 (P=0.75)]

Resources Area Model Area
[Figure]

Picking Network Design Point
- Must provide a reasonable level of interconnect, but don't guarantee 100% compute utilization.

Single Design
- Previous is for a set of designs
- What about a single design?
- Do we minimize the area by providing enough wires to use all the gates for that single design?

Gate Utilization predict Area?
[Figure: single design]

Consequences
- Even for a single design:
  - We do not necessarily win by maximizing gate utilization
  - We are better off focusing on using the wires efficiently
- Focus on using the most expensive resource!

Key Ideas
- Matter computes
  - Our computing machines are built out of physical phenomena
  - Physical effects ultimately determine the landscape for computations
- Interconnect requirements may dominate all other requirements
  - Compute, memory
  - Direct consequence of physical properties
- Efficient computations
  - May waste gates (compute) to use wires efficiently and minimize total area

Admin
- Project discussion: 4:30pm here
- Pitch projects, discuss ideas