Digital Integrated Circuit Design


Digital Integrated Circuit Design
Lecture 13: Building Blocks (Multipliers)
Adib Abrishamifar, EE Department, IUST

Acknowledgement
These lecture notes were summarized and compiled from lecture notes on Introduction to VLSI Design and VLSI Circuit Design from around the world. I can't remember where each slide came from. However, I'd like to thank all the professors who created such good work in those lecture notes. Without those lectures, these slides could not have been finished.

Contents
- Outline
- Introduction
- Signed Integers
- Review: Multiplication
- Multipliers
- Sequential (Serial) Multiplier
- Array Multipliers
- Combinational Multiplier
- Booth Multiplier
- Pipelined Multiplier
- Wallace Tree Multiplier
- Carry Save Multiplier
- Serial-Parallel Multiplier
- Summary

Outline: Building Blocks for Digital Architectures
- Arithmetic unit: bit-sliced datapath (adders, multipliers, shifters, comparators, etc.)
- Memory: RAM, ROM, buffers, shift registers
- Control: finite state machine (PLA, random logic), counters
- Interconnect: switches, arbiters, bus


Introduction
- Multipliers are used in many DSP applications:
  - Vector product, matrix multiplication
  - Convolution
  - Filtering (tap filters, FIR, ...)
- At least one good reason for studying multiplication and division is that there is an infinite number of ways of performing these operations.


Signed Integers: What Not to Do
- Use a fixed-length binary representation.
- Use the left-most bit (called the most significant bit, or MSB) for the sign: 0 for positive, 1 for negative.
- Example:
  +18_ten = 00010010_two
  -18_ten = 10010010_two

Signed Integers: Why Not to Use a Sign Bit
- Sign and magnitude bits must be treated differently in arithmetic operations.
- Addition and subtraction require different logic circuits.
- Overflow is difficult to detect.
- Zero has two representations:
  +0_ten = 00000000_two
  -0_ten = 10000000_two
- Sign-magnitude representation is not used for integers in modern computers.

Signed Integers: Integers With Sign, Other Ways
- Use a fixed-length representation, but no sign bit.
- 1's complement: to form a negative number, complement each bit of the given number.
- 2's complement: to form a negative number, start with the given number, subtract one, and then complement each bit; or first complement each bit, and then add 1.
- 2's complement is the preferred representation.

Signed Integers: 2's Complement
- Why not 1's complement? We don't like having two zeros.
- Add 1 to the 1's-complement representation.
- Some properties:
  - Only one representation for 0.
  - Exactly as many non-negative numbers (zero included) as negative numbers.
  - Slight asymmetry: there is one negative number with no positive counterpart.

Signed Integers: Three Systems
[Figure: number wheels for 4-bit sign-magnitude integers, 1's complement integers, and 2's complement integers. The pattern 1010 reads as -2 in sign-magnitude, -5 in 1's complement, and -6 in 2's complement.]

Signed Integers: Three Representations

Bits   Sign-magnitude   1's complement   2's complement (preferred)
000    +0               +0               +0
001    +1               +1               +1
010    +2               +2               +2
011    +3               +3               +3
100    -0               -3               -4
101    -1               -2               -3
110    -2               -1               -2
111    -3               -0               -1

Signed Integers: 2's Complement n-bit Numbers
- Range: $-2^{n-1}$ through $2^{n-1} - 1$
- Unique zero: 00000000...0
- Expansion of bit length: stretch the left-most bit all the way to the left (sign extension), e.g., 11111101 is still -3.
- Overflow rule: if two numbers with the same sign bit (both positive or both negative) are added, overflow occurs if and only if the result has the opposite sign.
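The sign-extension and overflow rules above can be sketched in a few lines of Python. This is only an illustration; the function names `sign_extend` and `add_with_overflow` are hypothetical and not part of the lecture.

```python
def sign_extend(pattern, n, m):
    """Widen an n-bit two's complement pattern to m bits by copying the sign bit."""
    pattern &= (1 << n) - 1
    if (pattern >> (n - 1)) & 1:
        # Fill bit positions n .. m-1 with copies of the sign bit.
        pattern |= ((1 << m) - 1) ^ ((1 << n) - 1)
    return pattern


def add_with_overflow(a, b, n):
    """Add two n-bit patterns; return (n-bit result, overflow flag)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    result = (a + b) & mask

    def sign(v):
        return (v >> (n - 1)) & 1

    # Overflow iff both operands share a sign and the result's sign differs.
    overflow = sign(a) == sign(b) and sign(result) != sign(a)
    return result, overflow


print(format(sign_extend(0b1101, 4, 8), "08b"))    # 4-bit -3 widened to 8 bits
print(add_with_overflow(0b01111111, 0b00000001, 8))  # 127 + 1 in 8 bits
```

Operands with different sign bits can never overflow, which is why the flag only needs to be checked when the two sign bits agree.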

Signed Integers: 2's-Complement to Decimal Conversion

$a_{n-1} a_{n-2} \ldots a_1 a_0 = -2^{n-1} a_{n-1} + \sum_{i=0}^{n-2} 2^i a_i$

8-bit conversion box:

Weight   -128   64   32   16    8    4    2    1
Bit         1    1    1    1    1    1    0    1

Example: 11111101 = -128 + 64 + 32 + 16 + 8 + 4 + 1 = -128 + 125 = -3
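As a small sketch of the conversion rule, the function below weights the most significant bit by -2^(n-1) and every other bit by its usual positive power of two. The name `twos_complement_value` is an assumption made for this example.

```python
def twos_complement_value(bits):
    """Value of a two's complement bit string written MSB first, e.g. '11111101'."""
    n = len(bits)
    digits = [int(b) for b in bits]
    # The MSB carries weight -2^(n-1).
    value = -(1 << (n - 1)) * digits[0]
    # Bit i (counted from the LSB) carries weight +2^i.
    for i in range(n - 1):
        value += (1 << i) * digits[n - 1 - i]
    return value


print(twos_complement_value("11111101"))
print(twos_complement_value("1010"))
```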


Review - Multiplication
- The basic algorithm is analogous to decimal multiplication:
  - Break the multiplier into digits.
  - Multiply one digit at a time; shift the multiplicand to form partial products.
  - Create the product as the sum of the partial products.

  Multiplicand           0110   (6)
  Multiplier           x 0011   (3)
                       ------
  Partial products       0110
                        0110
                       0000
                      0000
                      -------
  Product            00010010   (18)

- n-bit multiplicand x m-bit multiplier = (n+m)-bit product
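A minimal sketch of this pencil-and-paper procedure for unsigned operands follows. The helper names are hypothetical; each multiplier bit selects either a shifted copy of the multiplicand or zero.

```python
def partial_products(multiplicand, multiplier, m):
    """One partial product per multiplier bit (LSB first), for unsigned operands."""
    return [(multiplicand << i) if (multiplier >> i) & 1 else 0 for i in range(m)]


def multiply_unsigned(multiplicand, multiplier, m):
    """Product as the sum of the shifted partial products."""
    return sum(partial_products(multiplicand, multiplier, m))


pps = partial_products(0b0110, 0b0011, 4)
print([format(p, "08b") for p in pps])
print(format(multiply_unsigned(0b0110, 0b0011, 4), "08b"))
```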

Review - Multiplication: 2's Complement
- Multiplier positive, multiplicand positive or negative: sign-extend the partial products when adding them up.
- Examples:

     0101   (+5)              1011   (-5)
   x 0011   (+3)            x 0011   (+3)
   ------                   ------
     0101                 1111011
    0101                  111011
   0000                   0000
  0000                   0000
  -------                -------
  0001111   (+15)        1110001   (-15)

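Review - Multiplication: 2's Complement (cont.)
- Multiplier negative, multiplicand positive or negative: convert the negative multiplier to positive, do the multiplication, then negate the result.
- Example: 1011 (-5) x 1101 (-3)
  - Negate the multiplier: 1011 (-5) x 0011 (+3)
  - Multiply with sign-extended partial products: 1111011 + 111011 (shifted left by one) = 1110001 (-15)
  - Negate the result: 0001111 (+15)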

Review - Multiplication: Example
0010_two × 0011_two = 0110_two, i.e., 2_ten × 3_ten = 6_ten

Iteration  Step                         Multiplicand  Product
0          Initial values               0010          0000 0011
1          LSB=1 => Prod=Prod+Mcand     0010          0010 0011
           Right shift product          0010          0001 0001
2          LSB=1 => Prod=Prod+Mcand     0010          0011 0001
           Right shift product          0010          0001 1000
3          LSB=0 => no operation        0010          0001 1000
           Right shift product          0010          0000 1100
4          LSB=0 => no operation        0010          0000 1100
           Right shift product          0010          0000 0110

Review - Multiplication: Example
1010_two × 0011_two = 101110_two, i.e., -6_ten × 3_ten = -18_ten

Iteration  Step                         Multiplicand  Product
0          Initial values               11010         00000 0011
1          LSB=1 => Prod=Prod+Mcand     11010         11010 0011
           Right shift product          11010         11101 0001
2          LSB=1 => Prod=Prod+Mcand     11010         10111 0001
           Right shift product          11010         11011 1000
3          LSB=0 => no operation        11010         11011 1000
           Right shift product          11010         11101 1100
4          LSB=0 => no operation        11010         11101 1100
           Right shift product          11010         11110 1110

Review - Multiplication: Example
1010_two × 1011_two = 011110_two, i.e., -6_ten × (-5_ten) = 30_ten

Iteration  Step                         Multiplicand  Product
0          Initial values               11010         00000 1011
1          LSB=1 => Prod=Prod+Mcand     11010         11010 1011
           Right shift product          11010         11101 0101
2          LSB=1 => Prod=Prod+Mcand     11010         10111 0101
           Right shift product          11010         11011 1010
3          LSB=0 => no operation        11010         11011 1010
           Right shift product          11010         11101 1101
4          LSB=1 => Prod=Prod-Mcand*    00110         00011 1101
           Right shift product          11010         00001 1110

*Last iteration with a negative multiplier in 2's complement.
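The three traces above follow one loop: test the LSB of the product register, add (or, on the last iteration, subtract) the multiplicand into the upper half, then shift the whole register right arithmetically. The sketch below models that loop under the assumption, taken from the tables, that the upper half is n+1 bits wide and the multiplicand is sign-extended by one bit. Function names are hypothetical.

```python
def shift_add_multiply(mcand, mplier, n):
    """Multiply two n-bit two's complement patterns; returns a (2n+1)-bit pattern."""
    hi_mask = (1 << (n + 1)) - 1
    m = mcand & ((1 << n) - 1)
    if (m >> (n - 1)) & 1:
        m |= 1 << n                      # sign-extend multiplicand to n+1 bits
    hi, lo = 0, mplier & ((1 << n) - 1)  # lower half starts as the multiplier
    for step in range(n):
        if lo & 1:
            if step == n - 1:
                # The multiplier's sign bit has negative weight: subtract.
                hi = (hi - m) & hi_mask
            else:
                hi = (hi + m) & hi_mask
        # Arithmetic right shift of the combined (hi, lo) register.
        lo = (lo >> 1) | ((hi & 1) << (n - 1))
        hi = (hi >> 1) | (hi & (1 << n))
    return (hi << n) | lo


def signed(pattern, bits):
    """Interpret a bit pattern of the given width as two's complement."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


for a, b in [(0b0010, 0b0011), (0b1010, 0b0011), (0b1010, 0b1011)]:
    p = shift_add_multiply(a, b, 4)
    print(format(p, "09b"), signed(p, 9))
```

The subtraction on the last step is only triggered when the multiplier's sign bit is 1, so positive multipliers go through the same loop unchanged.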


Multipliers
- There are many different circuits for multiplication.
- Each one strikes a different balance between speed (performance) and amount of logic (cost).


Sequential Multiplier
- Compute the sums as a sequence of separate steps.
- Main benefit: only requires one 2:1 (two-input, one-output) adder.
  - Much lower hardware cost for large n.
- Use a register to store the partial products.
- Use two registers to store the multiplier and multiplicand.
- Requires a state machine (with the corresponding control logic) to control the sequence of additions.

Sequential Multiplier
- Shift register:
  - Originally holds the multiplicand.
  - Shifts it left for each partial product.
- One bit of the multiplier at a time is presented to the AND gates.
[Figure: a 2N-bit shift register, initialized with the multiplicand and shifted left, feeds an adder whose result is stored in a register initialized to 0; one bit of the multiplier is applied each cycle.]

Sequential Multiplier: Resource Requirements
- Adder: 2N bits
- Registers: 2N bits wide
- A state machine
[Figure: register, adder, and shift register.]

Sequential Multiplier
- Better design: shift the result register to the right.
  - Uses N AND gates.
  - Uses an N-bit adder.
[Figure: register, adder, and shift register.]


Array Multipliers: Adding Partial Products

                    y3   y2   y1   y0      multiplicand
                    x3   x2   x1   x0      multiplier
                    x0y3 x0y2 x0y1 x0y0
               x1y3 x1y2 x1y1 x1y0         four partial products
          x2y3 x2y2 x2y1 x2y0              to be summed
     x3y3 x3y2 x3y1 x3y0
p7   p6   p5   p4   p3   p2   p1   p0

- A carry propagates within each row addition.
- Requires three 4-bit additions. Slow.

Array Multipliers: Carry Forward

                    y3   y2   y1   y0      multiplicand
                    x3   x2   x1   x0      multiplier
                    x0y3 x0y2 x0y1 x0y0
               x1y3 x1y2 x1y1 x1y0         four partial products
          x2y3 x2y2 x2y1 x2y0              to be summed
     x3y3 x3y2 x3y1 x3y0
p7   p6   p5   p4   p3   p2   p1   p0

Note: the carry is added to the next partial product (carry-save addition). Adding the carry from the final stage needs an extra (ripple-carry) stage. These additions are faster, but we need four stages.

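Array Multipliers: Structure
[Figure: array multiplier built from full-adder (FA) cells. Each cell has inputs pp_k, x_i, y_j, and carry-in c_i, and outputs pp_{k+1} and carry-out c_o. Inputs x0..x3 and y0..y3, unused inputs tied to 0, outputs p7..p0; the critical path is marked.]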


Combinational Multiplier
- Also referred to as a parallel multiplier.
- Implements multiplication using a 2-dimensional array of 1-bit full adders.
  - Basic structure is the same as the addition array.
- Can also be implemented using a linear array of CSAs.
  - Each partial product is $P_i = 2^i a_i B$.
  - At the first CSA level, add three partial products.
  - At all subsequent CSA levels, add one more partial product.
- A CLA (or other type of 2:1, i.e., 2-input/1-output, adder) is required after the final CSA level.

Combinational Multiplier: Idea
- Use an array of AND gates to generate the partial products in parallel.
[Figure: example multiplier and multiplicand bits feeding an array of AND gates, with the LSBs marked.]

Combinational Multiplier: Adding Partial Products
[Figure: 4x4 array multiplier. The bits X3..X0 are ANDed with Y0, Y1, Y2, and Y3 in turn. Three rows of adders (HA FA FA HA, then FA FA FA HA, then FA FA FA HA) sum the partial products and produce Z0 through Z7.]

Combinational Multiplier: Critical Path
- There are many critical paths with the same delay (AND gates not shown).
- For an MxN multiplier:

  $\text{Delay} = (M+N-2)\,t_{carry} + (N-1)\,t_{sum} + t_{AND}$

[Figure: MxN array of HA and FA cells with Critical Path 1 and Critical Path 2 marked.]
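To get a feel for how this expression scales, the sketch below simply evaluates Delay = (M+N-2) t_carry + (N-1) t_sum + t_AND as stated on this slide. The function name and the unit gate delays are assumptions chosen for illustration, not measured values.

```python
def array_multiplier_delay(m, n, t_carry, t_sum, t_and):
    """Critical-path delay of an M x N array multiplier, per the slide's expression."""
    return (m + n - 2) * t_carry + (n - 1) * t_sum + t_and


# Hypothetical unit delays, only to show the linear growth with operand width.
for width in (4, 8, 16, 32):
    print(width, array_multiplier_delay(width, width, t_carry=1, t_sum=1, t_and=1))
```

With equal unit delays the result grows linearly in the operand width, which is the motivation for the tree structures discussed later in the lecture.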

Combinational Multiplier: MxN Critical Paths
[Figure: array of HA and FA cells with Critical Path 1, Critical Path 2, and the segment shared by both paths marked.]

$t_{mult} = [(M-1) + (N-2)]\,t_{carry} + (N-1)\,t_{sum} + (N-1)\,t_{and}$

Combinational Multiplier
- Better floorplan for a compact layout:
  - Send partial products diagonally.
  - Results in better area.
- AND gates, and hence the first row, are not shown.
[Figure: rows of adder cells (HA FA FA HA, then FA FA FA HA, then FA FA FA HA).]


Booth Multiplier
- Originally proposed to reduce the number of addition steps.
- Bonus: works for two's complement numbers.
- Encoding scheme to reduce the number of stages in multiplication.
- Performs two bits of multiplication at once, so it requires half the stages.
- Each stage is slightly more complex than in a simple multiplier, but an adder/subtracter is almost as small and fast as an adder.

Booth Multiplier
- There are multiple ways to create a product.
- Example: multiply 2_ten by 6_ten (0010_two × 0110_two)
  - Product = (2 × 2) + (2 × 4), or
  - Product = (2 × -2) + (2 × 8)
- Idea:
  - Recode each 1 in the multiplier as +2 - 1.
  - Converts a run of 1s into 1 0 ... 0 (-1).
  - Might reduce the number of 1s.

Booth Multiplier: Example

Original multiplier:     0   0   1   1   1   1   1   1   0   0
Each 1 recoded as +2 - 1 (a +1 one position to the left and a -1 in place), for all six 1s
Recoded multiplier:      0   1   0   0   0   0   0  -1   0   0

Booth Multiplier: Example

      0  0  1  1  0     multiplicand (6)
   x  0  1  1  1  0     multiplier (14)
     +1  0  0 -1  0     recoded multiplier

        00000           0 x 6
    11111010            -1 x 6 = (-6), with sign extension
      00000             0 x 6
     00000              0 x 6
    00110               +1 x 6
    ---------
    001010100           product (84)

Booth Multiplier: Booth Encoding
- Two's-complement form of the multiplier:
  $y = -2^n y_n + 2^{n-1} y_{n-1} + 2^{n-2} y_{n-2} + \dots$
- Rewrite using $2^a = 2^{a+1} - 2^a$:
  $y = -2^n (y_n - y_{n-1}) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots$
- Consider the first two terms: by looking at three bits of y, we can determine whether to add or subtract x or 2x (or nothing) to the partial product.

Booth Multiplier: Booth Actions

y_i  y_{i-1}  y_{i-2}   increment
0    0        0         0
0    0        1         x
0    1        0         x
0    1        1         2x
1    0        0         -2x
1    0        1         -x
1    1        0         -x
1    1        1         0
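Each row of the action table equals (-2·y_i + y_{i-1} + y_{i-2})·x, so the recoding can be sketched directly from that expression. The code below is an illustrative model with hypothetical names, assuming an even bit width n and an implicit 0 to the right of the LSB; note that the loop index names the middle bit of each triplet rather than the top bit as in the table.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's complement multiplier (n even) into digits in {-2..2}.

    Digits are returned least significant first; digit k has weight 4**k.
    """
    y &= (1 << n) - 1
    digits = []
    prev = 0                          # implicit bit to the right of the LSB
    for i in range(0, n, 2):
        mid = (y >> i) & 1            # middle bit of the triplet
        top = (y >> (i + 1)) & 1      # top bit of the triplet
        digits.append(-2 * top + mid + prev)
        prev = top
    return digits


def booth_radix4_multiply(x, y, n):
    """Product of signed integer x and the n-bit two's complement value of y."""
    return sum(d * x * 4**k for k, d in enumerate(booth_radix4_digits(y, n)))


print(booth_radix4_digits(0b101110, 6))        # multiplier -18 from the lecture's example
print(booth_radix4_multiply(25, 0b101110, 6))
```

Only n/2 digits come out, which is where the "half the stages" claim comes from.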

Booth Multiplier: Booth Example
- x = 011001 (25_10), y = 101110 (-18_10)
- y_1 y_0 y_{-1} = 100: P_1 = P_0 - (10 × 011001) = 11111001110
- y_3 y_2 y_1 = 111: P_2 = P_1 + 0 = 11111001110
- y_5 y_4 y_3 = 101: P_3 = P_2 - 0110010000 = 11000111110

Booth Multiplier
- Question: how do we know when to subtract, and when to add?
- Answer: look for runs of 1s in the multiplier.
  - Example: 001110011
  - Working from right to left, any run of 1s is equal to:
    -(value of the first digit that's one) + (value of the first digit that's zero)
- Example: 001110011
  - First run: -1 + 4 = 3
  - Second run: -16 + 128 = 112
  - Total: 112 + 3 = 115
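The run rule can be checked mechanically. The sketch below scans a non-negative integer from right to left and emits one (subtract, add) pair of weights per run of 1s; the function name is hypothetical and the code is only meant to mirror the worked example.

```python
def runs_of_ones_terms(y):
    """Return one (negative weight, positive weight) pair per run of 1s in y >= 0."""
    terms = []
    i = 0
    while y >> i:
        if (y >> i) & 1:
            start = i                     # first digit of the run that is one
            while (y >> i) & 1:
                i += 1                    # stops at the first zero past the run
            terms.append((-(1 << start), 1 << i))
        else:
            i += 1
    return terms


terms = runs_of_ones_terms(0b001110011)
print(terms)
print(sum(neg + pos for neg, pos in terms))
```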

Booth Multiplier
- Scan the multiplier bits from right to left.
- Recognize the beginning and end of a run by looking at only 2 bits at a time:
  - Current bit a_i
  - Bit to the right of the current bit, a_{i-1}
[Figure: the multiplier 0 1 1 0 0 1 1 1 0 0 with the end, middle, and beginning of a run marked.]

Bit a_i   Bit a_{i-1}   Explanation
1         0             Beginning of a run of 1s
1         1             Middle of a run of 1s
0         1             End of a run of 1s
0         0             Middle of a run of 0s

Booth Multiplier
- Key idea: test 2 bits of the multiplier at once.
  - 10: subtract (beginning of a run of 1s)
  - 01: add (end of a run of 1s)
  - 00, 11: do nothing (middle of a run of 0s or 1s)
[Figure: datapath with a 32-bit multiplicand register, a 32-bit ALU (ADD/SUB), and a 64-bit shifting product register split into LHPROD (32 bits) and MP/RHPROD (32 bits); a control block examines bits 1:0 and drives the write signal.]
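A behavioural sketch of this two-bit test is given below. It ignores the register-level datapath in the figure and works on Python integers; the function name and the choice of passing the multiplier as an n-bit pattern are assumptions made for the example.

```python
def booth_radix2_multiply(mcand, mplier, n):
    """Multiply signed integer mcand by the n-bit two's complement pattern mplier."""
    y = mplier & ((1 << n) - 1)
    product = 0
    prev = 0                           # implicit bit to the right of the LSB
    for i in range(n):
        cur = (y >> i) & 1
        if (cur, prev) == (1, 0):      # beginning of a run of 1s: subtract
            product -= mcand << i
        elif (cur, prev) == (0, 1):    # end of a run of 1s: add
            product += mcand << i
        # 00 and 11: middle of a run, do nothing
        prev = cur
    return product


print(booth_radix2_multiply(6, 0b01110, 5))    # 6 x 14
print(booth_radix2_multiply(-6, 0b11011, 5))   # -6 x -5
```

A run that reaches the sign bit is never closed by an add, which is exactly what gives the sign bit its negative weight.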

Booth Multiplier: Booth Structure
[Figure: Booth multiplier structure.]

Booth Multiplier: Advantages and Disadvantages
- Depends on the architecture.
- Potential advantage: might reduce the number of 1s in the multiplier.
- In the multipliers that we have seen so far:
  - Doesn't save in speed (we still have to wait for the critical path, e.g., the shift-add delay in a sequential multiplier).
  - Increases area: recoding circuitry and subtraction.


Pipelined Multipliers
- Insert registers (latches) between rows.
- Insert registers for the bits of the multiplier.
  - Schedule the MSB bits to arrive later.
[Figure: rows of adder cells (HA FA FA HA, then FA FA FA HA, then FA FA FA HA) with registers between rows.]

Pipelined Multipliers: Example
[Figure: pipelined array multiplier with inputs a4..a0 and x0..x4 and outputs p9..p0. Each cell is an FA with an AND gate and latches (for a_i, the intermediate sum, and the carry); latches sit along the sum/carry path.]


Wallace Tree Multiplier
- Idea: divide and conquer.
- Why add the k numbers one by one?
- A tree structure gives logarithmic depth.

Wallace Tree Multiplier: Example
- Delay = 4 CSA + 1 CLA

Wallace Tree Multiplier: For 7 k-bit Numbers
[Figure: Wallace tree that adds seven k-bit operands using k-bit CSAs followed by a k-bit CPA. The intermediate vectors are labelled with their bit ranges, such as [0,k-1], [1,k], and [2,k+1]; the result spans bits [0] through [k+2].]

Wallace Tree Multiplier
- At each step, the number of operands reduces to 2/3 of its previous value.
[Figure: h levels of CSAs. The tree starts with n k-bit numbers, then has (2/3)n numbers after the first level, (2/3)^2 n after the second, and so on until (2/3)^h n = 2.]

Wallace Tree Multiplier
- Delay depends on the height h.
- $h = O(\log n)$: logarithmic delay.
- Maximum number N of k-bit numbers that can be added using a Wallace tree of height h:

  h    N   |   h    N   |   h     N
  0    2   |   7    28  |   14    474
  1    3   |   8    42  |   15    711
  2    4   |   9    63  |   16    1066
  3    6   |   10   94  |   17    1599
  4    9   |   11   141 |   18    2398
  5    13  |   12   211 |   19    3597
  6    19  |   13   316 |   20    5395
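The table follows a simple recurrence: a tree of height 0 handles 2 operands, and each extra CSA level multiplies the capacity by 3/2, rounded down. The sketch below reproduces the table and also computes the height needed for a given operand count; both function names are hypothetical.

```python
def max_operands(h):
    """Largest number of operands a Wallace tree of height h can reduce to two."""
    n = 2
    for _ in range(h):
        n = (3 * n) // 2
    return n


def wallace_height(n):
    """Number of CSA levels needed to reduce n operands to two."""
    h = 0
    while n > 2:
        n -= n // 3        # every full group of 3 operands becomes 2
        h += 1
    return h


print([max_operands(h) for h in range(21)])
print(wallace_height(7))
```

For seven operands the height is 4, which agrees with the "4 CSA + 1 CLA" delay quoted in the example.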

Wallace Tree Multiplier
- Reduces the depth of the adder chain.
- Built from carry-save adders:
  - Three inputs a, b, c
  - Produces two outputs y, z such that y + z = a + b + c
- Carry-save equations:
  - $y_i = \mathrm{parity}(a_i, b_i, c_i)$
  - $z_i = \mathrm{majority}(a_i, b_i, c_i)$
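A bitwise sketch of one carry-save adder is shown below. It assumes non-negative integers as operands, and it makes explicit one detail the equations leave implicit: the majority bits are carries, so they are weighted one position higher when y + z is formed. The function name is hypothetical.

```python
def carry_save_add(a, b, c):
    """Reduce three operands to a (sum, carry) pair with sum + carry == a + b + c."""
    s = a ^ b ^ c                                 # per-bit parity
    carry = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, shifted left by one
    return s, carry


s, c = carry_save_add(0b0011, 0b0010, 0b0011)
print(format(s, "04b"), format(c, "04b"), s + c)
```

No carry ripples inside the function: every output bit depends on only three input bits, which is why a CSA level has constant delay regardless of word length.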

Wallace Tree Multiplier
[Figure: dot diagrams over bit positions 6 to 0 showing (a) the partial products, (b) the first stage, (c) the second stage, and (d) the final adder; the groupings handled by FAs and HAs are marked.]

Wallace Tree Multiplier
[Figure: 4x4 Wallace tree multiplier. The partial products x_i y_j (i, j = 0..3) are reduced by a first stage of HAs and a second stage of FAs; a final adder produces z7..z0.]

Wallace Tree Multiplier
- At each stage, i numbers are combined to form ceil(2i/3) sums.
- The final adder completes the summation.
- Wiring is more complex.
- Can build a Booth-encoded Wallace tree multiplier.
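The stage-by-stage reduction can be sketched as a loop that groups the current operands in threes, passes each group through a carry-save adder, and forwards any leftovers unchanged. This is a behavioural model for non-negative integers with hypothetical function names, not a gate-level description; the final `sum` of the last two operands stands in for the carry-propagate adder.

```python
def csa(a, b, c):
    """Carry-save adder: three operands in, (sum, carry) out."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def wallace_sum(operands):
    """Add non-negative integers with CSA levels; returns (total, number of levels)."""
    ops = list(operands)
    levels = 0
    while len(ops) > 2:
        full = len(ops) - len(ops) % 3
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(ops[i], ops[i + 1], ops[i + 2]))
        nxt.extend(ops[full:])        # leftovers skip this level
        ops = nxt                     # i operands become ceil(2i/3)
        levels += 1
    return sum(ops), levels           # final carry-propagate addition


print(wallace_sum([1, 2, 3, 4, 5, 6, 7]))
```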


Carry Save Multiplier
- Speeding up multiplication is a matter of speeding up the summing of the partial products.
- Carry-save addition can help.
- Carry-save addition passes (saves) the carries to the output, rather than propagating them.
- In general, carry-save addition takes in 3 numbers and produces 2, whereas carry-propagate addition takes 2 and produces 1.
- With this technique, we can avoid carry propagation until the final addition.

Carry Save Multiplier
- Sum three numbers: 3_10 = 0011, 2_10 = 0010, 3_10 = 0011
  1. Carry-save add 0011 (3) and 0010 (2): c = 0100 (4), s = 0001 (1)
  2. Carry-save add c, s, and 0011 (3): c = 0010 (2), s = 0110 (6)
  3. Carry-propagate add c and s: 1000 (8)

Carry Save Multiplier: Carry Save Adder
[Figure: array of HA and FA cells followed by a vector merging adder.]

$t_{mult} = (N-1)\,t_{carry} + (N-1)\,t_{and} + t_{merge}$

Carry Save Multiplier: Floorplan
[Figure: carry-save multiplier floorplan with inputs X3..X0 and Y0..Y3. Rows of HA multiplier cells and FA multiplier cells pass carry (C) and sum (S) signals downward; a row of vector merging cells sits at the bottom; outputs Z0..Z7.]
- X and Y signals are broadcast through the complete array.


Serial-Parallel Multiplier
- Used in serial-arithmetic operations.
- The multiplicand can be held in place by a register.
- The multiplier is shifted into the array.

Serial-Parallel Multiplier: Structure
[Figure: serial-parallel multiplier structure.]


Summary
- Goals are different than for addition.
- In some structures, sum and carry delays are equal.
- Analysis is more difficult: multiple critical paths.
- Different levels of optimization:
  - Data encoding (Booth)
  - Architecture-level: Wallace tree
  - Gate-level: pipelining
  - Transistor-level: equal sum and carry delays
- More to cover:
  - Constant multiplication
  - Floating point, precision