A New High Speed Low Power Performance of 8- Bit Parallel Multiplier-Accumulator Using Modified Radix-2 Booth Encoded Algorithm


V. Sandeep Kumar
Assistant Professor, Indur Institute of Engineering & Technology, Siddipet

V. Swathi
Assistant Professor, Institute of Aeronautical Engineering, Dundigal

Abstract - There are different entities that one would like to optimize when designing a VLSI circuit. These entities often cannot be optimized simultaneously; one entity can only be improved at the expense of one or more others. Designing an integrated circuit that is efficient in power, area, and speed at the same time has become a very challenging problem. Power dissipation is recognized as a critical parameter in the modern VLSI field. Low-power VLSI design is necessary to meet Moore's law and to produce consumer electronics with more backup and less weight. Multiplication occurs frequently in finite impulse response filters, fast Fourier transforms, discrete cosine transforms, convolution, and other important DSP and multimedia kernels. The objective of a good multiplier is to provide a physically compact, fast, and low-power chip. To save significant power in a VLSI design, a good direction is to reduce its dynamic power, which is the major part of power dissipation. We propose a new architecture of multiplier-and-accumulator (MAC) for high-speed arithmetic. By combining multiplication with accumulation and devising a hybrid type of carry save adder (CSA), the performance was improved. Since the accumulator, which has the largest delay in the MAC, was merged into the CSA, the overall performance was elevated.

Keywords: array multiplier, Booth encoder, carry save adder, accumulation, MAC

I. INTRODUCTION

Power dissipation is recognized as a critical parameter in the modern VLSI design field.
To satisfy Moore's law and to produce consumer electronics goods with more backup and less weight, low-power VLSI design is necessary. Fast multipliers are essential parts of digital signal processing systems. The speed of the multiply operation is of great importance in digital signal processing as well as in today's general-purpose processors, especially since media processing took off.

In the past, multiplication was generally implemented via a sequence of addition, subtraction, and shift operations. Multiplication can be considered a series of repeated additions. The number to be added is the multiplicand, the number of times that it is added is the multiplier, and the result is the product. Each step of addition generates a partial product. In most computers, the operands usually contain the same number of bits. When the operands are interpreted as integers, the product is generally twice the length of the operands in order to preserve the information content. The repeated-addition method suggested by the arithmetic definition is so slow that it is almost always replaced by an algorithm that makes use of positional representation.

It is possible to decompose multipliers into two parts. The first part is dedicated to the generation of partial products, and the second one collects and adds them. The basic multiplication principle is thus twofold: evaluation of the partial products and accumulation of the shifted partial products. The accumulation is performed by successive additions of the columns of the shifted partial product matrix. The multiplier is successively shifted and gates the appropriate bit of the multiplicand. The delayed, gated instances of the multiplicand must all be in the same column of the shifted partial product matrix. They are then added to form the product bit for that column. To extend the multiplication to both signed and unsigned numbers, a convenient number system is the two's complement format.
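The partial-product view described above can be illustrated with a short software sketch. This is only an illustrative model, not the authors' design: the function name `shift_add_multiply` and the default width `n = 8` are assumptions introduced here. Each multiplier bit gates a shifted copy of the multiplicand, and the most significant bit of a two's complement multiplier carries a negative weight.

```python
def shift_add_multiply(x: int, y: int, n: int = 8) -> int:
    """Multiply two n-bit two's complement integers by summing shifted partial products.

    Hypothetical helper for illustration; not taken from the paper.
    """
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= x <= hi and lo <= y <= hi):
        raise ValueError("operands must fit in n-bit two's complement")

    y_bits = y & ((1 << n) - 1)      # multiplier as an n-bit pattern
    product = 0
    for i in range(n):
        if (y_bits >> i) & 1:        # bit i of the multiplier gates the multiplicand
            partial = x << i         # shifted partial product
            # In two's complement, the MSB of the multiplier has weight -2^(n-1).
            product += -partial if i == n - 1 else partial
    return product


if __name__ == "__main__":
    print(shift_add_multiply(13, -7))
```

The product of two 8-bit operands always fits in 16 bits, which matches the remark that the product is generally twice the length of the operands.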
The MAC (multiplier and accumulator) unit is used for image processing and digital signal processing (DSP) in a DSP processor. The algorithm of the MAC is Booth's radix-2 algorithm; the modified Booth multiplier improves speed and reduces power.

Vol. 2 Issue 3 May 2013 249 ISSN: 2278-621X

In the binary number system the digits, called bits, are limited to the set {0, 1}. The result of multiplying any binary number by a single binary bit is either 0 or the original number. This makes forming the intermediate partial products simple and efficient. Summing these partial products is the time-consuming task for binary multipliers. One logical approach is to form the partial products one at a time and sum them as they are generated. Often implemented in software on processors that do not have a hardware multiplier, this technique works but is slow, because at least one machine cycle is required to sum each additional partial product.

Fig. 1: Arithmetic steps of multiplication and accumulation

For applications where this approach does not provide enough performance, multipliers can be implemented directly in hardware. Binary multiplication falls into two main categories: signed and unsigned. Digit multiplication is a series of bit shifts and bit additions, in which the two numbers, the multiplicand and the multiplier, are combined into the result. Consider the bit representations of the multiplicand x = x_{n-1} ... x_1 x_0 and the multiplier y = y_{n-1} ... y_1 y_0. To form the product in unsigned multiplication, up to n shifted copies of the multiplicand are added. The entire process consists of three steps: partial product generation, partial product reduction, and final addition.

In the majority of digital signal processing (DSP) applications, the critical operations are multiplication and accumulation. Real-time signal processing requires a high-speed, high-throughput multiplier-accumulator (MAC) unit that consumes low power, which is always a key to achieving a high-performance digital signal processing system. The purpose of this work is to design and implement a low-power MAC unit with a block enabling technique to save power. First, a 1-bit MAC unit is designed with appropriate geometries that give optimized power, area, and delay.
Similarly, the N-bit MAC unit is designed and controlled for low power using control logic that enables the pipelined stages at the appropriate time. The adder cell designed has the advantages of high operational speed, small gate count, and low power.

Fig. 2: Hardware architecture of MAC

A multiplier mainly consists of three parts: a Booth encoder, a tree to compress the partial products, such as a Wallace tree, and a final adder. Because the Wallace tree adds the partial products from the encoder as much in parallel as possible, its operation time is proportional to O(log2 N), where N is the number of inputs. It uses the fact that counting the number of 1's among the inputs reduces the number of outputs to about log2 N. The most effective way to increase the speed of a multiplier is to reduce the number of partial products.

II. PROPOSED MAC ARCHITECTURE

In this section, the expression for the new arithmetic is derived from the equations of the standard design. From this result, a VLSI architecture for the new MAC is proposed. In addition, a hybrid-type CSA architecture that can satisfy the operation of the proposed MAC is proposed.

A. Derivation of MAC Arithmetic

1) Basic Concept: If an operation that multiplies two N-bit numbers and accumulates the result into a 2N-bit number is considered, the critical path is determined by the 2N-bit accumulation operation. If a pipeline scheme is applied to each step in the standard design of Fig. 1, the delay of the last accumulator must be reduced in order to improve the performance of the MAC. The overall performance of the proposed MAC is improved by eliminating the accumulator itself and combining it with the CSA function. Once the accumulator has been eliminated, the critical path is determined by the final adder in the multiplier.

The basic method to improve the performance of the final adder is to decrease the number of input bits. To reduce this number of input bits, the multiple partial products are compressed into a sum and a carry by the CSA. The number of bits of the sums and carries transferred to the final adder is reduced by adding the lower bits of the sums and carries in advance, within the range in which the overall performance is not degraded. A 2-bit CLA is used to add the lower bits in the CSA. In addition, to increase the output rate when pipelining is applied, the sums and carries from the CSA are accumulated instead of the outputs from the final adder, so that the sum and carry from the CSA in the previous cycle are input to the CSA.
Due to this feedback of both sum and carry, the number of inputs to the CSA increases compared to the standard design. To handle the increase in the amount of data efficiently, the CSA architecture is modified to treat the sign bit.

Fig. 3. Proposed arithmetic operation of multiplication and accumulation
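The idea of accumulating in carry-save form, with the previous cycle's sum and carry fed back into the CSA, can be sketched in software as follows. This is a minimal behavioural model under stated assumptions, not the proposed hardware: the names `csa`, `partial_products`, and `mac_carry_save` are hypothetical, plain (non-Booth) partial products are used for brevity, N = 8 is assumed, and the running total is assumed to stay within the 16-bit two's complement range.

```python
N = 8                      # operand width (assumed for this sketch)
WIDTH = 2 * N              # accumulator width
MASK = (1 << WIDTH) - 1


def csa(a: int, b: int, c: int):
    """3:2 carry-save adder on WIDTH-bit words: a + b + c == s + carry (mod 2^WIDTH)."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s & MASK, carry & MASK


def partial_products(x: int, y: int):
    """Shifted partial products of x * y as WIDTH-bit two's complement words."""
    y_bits = y & ((1 << N) - 1)
    pps = []
    for i in range(N):
        if (y_bits >> i) & 1:
            pp = x << i
            if i == N - 1:
                pp = -pp   # the sign bit of the multiplier has negative weight
            pps.append(pp & MASK)
    return pps


def mac_carry_save(pairs):
    """Accumulate sum(x * y) while keeping the running total as a (sum, carry) pair."""
    s, c = 0, 0            # fed-back sum and carry
    for x, y in pairs:
        for pp in partial_products(x, y):
            s, c = csa(pp, s, c)      # accumulation happens inside the compression
    total = (s + c) & MASK            # one carry-propagating addition at the end
    return total - (1 << WIDTH) if total >> (WIDTH - 1) else total


if __name__ == "__main__":
    print(mac_carry_save([(3, 4), (-5, 6), (7, -8)]))
```

In this model no carry-propagating addition occurs inside the loop; the only full addition is the final one, which mirrors the motivation for merging the accumulator into the CSA.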

Fig. 4. Hardware architecture of the proposed MAC

III. MODIFIED BOOTH ALGORITHM

In order to achieve high-speed multiplication, multiplication algorithms using parallel counters, such as the modified Booth algorithm, have been proposed, and some multipliers based on these algorithms have been implemented for practical use. The design also adopts the other implementing approach of the control signal assertion circuit, using an AND gate.

Fig. 5: The grouping of bits from the multiplier term for use in modified Booth encoding

Fig. 6: Booth partial product selector logic

Booth multiplication is a technique that allows for smaller, faster multiplication circuits by recoding the numbers that are multiplied. It is possible to reduce the number of partial products by half by using radix-4 Booth recoding [9]. The basic idea is that, instead of shifting and adding for every column of the multiplier term and multiplying by 1 or 0, we take only every second column and multiply by ±1, ±2, or 0 to obtain the same result. The advantage of this method is the halving of the number of partial products. To Booth recode the multiplier term, we consider the bits in blocks of three, such that each block overlaps the previous block by one bit. Grouping starts from the LSB, and the first block uses only two bits of the multiplier, with an implied 0 below the LSB.

IV. PROPOSED CSA ARCHITECTURE

The architecture of the hybrid-type CSA complies with the operation of the proposed MAC and performs 8 × 8-bit operations. It contains terms that simplify the expansion and that compensate a 1's complement number into a 2's complement number, together with signals that correspond to the i-th bit of the feedback sum and carry.

V. RESULTS

In this project, we propose a high-speed, low-power multiplier adopting the Booth multiplier implementing approach. The multiplier is designed by equipping the MAC with a CSA and a modified Booth encoder. The proposed high-speed, low-power multiplier is evaluated by comparing this design with a conventional array multiplier. The multipliers are implemented in Verilog. To obtain the power and delay reports, these multipliers are synthesized using Xilinx.

Fig. 7: Simulation of MAC

ESTIMATION OF GATE SIZE BY SYNTHESIS
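When checking simulation output such as that of Fig. 7, a small software reference model can be useful. The sketch below applies the radix-4 recoding of Section III to an 8-bit two's complement multiplier: overlapping three-bit blocks, starting at the LSB with an implied 0, each mapped to a digit in {-2, -1, 0, 1, 2}. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical and introduced only for illustration; this is not the authors' Verilog.

```python
def booth_radix4_digits(y: int, n: int = 8) -> list:
    """Recode an n-bit two's complement multiplier into n/2 radix-4 Booth digits.

    Digits are returned least significant first; each lies in {-2, -1, 0, 1, 2}.
    """
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    bits = (y & ((1 << n) - 1)) << 1      # append the implied 0 below the LSB
    digits = []
    for i in range(0, n, 2):
        block = (bits >> i) & 0b111       # y[i+1], y[i], y[i-1]: overlaps by one bit
        b_hi, b_mid, b_lo = (block >> 2) & 1, (block >> 1) & 1, block & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_multiply(x: int, y: int, n: int = 8) -> int:
    """Sum the n/2 Booth partial products d_k * x * 4^k."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, n)))


if __name__ == "__main__":
    print(booth_radix4_digits(-7))   # four digits instead of eight multiplier bits
    print(booth_multiply(13, -7))
```

For an 8-bit multiplier the recoding yields four digits, so only four partial products need to be generated and compressed, which is the halving described in Section III.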

VI. CONCLUSION

In this paper, a new MAC architecture was proposed to efficiently execute the multiplication-accumulation operation, which is the key operation in digital signal processing and multimedia information processing. By removing the independent accumulation process, which has the largest delay, and merging it into the compression process of the partial products, the overall MAC performance has been improved to almost twice that of the previous work. As an extension of this work, a high-speed, low-power multiplier adopting the new SPST implementing approach is proposed. This multiplier is designed by equipping the Spurious Power Suppression Technique (SPST) on a modified Booth encoder, which is controlled by a detection unit using an AND gate. The modified Booth encoder reduces the number of partial products generated by a factor of 2. The SPST adder avoids unwanted additions and thus minimizes the switching power dissipation. With this approach, the SPST can attain a 30% speed improvement and a 22% power reduction in the modified Booth encoder when compared with conventional tree multipliers.

REFERENCES

[1] J. J. F. Cavanagh, Digital Computer Arithmetic. New York: McGraw-Hill, 1984.
[2] Information Technology-Coding of Moving Picture and Associated Audio, MPEG-2 ISO/IEC 13818-1, 2, 3, 1994.
[3] JPEG 2000 Part I Final Draft, ISO/IEC JTC1/SC29 WG1.
[4] O. L. MacSorley, "High speed arithmetic in binary computers," Proc. IRE, vol. 49, pp. 67-91, Jan. 1961.
[5] A. R. Cooper, "Parallel architecture modified Booth multiplier," Proc. Inst. Electr. Eng. G, vol. 135, pp. 125-128, 1988.