Comparative Analysis of different Algorithm for Design of High-Speed Multiplier Accumulator Unit (MAC)

Indian Journal of Science and Technology, Vol 9(8), DOI: 10.17485/ijst/2016/v9i8/83614, February 2016
ISSN (Print): 0974-6846, ISSN (Online): 0974-5645

P. A. Irfan Khan* and Ravi Shankar Mishra
Lovely Professional University, Phagwara - 144411, Punjab, India; p.a.irfankhan@gmail.com, ravi.19053@lpu.co.in

Abstract

Background/Objectives: Power consumption is one of the important design constraints in many digital signal processing applications, and the Multiplier-Accumulator (MAC) unit is one of the main building blocks of the processor.

Methods/Statistical Analysis: In the present work, the Baugh-Wooley multiplier is implemented to improve the performance of the MAC unit. The Baugh-Wooley multiplier is faster than other multipliers such as the array multiplier, the Wallace tree multiplier and the Booth multiplier. The MAC unit using the Baugh-Wooley multiplier is implemented in 180 nm technology in Cadence Virtuoso.

Findings: The speed of the MAC unit using the Wallace tree multiplier is 93.6 MHz and with the Baugh-Wooley multiplier it is 99.1 MHz. The power consumption of the MAC unit using the Wallace tree multiplier is 2.265 mW and with the Baugh-Wooley multiplier it is 4.628 mW. The results show that the MAC unit using the Baugh-Wooley multiplier is faster than the one using the Wallace tree multiplier.

Application/Improvements: MAC unit processors. In future, the MAC unit using the Baugh-Wooley multiplier can be implemented with a pipelining technique so that the total power consumption is lower.

Keywords: Accumulator, Baugh-Wooley Algorithm, High Speed, Low Power, Multipliers, Pipelining

1. Introduction

There is now a high demand for devices that are fast but consume little power. To achieve this, a Multiplier-Accumulator unit is needed [1]. The multiply-accumulate operation is the main user-defined accelerator routine in digital signal processing architectures. It determines the speed of the overall system because it lies on the critical path.
To increase the performance of digital signal processing, a high-speed Multiplier-Accumulator unit is needed for real-time applications [2,3]. The multiply-accumulate unit performs the critical operations in many processing applications. Low-power and high-speed circuitry plays a crucial role in VLSI systems [4]. The main objective of this work is to investigate how to increase the speed of the multiplier and accumulator unit, to identify algorithms that are well suited to implementing high-throughput signal processing, and to achieve low power consumption [5]. Speed and throughput have always been a concern in digital signal processing systems. MAC units are essential building blocks for applications such as digital filtering, speech processing, video encoding and cellular phones.

A variety of approaches to implementing the multiplication and addition of the MAC function are possible. A conventional MAC unit combines a multiplier, an adder and an accumulator that holds the sum of the previous consecutive products. Its applications include filtering, convolution and inner products, and it is typically used for high-speed filtering and other processing units in digital signal processing. In this work the MAC is designed using the Baugh-Wooley multiplier, a signed multiplier with low delay [6]. The MAC inputs are taken from a memory location and given to the multiplier. The MAC unit multiplies two values, adds the result to the previously accumulated value and stores it back in the register for future accumulations. Here the MAC is designed with the Baugh-Wooley multiplier and with the pipelining technique.

* Author for correspondence

2. Multipliers

Different kinds of multipliers have been used to design the MAC unit; they are described below.

2.1 Booth Multiplier

The Booth multiplier takes the multiplicand and the multiplier as inputs and combines a partial product generator with a Booth encoder [7]. The partial products are generated using the Booth encoding technique, so the total number of partial products is smaller than in conventional multiplier techniques. Finally, all the partial products are added using a Carry Save Adder or a Ripple Carry Adder and the result is collected in the accumulator. However, the Booth encoder increases the complexity of the multiplier structure. The architecture of the Booth multiplier is given in Figure 1.

Figure 1. Architecture of Booth multiplier.

2.2 Array Multiplier

The array multiplier is one of the best conventional multipliers; its partial products are generated by AND gate logic. All the partial products are then added using half adders and full adders, depending on the number of inputs [8]. The architecture of the array multiplier is given in Figure 2.

Figure 2. Architecture of array multiplier.

2.3 Wallace Tree Multiplier

The Wallace tree multiplier uses carry-save addition to add the partial products generated in each stage: the carry generated in the present stage is saved and added in the next stage. The delay is therefore reduced, because carries do not have to propagate within a stage [9]. The design of the Wallace tree multiplier is shown in Figure 3.

Figure 3. Wallace tree multiplier.

2.4 Baugh-Wooley Multiplier

The Baugh-Wooley multiplier is faster than a simple array multiplier, Wallace tree multiplier and Booth multiplier because it is used for complex circuits and works with 2's complement operands, so the number of partial products is reduced. Baugh-Wooley multiplication is one of the best methods for multiplying signed numbers [10]. It is a signed multiplier with low delay and uses a ripple carry adder in the final stage. The block diagram of the 4 x 4 Baugh-Wooley multiplier is shown in Figure 4, and the algorithm for the 4 x 4 Baugh-Wooley multiplier is given in Figure 5.

Figure 4. 4 x 4 Baugh-Wooley multiplier.

Figure 5. Algorithm for 4 x 4 Baugh-Wooley multiplier.

3. Conventional MAC Unit

The MAC is a building block of a processor and has a great impact on the speed of the processor. It consists of an adder, a multiplier and an accumulator [5]. The MAC inputs are taken from a memory location and given to the multiplier, which performs the multiplication. The product is passed to the adder, which adds it to the accumulated value, and the accumulator then stores the result in a memory location [2]. The block diagram of the MAC unit is shown in Figure 6. The MAC unit mainly consists of two parts, the multiplier and the accumulator.

Figure 6. Block diagram of MAC unit.

3.1 Multiplier

Multiplication is one of the basic operations in signal processing algorithms. Multipliers have a large area and long latency, and they consume considerable power, so low-power multiplier design is an important need in low-power VLSI systems [8]. There is extensive work on low-power multipliers. System performance is determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Hence, the speed of the multiplier is a major concern in system design [6]. However, area and speed are usually conflicting constraints: improving the speed generally results in a larger area.
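As a rough illustration of Sections 2.4 and 3, the following Python sketch models a Baugh-Wooley signed multiplication at the bit level and feeds the product into a multiply-accumulate step. It is a behavioural model only, not the transistor-level design of this work. The names `baugh_wooley_multiply` and `MacUnit` are hypothetical, and the unbounded Python integer used as the accumulator register is an assumption made for simplicity.

```python
def baugh_wooley_multiply(a: int, b: int, n: int = 4) -> int:
    """Multiply two n-bit two's complement integers, Baugh-Wooley style.

    Partial-product bits that involve exactly one sign bit are
    complemented, and the constants 2**n and 2**(2n-1) are added, so
    every partial product can be summed as an unsigned quantity.
    """
    low, high = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (low <= a <= high and low <= b <= high):
        raise ValueError(f"operands must fit in {n}-bit two's complement")

    # Two's complement bits, least significant first.
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            bit = a_bits[i] & b_bits[j]          # AND-gate partial product
            if (i == n - 1) != (j == n - 1):     # exactly one sign bit
                bit ^= 1                         # complemented term
            total += bit << (i + j)

    # Correction constants of the Baugh-Wooley formulation.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1                  # keep 2n result bits

    # Reinterpret the 2n-bit pattern as a signed value.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total


class MacUnit:
    """Behavioural MAC: multiply, add to the running sum, store back."""

    def __init__(self, n: int = 4) -> None:
        self.n = n
        self.acc = 0  # accumulator register (unbounded here, an assumption)

    def accumulate(self, a: int, b: int) -> int:
        self.acc += baugh_wooley_multiply(a, b, self.n)
        return self.acc


if __name__ == "__main__":
    mac = MacUnit(n=4)
    for x, y in [(3, -2), (-8, -8), (7, 7)]:
        print(x, y, mac.accumulate(x, y))
```

In hardware the accumulator would have a fixed width with some guard bits; the sketch leaves that choice open.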
So, a whole spectrum of multipliers with different area-speed constraints has been designed. Basically, multiplication is divided into three steps. The first step is Booth encoding, in which the partial products are generated from the multiplier and the multiplicand. The second step is an adder stage that adds all the partial products and converts them into sum and carry form. The last step is the final addition [1,8], in which the final multiplication result is generated by adding the sum and the carry. To generate the sum and carry of the products we use full adders, each of which adds 3 input bits and gives a sum and a carry. Full adders can be implemented in many ways, such as 28T, 20T, 16T, 14T, 10T and 6T designs, which reduce power consumption [11]. In this work, we use the conventional full adder with 46T.

3.2 Accumulator

The accumulator is also called a register. The register holds the adder output from the previous clock cycle. Holding the outputs in the register reduces the additional adding operations [7]. The architecture of the 1-bit accumulator is shown in Figure 7. The accumulator uses a D flip-flop, two AND gates and one OR gate. The cell has three inputs and one output: the inputs are write select, read select and the D input, and Q is the output. The D flip-flop stores the value of the input signal, and whenever write select and read select are both equal to 1 the input passes to the output through a tri-state buffer, which allows control of the current passed through the device. A tri-state buffer has two inputs, the data input x and a control input. Whenever the control input is high the output follows the data input, and when the control input is low the output is high impedance (Z). Finally, for a fast response of the accumulator, a fast adder such as the carry select adder is needed [12].

Figure 7. 1-bit accumulator.

4. Simulation Results

The simulation results of the MAC unit using the Baugh-Wooley multiplier are given in Figure 8.

Table 1. Comparison of power consumption and speed of the MAC unit for the different techniques

Technique                                  Power consumption (mW)   Speed
MAC unit using Wallace Tree Multiplier     1.399                    93.6 MHz
MAC unit using Baugh-Wooley Multiplier     2.743                    99.1 MHz

All the above results are obtained at an input voltage of 1.8 V. Table 1 shows that the speed of the MAC unit using the Baugh-Wooley multiplier is higher than that of the MAC unit using the Wallace tree multiplier.
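To put the Table 1 figures side by side, the short Python sketch below computes the relative speed gain and the power ratio between the two MAC variants. The input values are copied from Table 1; the derived quantities are simple arithmetic on those values and are not reported in the source, and the names `TABLE_1` and `relative_change_percent` are hypothetical.

```python
# Values copied from Table 1 (power in mW, speed in MHz).
TABLE_1 = {
    "wallace": {"power_mw": 1.399, "speed_mhz": 93.6},
    "baugh_wooley": {"power_mw": 2.743, "speed_mhz": 99.1},
}


def relative_change_percent(new: float, old: float) -> float:
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


# Speed gain of the Baugh-Wooley MAC over the Wallace tree MAC.
speed_gain_pct = relative_change_percent(
    TABLE_1["baugh_wooley"]["speed_mhz"], TABLE_1["wallace"]["speed_mhz"]
)

# Ratio of the two power figures listed in Table 1.
power_ratio = (
    TABLE_1["baugh_wooley"]["power_mw"] / TABLE_1["wallace"]["power_mw"]
)

if __name__ == "__main__":
    print(f"speed gain: {speed_gain_pct:.1f} %")   # about 5.9 %
    print(f"power ratio: {power_ratio:.2f}")       # about 1.96
```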
Figure 8. Output waveforms of MAC unit using Baugh-Wooley multiplier.

5. Conclusion

Analysis of the MAC unit with the Wallace tree multiplier, the Baugh-Wooley multiplier and the pipelining technique has been carried out using Cadence Virtuoso in 180 nm technology. The Baugh-Wooley algorithm is a relatively straightforward way of doing signed multiplications. The Baugh-Wooley multiplier reduces the partial products in the MAC unit, so the power consumption is low. By using the Baugh-Wooley multiplier the speed of the circuit has been increased, and due to the pipelining technique the power consumption is also very low compared to the Wallace tree multiplier. In future, it may be possible to reduce the power consumption of the Baugh-Wooley multiplier so that the MAC unit speed will increase.

6. References

1. Narendra CP, Kumar RKM. Low power MAC architecture for DSP applications. 14th International Conference of IEEE, Circuits Communication, Control and Computing; Bangalore. 2014 Nov. p. 404-7.
2. Warrier R, Vun CH, Zhang W. A low-power pipelined MAC architecture using Baugh-Wooley based multiplier. 3rd Global Conference of IEEE Consumer Electronics; Tokyo. 2014 Oct. p. 505-6.
3. Tu JH, Van LD. Power-efficient pipelined reconfigurable fixed-width Baugh-Wooley multipliers. IEEE Transactions Computers. 2009 Oct; 58(10):1346-55.
4. Jaina D, Sethi K, Panda R. Vedic mathematics based multiply and accumulate unit. International Conference of IEEE Computational Intelligence and Communication Networks, (CICN); Gwalior. 2011 Oct 7-9. p. 754-7.
5. Jagadeesh P, Ravi S, Mallikarjun KH. Design of high performance 64 bit MAC unit. International Conference of IEEE, Circuits, Power and Computing Technologies; Nagercoil. 2013 Mar 20-21. p. 782-6.
6. Mukherjee A, Asati A. Generic modified Baugh Wooley multiplier. International Conference of IEEE, Circuits, Power and Computing Technologies; Nagercoil. 2013 Mar 20-21. p. 746-51.
7. Kumar MS, Kumar DA, Samundiswary P. Design and performance analysis of Multiply-Accumulate (MAC) unit. 14th International Conference of IEEE, Circuits Power and Computing Technologies; Nagercoil. 2014 Mar 20-21. p. 1084-9.
8. Rahman SA, Khanna G. Performance metrics analysis of 4 bit Array multiplier circuit using 2 PASCL logic. 2014 International Conference of IEEE, Green Computing Communication and Electrical Engineering (ICGCCEE); Coimbatore. 2014 Mar 6-8. p. 1-5.
9. Kakde S, Khan S, Dakhole P, Badwaik S. Design of area and power aware reduced complexity Wallace tree multiplier. 2015 International Conference of IEEE, Pervasive Computing (ICPC); Pune. 2015 Jan 8-10. p. 1-6.
10. Sjalander M, Edefors PL. High-speed and low-power multipliers using the Baugh-Wooley algorithm and HPM reduction tree. International Conference of IEEE, Electronics and Circuit Systems; St Julien's. 2008 Aug 31-Sep 3. p. 33-6.
11. Gomes SV, Sasipriya P, Bhaaskaran VSK. A low power multiplier using a 24-transistor latch adder. Indian Journal of Science and Technology. 2015 Aug; 8(18). DOI: 10.17485/ijst/2015/v8i19/76866.
12. Senthilpari C, Diwakar K, Singh AK. High speed and high throughput 8x8 bit multiplier using a Shannon-based adder cell. TENCON IEEE Region 10 Conference; Singapore. 2009 Jan 23-26. p. 1-5.