Efficient Multi-Operand Adders in VLSI Technology


K. Priyanka, M.Tech-VLSI; D. Chandra Mohan, Assistant Professor; Dr. S. Balaji, M.E, Ph.D, Dean; Department of ECE

Abstract: This paper presents different approaches to the efficient implementation of generic carry-save compressor trees on FPGAs. They present a fast critical path, independent of bit width, with practically no area overhead compared to carry-propagate adder (CPA) trees. Along with the classic carry-save compressor tree, we present a novel linear array structure, which efficiently uses the fast carry-chain resources. This approach is defined in HDL code based on CPAs, which makes it compatible with any FPGA (field-programmable gate array) family or vendor. A detailed study is provided for a wide range of bit widths and a large number of operands.

Keywords: Compressor, CPA, Linear array structure, Multi-operand addition

I. INTRODUCTION:
As the scale of integration keeps growing, more and more sophisticated signal processing systems are being implemented on a VLSI chip. These signal processing applications not only demand great computation capacity but also consume a considerable amount of energy. While performance and area remain the two major design goals, power consumption has become a critical concern in today's VLSI system design [1]. The need for low-power VLSI systems arises from two main forces. First, with the steady growth of operating frequency and processing capacity per chip, large currents have to be delivered and the heat due to large power consumption must be removed by proper cooling techniques. Second, battery life in portable electronic devices is limited, and low-power design directly prolongs the operating time of these devices. Addition is a crucial arithmetic function that strongly affects the overall performance of digital systems. Adders are among the most widely used blocks in electronic applications: they appear in multipliers and in DSP systems that execute algorithms such as FFT, FIR and IIR filtering.
Speed of operation is therefore the most important constraint to consider when designing multipliers. Portable devices additionally demand a high degree of miniaturization and low power consumption; devices such as mobile phones and laptops need long battery life. A VLSI designer thus has to optimize three parameters in a design: speed, area and power. These constraints are difficult to satisfy simultaneously, so some compromise among them has to be made depending on the application. The ripple carry adder is the most compact design but the slowest. The carry look-ahead adder is the fastest but consumes more area. Carry select adders act as a compromise between the two. In 2002, Wang et al. presented hybrid carry look-ahead/carry select adders to speed up the addition process. In 2008, low-power multipliers based on new hybrid full adders were presented.

II. NEED FOR LOW POWER DESIGN:
The design of portable devices requires consideration of peak power consumption to ensure reliability and proper operation. However, the time-averaged power is often more critical, as it is linearly related to the battery life. There are four sources of power dissipation in digital CMOS circuits: switching power, short-circuit power, leakage power and static power.

Low Voltage: Power consumption is linearly proportional to the voltage swing (Vs) and to the supply voltage (Vdd). For most CMOS logic families, the swing is typically rail-to-rail. Hence, power consumption is also said to be proportional to the square of the supply voltage, Vdd. Therefore, lowering Vdd is an efficient approach to reduce both energy and power, presuming that the signal voltage swing can be freely chosen. This comes, however, at the expense of circuit delay. The delay, td, can be shown to be proportional to Vdd / (Vdd - Vt)^α, where Vt is the threshold voltage. The exponent α is between 1 and 2. It tends to be closer to 1 for MOS transistors in the deep sub-micrometer region, where carrier velocity saturation may occur.

www.ijmetmr.com Page 576
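The voltage dependence described above can be illustrated numerically. The following is a minimal Python sketch, assuming the usual switching-power model P = a * C * Vs * Vdd * f and the alpha-power delay law Vdd / (Vdd - Vt)^α. The function names, load capacitance, clock frequency, threshold voltage and exponent are hypothetical values chosen for illustration and are not taken from this paper.

```python
def dynamic_power(c_load, v_swing, v_dd, freq, activity=1.0):
    """Switching power in watts, assuming P = activity * C * Vs * Vdd * f."""
    return activity * c_load * v_swing * v_dd * freq


def relative_delay(v_dd, v_t=0.4, alpha=1.3):
    """Gate delay up to a constant factor, assuming Vdd / (Vdd - Vt)**alpha.

    v_t and alpha are illustrative defaults, not measured values.
    """
    if v_dd <= v_t:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return v_dd / (v_dd - v_t) ** alpha


if __name__ == "__main__":
    c_load, freq = 1e-12, 100e6  # hypothetical: 1 pF switched at 100 MHz
    reference = relative_delay(1.8)
    for v_dd in (1.8, 1.2, 0.9):
        power = dynamic_power(c_load, v_dd, v_dd, freq)  # rail-to-rail: Vs = Vdd
        slowdown = relative_delay(v_dd) / reference
        print(f"Vdd={v_dd:.1f} V  power={power * 1e6:.1f} uW  delay x{slowdown:.2f}")
```

With a rail-to-rail swing the power term falls with the square of Vdd, while the delay term grows as Vdd approaches Vt, which is the trade-off this section describes.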

The current technology trends are to reduce feature size and lower the supply voltage. Lowering Vdd leads to increased circuit delays and therefore lower functional throughput. Smaller feature size, however, reduces gate delay, which scales roughly with the square of the effective channel length of the devices. In addition, thinner gate oxides impose a voltage limitation for reliability reasons. Hence, the supply voltage must be lowered for smaller geometries. The net effect is that circuit performance improves as CMOS technologies scale down, despite the Vdd reduction. Therefore, the new technology has made it possible to fulfill the contradictory requirements of low power and high throughput.

III. EXISTING:
In this paper, we study the efficient implementation of multi-operand redundant compressor trees in modern FPGAs by using their fast carry resources.

Design Implementation: The classic design of a multi-operand carry-save (CS) compressor tree attempts to reduce the number of levels in its structure. The 3:2 counters or the 4:2 compressors are the most widely known building blocks to implement it. We select a 4:2 compressor as the basic building block, because it can be efficiently implemented on Xilinx FPGAs. The implementation of a generic CS compressor tree requires ⌈Nop/2⌉ - 1 4:2 compressors (because each one eliminates two signals), whereas a carry-propagate tree uses Nop - 1 two-input CPAs (because each one eliminates only one signal).

To optimize the use of the carry resources, we propose a compressor tree structure similar to the classic linear array of carry-save adders (CSAs). However, in our case, given the two output words of each adder (sum-word and carry-word), only the carry-word is connected from each CSA to the next, whereas the sum-words are connected to lower levels of the array. First, the two regular inputs on each CSA are used to add all the input operands (Ii).
When all the input operands have been introduced in the array, the partial sum-words (Si) previously generated are then added in order (i.e., the first generated partial sums are added first).

IV. PROPOSED:
In this section, we present different approaches to efficiently map CS compressor trees on FPGA devices. In addition, approximate area and delay analyses are conducted for the general case. Let us consider a generic compressor tree of Nop input operands, each N bits wide. We also assume the same bit width for input and output operands. Thus, input operands should have previously been zero- or sign-extended to guarantee that no overflow occurs. A detailed analysis of the number of leading guard bits required for multi-operand CS addition is provided.

Applications:
Digital signal processors
Microprocessors
Controllers

Objective: To address the area overhead of redundant adders when they are implemented on FPGAs, and to present a fast critical path, independent of bit width, with practically no area overhead compared to CPA trees.

In this work an 18:2 compressor is used. The main idea is that, even as the number of operand bits or the size of the architecture grows, the delay is reduced and efficiency increases. Although the architecture becomes larger, the different types of compressors used inside it are very efficient for the overall design, which is why the delay decreases and performance improves as the architecture grows.

Figure 1: N-bit width CS 9:2 compressor tree based on a linear array of CSAs.
Figure 2: 6:2 compressor based adder.
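The linear-array idea described above can be modelled in software. The following Python sketch is a functional model only, not the authors' HDL: each CSA passes its carry-word to the next CSA, and its sum-word joins a first-in, first-out queue behind the input operands. Two assumptions are labelled in the code: the first operand is used to seed the carry input of the first CSA, and `guard_bits` uses the simple bound ceil(log2(Nop)) for unsigned operands rather than the detailed guard-bit analysis mentioned in the text. All function names are hypothetical.

```python
from collections import deque


def csa(a, b, c, mask):
    """3:2 counter applied to whole words: a + b + c == s + cy (mod 2**N)."""
    s = a ^ b ^ c                                      # per-bit sum, no carries
    cy = (((a & b) | (a & c) | (b & c)) << 1) & mask   # per-bit majority, shifted left
    return s, cy


def guard_bits(n_op):
    """Assumed bound: ceil(log2(n_op)) extra bits for unsigned operands."""
    return (n_op - 1).bit_length()


def linear_array_compress(operands, n_bits):
    """Return (sum_word, carry_word, csa_count) for the given operands."""
    if len(operands) < 2:
        raise ValueError("need at least two operands")
    mask = (1 << n_bits) - 1
    pending = deque(x & mask for x in operands)  # operands are consumed first
    carry = pending.popleft()                    # assumption: seeds the first CSA
    stages = 0
    while len(pending) >= 2:
        a, b = pending.popleft(), pending.popleft()
        s, carry = csa(a, b, carry, mask)        # carry-word feeds the next CSA
        pending.append(s)                        # sum-word waits for a lower level
        stages += 1
    return pending[0], carry, stages


if __name__ == "__main__":
    ops = [3, 5, 7, 9, 11, 13, 15, 17, 19]       # nine hypothetical 8-bit operands
    width = 8 + guard_bits(len(ops))
    s, c, n = linear_array_compress(ops, width)
    assert (s + c) & ((1 << width) - 1) == sum(ops)
    print(f"{len(ops)}:2 compression used {n} CSAs; sum-word={s}, carry-word={c}")
```

In this model the two output words still need one final carry-propagate addition to give the ordinary binary result; the model uses Nop - 2 CSAs to reach that point.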

V. IMPLEMENTATION RESULTS & COMPARISON:
To measure the effectiveness of the designs presented in this paper, we have developed two generic Verilog modules implementing the proposed compressor tree structures: first, the linear array implemented by using CPAs (binary and ternary) and, second, the 4:2 compressor tree using the design of the compressor presented in [28]. Both modules provide the output result in CS format and allow the selection of different parameters, such as the number of operands (Nop), the number of bits per operand (N), and the basic building block (i.e., binary or ternary adder) for the linear array. For the purposes of comparison, similar modules that implement classic adder tree structures based on binary CPAs and ternary CPAs have also been developed. All these modules were simulated using the ISim simulator and synthesized using Xilinx ISE 14.2, targeting Spartan-3A, Virtex-4, and Virtex-5 devices.

Figure 3: Existing system for linear array: block diagram, RTL schematic, waveform.
Figure 4: Existing system for linear CPA: block diagram, RTL schematic, waveform.
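The second module described above, the 4:2 compressor tree, can likewise be sketched as a functional Python model and checked against a plain carry-propagate sum. This sketch assumes a word-level 4:2 compressor built from two cascaded 3:2 counters, which is the textbook construction and not necessarily the FPGA-specific compressor of [28]. The function names and the grouping strategy (groups of up to four words per level, padded with zeros) are hypothetical.

```python
def csa(a, b, c, mask):
    """3:2 counter applied to whole words: a + b + c == s + cy (mod 2**N)."""
    s = a ^ b ^ c
    cy = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return s, cy


def compressor_4_2(a, b, c, d, mask):
    """Assumed 4:2 compressor: two cascaded 3:2 counters, four words in, two out."""
    s1, c1 = csa(a, b, c, mask)
    return csa(s1, c1, d, mask)


def compressor_tree(operands, n_bits):
    """Return (sum_word, carry_word, compressors_used)."""
    mask = (1 << n_bits) - 1
    words = [x & mask for x in operands]
    used = 0
    while len(words) > 2:
        nxt = []
        for i in range(0, len(words), 4):
            group = words[i:i + 4]
            if len(group) <= 2:
                nxt.extend(group)                  # too few words: pass through
            else:
                group += [0] * (4 - len(group))    # pad a group of three with zero
                nxt.extend(compressor_4_2(*group, mask))
                used += 1
        words = nxt
    words += [0] * (2 - len(words))                # fewer than two operands given
    return words[0], words[1], used


def cpa_tree(operands, n_bits):
    """Reference result: binary sum and the number of two-input CPAs needed."""
    mask = (1 << n_bits) - 1
    return sum(operands) & mask, max(len(operands) - 1, 0)


if __name__ == "__main__":
    ops = list(range(1, 10))                       # nine hypothetical operands
    s, c, used = compressor_tree(ops, 16)
    total, adders = cpa_tree(ops, 16)
    assert (s + c) & 0xFFFF == total
    print(f"4:2 compressors: {used}, two-input CPAs in a plain tree: {adders}")
```

The parameters `operands` and `n_bits` play the role of Nop and N in the modules described in this section.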

Figure 5: Existing system using 11:2 compressor: block diagram, RTL schematic, waveform.
Figure 6: Proposed method using 18:2 compressor: block diagram, RTL schematic, waveform.

Area & Delay Reports:
Figure 6: Existing method using 11:2 compressor: area, delay.

Figure 7: Proposed method using 18:2 compressor: area, delay.

VI. CONCLUSION:
Efficiently implementing CS compressor trees on FPGAs, in terms of area and speed, is made possible by using the specialized carry-chains of these devices in a novel way. Similar to what happens when using ASIC technology, the proposed CS linear array compressor trees lead to marked improvements in speed compared to CPA approaches and, in general, with no additional hardware cost. Furthermore, the proposed high-level definition of CSA arrays based on CPAs facilitates ease of use and portability, even in relation to future FPGA architectures, because CPAs will probably remain a key element in the next generations of FPGAs. We have compared our architectures, implemented on different FPGA families, to several designs and have provided a qualitative and quantitative study of the benefits of our proposals.

REFERENCES:
[1] B. Cope, P. Cheung, W. Luk, and L. Howes, "Performance Comparison of Graphics Processors to Reconfigurable Logic: A Case Study," IEEE Trans. Computers, vol. 59, no. 4, pp. 433-448, Apr. 2010.
[2] S. Dikmese, A. Kavak, K. Kucuk, S. Sahin, A. Tangel, and H. Dincer, "Digital Signal Processor against Field Programmable Gate Array Implementations of Space-Code Correlator Beamformer for Smart Antennas," IET Microwaves, Antennas & Propagation, vol. 4, no. 5, pp. 593-599, May 2010.
[3] S. Roy and P. Banerjee, "An Algorithm for Trading off Quantization Error with Hardware Resources for MATLAB-based FPGA Design," IEEE Trans. Computers, vol. 54, no. 7, pp. 886-896, July 2005.
[4] F. Schneider, A. Agarwal, Y.M. Yoo, T. Fukuoka, and Y. Kim, "A Fully Programmable Computing Architecture for Medical Ultrasound Machines," IEEE Trans. Information Technology in Biomedicine, vol. 14, no. 2, pp. 538-540, Mar. 2010.
[5] J. Hill, "The Soft-Core Discrete-Time Signal Processor Peripheral [Applications Corner]," IEEE Signal Processing Magazine, vol. 26, no. 2, pp. 112-115, Mar. 2009.
[6] J.S. Kim, L. Deng, P. Mangalagiri, K. Irick, K. Sobti, M. Kandemir, V. Narayanan, C. Chakrabarti, N. Pitsianis, and X. Sun, "An Automated Framework for Accelerating Numerical Algorithms on Reconfigurable Platforms Using Algorithmic/Architectural Optimization," IEEE Trans. Computers, vol. 58, no. 12, pp. 1654-1667, Dec. 2009.
[7] H. Lange and A. Koch, "Architectures and Execution Models for Hardware/Software Compilation and their System-Level Realization," IEEE Trans. Computers, vol. 59, no. 10, pp. 1363-1377, Oct. 2010.