IJMIE Volume 2, Issue 5 ISSN:


Systematic Design of High-Speed and Low-Power Digit-Serial Multipliers VLSI Based

Ms. P. J. Tayade* Dr. Prof. A. A. Gurjar**

Abstract: Digit-serial implementation styles are best suited for implementation of digital signal processing systems which require moderate sampling rates. Digit-serial architectures obtained using traditional unfolding techniques cannot be pipelined beyond a certain level because of the presence of feedback loops. In this paper, an alternative approach for the design of digit-serial architectures is presented, based on a novel design methodology. This methodology permits bit-level pipelining of the digit-serial architectures by moving all feedback loops to the last stage of the design, thereby achieving sample speeds close to those of the corresponding bit-parallel multipliers with lower area. This increased sample speed can be traded for a reduction in power supply voltage, resulting in a significant reduction in power consumption. The proposed approach is applied to the design of various multipliers, which form the backbone of digital signal processing computations. The results show that for transformed multipliers with smaller digit sizes (≤ 4), the singly-redundant multiplier consumes the least power, and for larger digit sizes, the Type-I multiplier consumes the least power. It is also found that the optimum digit size for least power consumption in Type-I and Type-III multipliers is approximately √(2W), where W represents the word length. Among the bit-level pipelined digit-serial multipliers, it is found that the redundant multiplier offers the best choice in terms of both latency and power consumption. The proposed digit-serial multipliers consume on average 20% lower power than the traditional digit-serial architectures for the nonpipelined case and about 5 to 15 times lower power for the bit-level pipelined case.
Keywords: Bit-level pipelining, Booth recoding, carry-save arithmetic, digit-serial multiplier, low power, redundant arithmetic.

* Electronics & Telecommunication, Sipna's C.O.E.T., Amravati, India.
** Electronics & Telecommunication, Sipna's C.O.E.T., Amravati, India.

INTRODUCTION: Digital signal processing (DSP) is used in a wide range of applications such as telephone, radio, video, and sonar. The sample rate requirements vary from application to application and can range anywhere from 10 kHz to 100 MHz. Most DSP computations involve multiply-accumulate operations, and therefore the design of fast and efficient multipliers is imperative. Moreover, the demand for portable applications of DSP architectures has dictated the need for low-power designs, because power consumption has a direct bearing on the lifetime of the batteries. For example, if the power consumption can be reduced by half, the same number of batteries can be used for twice the number of hours. Digit-serial multipliers are ideal for moderate-speed DSP computations and find many applications in heterogeneous high-level synthesis environments. Recently, it was found that digit-serial multipliers could be pipelined at the bit level, thereby resulting in high processing speeds. However, those designs were obtained in an ad hoc manner. In this paper, a systematic design methodology for low-power digit-serial multipliers is presented. Traditionally, digit-serial multipliers were obtained either by folding the corresponding bit-parallel architectures or by unfolding the bit-serial architectures. The architectures obtained in this manner cannot be pipelined at the bit level. The approach presented in this paper enables the direct design of digit-serial architectures which can be pipelined at the bit level. The advantage is twofold. First, processing speeds comparable to a bit-parallel system can be obtained with less area. Second, since the critical path is reduced, the power supply voltage can be reduced for a fixed sampling frequency. This causes the proposed bit-level pipelined digit-serial architectures to consume lower power than the traditional digit-serial architectures.
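The trade of speed for supply voltage mentioned above can be made concrete with the standard first-order model of CMOS switching power, P = C·V²·f. The paper does not state this formula, so it is an assumption here, as are the function name `dynamic_power` and every numeric value in the sketch. The Python fragment below only illustrates how power scales when the supply is lowered at a fixed sampling frequency.

```python
def dynamic_power(c_eff_farads, vdd_volts, freq_hz):
    """First-order CMOS switching power, P = C_eff * Vdd^2 * f (assumed model)."""
    return c_eff_farads * vdd_volts ** 2 * freq_hz


# Hypothetical values chosen for illustration; none are taken from the paper.
C_EFF = 50e-12     # effective switched capacitance in farads
F_SAMPLE = 20e6    # fixed sampling (clock) frequency in hertz

p_nominal = dynamic_power(C_EFF, 5.0, F_SAMPLE)  # original supply voltage
p_scaled = dynamic_power(C_EFF, 3.3, F_SAMPLE)   # reduced supply voltage

# At fixed C_eff and f the ratio depends only on the voltages: (3.3 / 5.0) ** 2
power_ratio = p_scaled / p_nominal
print(f"power at 3.3 V relative to 5.0 V: {power_ratio:.3f}")
```

Under this model the saving comes entirely from the quadratic dependence on supply voltage. A shorter critical path matters because it leaves timing slack that allows the lower voltage at the same sampling frequency.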
Review of work: Yun-Nan Chang has presented a design methodology for a new class of digit-serial multiplier architectures. These architectures can be pipelined at the bit level, and as a result power can be reduced. For a specified wsf, the clock speed required with a bit-serial design is much higher than that of a digit-serial design with digit size 4 or 8. As a result, the power consumed by a bit-serial design due to the high-speed clock is much higher, which favors digit-serial architectures with respect to low power consumption. It should also be noted that for large digit sizes, the CSA module can be implemented using the Wallace tree algorithm.

Digit-Serial Architectures using Unfolding Transformation: In this section, we motivate the need for digit-serial architectures and present the systematic design of digit-serial architectures using the unfolding transformation. Consider the (word-serial) bit-serial implementation of a simple add operation. There are two approaches we can think of. First, we can process two input samples simultaneously and process each input in a bit-serial manner; this corresponds to a word-parallel bit-serial system with a block size of two. Alternatively, we can process the inputs in a word-serial manner but process two bits of a word in parallel; this corresponds to a word-serial digit-serial implementation with a digit size of two.

Motivation: A bit-parallel system is fast, but its power consumption and area are high. To reduce power and area, a bit-serial architecture can be used instead, but it is slow: what is gained in power and area is lost in speed. There is therefore a need for an architecture that lets the designer balance speed, power, and area. This is the motivation for designing a digit-serial architecture.

Proposed work: The proposed design methodology is applied to various existing bit-serial multipliers, including the Type-I, Type-II, Type-III, and singly-redundant multipliers.

A. Type-I Multiplier: Consider the bit-serial Type-I multiplier shown in Fig. 1, where the coefficient word length is four bits. This architecture contains four full adders, four multipliers, and some delay elements. In this multiplier, the carry-out signal of every adder is fed back after a delay to the carry-in signal of the same adder. The critical path of this architecture is full-adder delays.
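To make the bit-serial versus digit-serial distinction and the carry feedback loop concrete, the following Python sketch models an adder that consumes `digit_size` bits per clock cycle, least significant bit first. It is an illustrative software model, not the circuit of Fig. 1. The function names (`full_adder`, `digit_serial_add`) and the unsigned fixed-width data format are hypothetical choices made for this example. The `carry` variable plays the role of the delayed carry-out that is fed back to the carry-in.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def to_bits(x, width):
    """Little-endian list of the low `width` bits of x."""
    return [(x >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


def digit_serial_add(x, y, word_length, digit_size):
    """Add two unsigned words, digit_size bits per clock cycle, LSB first.

    digit_size == 1 models a bit-serial adder. Returns the sum modulo
    2**word_length together with the number of clock cycles used.
    """
    assert word_length % digit_size == 0
    xb, yb = to_bits(x, word_length), to_bits(y, word_length)
    carry = 0          # carry register: the feedback loop between cycles
    out = []
    cycles = 0
    for start in range(0, word_length, digit_size):
        # Within one clock cycle the carry ripples through
        # digit_size full adders (the unfolded structure).
        for i in range(start, start + digit_size):
            s, carry = full_adder(xb[i], yb[i], carry)
            out.append(s)
        cycles += 1
    return from_bits(out), cycles


for n in (1, 2, 4):
    total, cycles = digit_serial_add(11, 6, word_length=8, digit_size=n)
    print(f"digit size {n}: sum={total}, cycles per word={cycles}")
```

In this model a word of length W takes W/N cycles for digit size N, so for a given sample rate the bit-serial case needs the fastest clock. The inner loop also shows why plain unfolding lengthens the critical path: the carry must pass through N full adders before it can be registered.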
The traditional approach for designing the digit-serial architecture involves unfolding this structure by a factor equal to the digit size. However, the resulting critical path would be full-adder delays, which can be further reduced to full-adder delays after pipelining. Reduction in the critical path below full-adder delays is not possible because of the presence of feedback loops. Therefore, in the final stage, a digit-serial adder is required to sum all these outputs. A simple digit-serial 3:2 compressor can first be used to reduce these three output digits to two digits. A digit-serial carry look-ahead adder or any other fast carry-propagate adder is then used to add these two digits to generate the final result.

Fig. 1. Bit-serial Type-I multiplier with word length of 4 bits.

Fig. 2. Digit-cell for the Type-I multiplier.
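The final-stage reduction described above, a 3:2 compressor followed by a carry-propagate adder, can be sketched in a few lines of Python. This is a word-level illustration on unsigned integers under assumed names (`compress_3_to_2`, `add_three`). It does not model digit-serial timing or the carry look-ahead structure, and Python's built-in addition stands in for the fast carry-propagate adder.

```python
def compress_3_to_2(a, b, c):
    """Bitwise 3:2 compressor (carry-save adder) on unsigned integers.

    Each bit position acts as an independent full adder: the sum bits
    stay in place and the carry bits move one position to the left.
    No carry propagates across bit positions inside the compressor.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def add_three(a, b, c):
    """Reduce three operands to two, then do one carry-propagate addition."""
    sum_word, carry_word = compress_3_to_2(a, b, c)
    # Stand-in for the carry look-ahead (or other fast) adder.
    return sum_word + carry_word


s, cy = compress_3_to_2(5, 3, 6)
print(f"sum word={s}, carry word={cy}, total={add_three(5, 3, 6)}")
```

The point of the two-step arrangement is that only the last addition has a carry chain; the compressor itself has a delay of one full adder regardless of the digit width.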

B. Type-II Multiplier: Consider the bit-serial Type-II multiplier shown in Fig. 3. The main difference between this multiplier and the Type-I multiplier is that the critical path in this architecture is just two full-adder delays. Moreover, this architecture can be pipelined at the bit level with an additional latency of only one clock cycle, unlike the Type-I multiplier, where the increase in latency would depend on the word length. If this architecture is unfolded using the traditional technique, the critical path would be full-adder delays. However, as in the case of the Type-I multiplier, reduction in the critical path below a certain level is not possible due to the presence of feedback loops. In the proposed design, each cell of the multiplier in Fig. 3 is replaced with the digit-cell shown in Fig. 4. Here, represents the digit version of. For example, represents the four bits represents and so on. The partial product generator is identical to the one shown in Fig. 3. The entire digit-serial multiplier is designed by cascading these digit-cells, similar to the Type-I multiplier. A digit-serial 3:2 compressor and a carry look-ahead adder are required at the output of the last digit-cell to convert the three digit outputs to a single digit output.

Fig. 3. Bit-serial Type-II multiplier with word length of 4 bits.

Fig. 4. Digit-cell for the Type-II multiplier.

C. Type-III Multiplier: Consider the bit-serial Type-III multiplier shown in Fig. 5. The salient feature of this architecture is that the carry-out signal is not fed back as in the Type-I multiplier. If this architecture is unfolded using the traditional technique, the critical path would be full-adder delays. However, since there is no carry feedback in this case, the unfolded architecture can also be pipelined at the bit level. It should be noted that the partial product generator uses two coefficient digits, unlike the previous architectures, where only one digit was used. It should also be noted that the carry-save portion generates four outputs at each stage. Therefore, at the output of the final digit-cell, a digit-serial 4:2 compressor and a fast carry look-ahead adder are required to convert the four digits to one digit. The resulting architecture can be pipelined at the bit level.

Fig. 5. Bit-serial Type-III multiplier with word length of 4 bits.
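The four-to-two reduction needed at the output of the Type-III multiplier can be illustrated in the same word-level style. The paper does not say how its 4:2 compressor is built. The sketch below assumes one common construction, two cascaded 3:2 stages, and the names `csa` and `compress_4_to_2` are hypothetical.

```python
def csa(a, b, c):
    """3:2 carry-save stage: returns (sum_word, carry_word)."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def compress_4_to_2(a, b, c, d):
    """Reduce four unsigned operands to two with two cascaded 3:2 stages.

    This is an assumed construction for illustration; a dedicated 4:2
    cell could be organised differently while producing the same total.
    """
    s1, c1 = csa(a, b, c)      # first stage: three operands to two
    return csa(s1, c1, d)      # second stage: absorb the fourth operand


s, cy = compress_4_to_2(7, 9, 12, 5)
# A single carry-propagate addition (here Python's +) finishes the sum.
print(f"sum word={s}, carry word={cy}, total={s + cy}")
```

As with the 3:2 case, the two output words sum to the same value as the four inputs, so only one carry-propagate addition is needed after the compressor.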

Conclusion: This paper has presented a design methodology for a new class of digit-serial multiplier architectures. These architectures can be pipelined at the bit level, and as a result power can be reduced. It should also be noted that for large digit sizes, the CSA module can be implemented using the Wallace tree algorithm. Experiments using the HEAT tool showed that about 35% lower power is obtained for the nonpipelined architecture using the Wallace tree approach when compared to the CSA-based architecture for a digit size of 8 and a word length of 16 bits. For a specified wsf, the clock speed required with a bit-serial design is much higher than that of a digit-serial design with digit size 4 or 8. As a result, the power consumed by a bit-serial design due to the high-speed clock is much higher, which favors digit-serial architectures with respect to low power consumption. Note that the power consumed by the clock is not accounted for by the HEAT tool. In this paper, the critical path and power consumption of different digit-serial multipliers, and their variation with digit size, have been compared. However, the comparison between digit-serial and bit-parallel multipliers has not been addressed.

References:

P. B. Denyer and D. Renshaw, VLSI Signal Processing: A Bit-Serial Approach. Reading, MA: Addison-Wesley, 1986.

R. I. Hartley and J. R. Jasica, "Behavioral to structural translation in a bit-serial silicon compiler," IEEE Trans. Computer-Aided Design, vol. 7, pp. 877–886, Aug. 1988.

R. F. Lyon, "Two's complement pipelined multipliers," IEEE Trans. Commun., vol. COM-24, pp. 418–425, Apr. 1976.

S. G. Smith and P. B. Denyer, Serial Data Computation. Boston, MA: Kluwer, 1988.

L. B. Jackson, J. F. Kaiser, and H. S. McDonald, "An approach to implementation of digital filters," IEEE Trans. Audio Electroacoust., vol. 16, pp. 413–421, Sept. 1968.

R. Jain et al., "Custom design of a VLSI PCM-FDM transmultiplexor from system specification to circuit layout using a computer aided design system," IEEE J. Solid-State Circuits, vol. SC-21, pp. 73–85, Feb. 1986.

P. R. Cappello and C. W. Wu, "Computer aided design of VLSI FIR filters," Proc. IEEE, vol. 75, pp. 1260–1271, Sept. 1987.

M. Hatamian and G. Cash, "Parallel bit-level pipelined VLSI designs for high-speed signal processing," Proc. IEEE, vol. 75, pp. 1192–1202, Sept. 1987.

T. G. Noll et al., "A pipelined 330 MHz multiplier," IEEE J. Solid-State Circuits, vol. SC-21, pp. 411–416, June 1986.

K. K. Parhi and M. Hatamian, "A high sample rate recursive filter chip," in VLSI Signal Processing III, 1988, pp. 3–14.