Design of Practical FIR Filter Using Modified Radix-4 Booth Algorithm

E. Srinivasarao, M.Tech Scholar, Department of ECE, AITAM.
V. Lokesh Raju, Associate Professor, Department of ECE, AITAM.
L. Rambabu, Assistant Professor, Department of ECE, AITAM.

Abstract: Finite impulse response (FIR) filters are extensively used in various digital signal processing applications such as digital audio, image processing, data transmission and biomedical systems. In some applications the FIR filter circuit must be able to operate at high sample rates, while in others it must be a low-power circuit operating at moderate sample rates. An FIR filter implementation consists of a large number of multiplications, which leads to excessive area and power consumption. The topology of the multiplier circuit affects the resulting area and power consumption. In this paper a new area-efficient, low-power FIR filter design is proposed using a carry look-ahead adder based modified Booth multiplier realized in direct form. The practical filter coefficients are determined after verifying different windowing techniques using MATLAB. These coefficients are used in the area-efficient design to improve the efficiency of the FIR filter. The design is implemented using Xilinx ISE 12.2 tools, programming in Verilog HDL.

Index Terms: Finite impulse response (FIR) filters, modified Booth encoding (MBE) scheme, VLSI design.

INTRODUCTION:

Digital filters are a very important part of digital signal processing (DSP). In fact, their extraordinary performance is one of the key reasons that DSP has become so popular. Multiplication is the most basic function used in digital filters. With advances in technology, various techniques have been proposed to design multipliers that offer high speed, low power consumption and small area, making them suitable for high-speed, low-power, compact VLSI implementations. However, these three parameters, i.e. power, area and speed, are always traded off against one another. Multiplication is one of the most basic functions used in digital signal processing.
It requires more hardware resources and processing time than addition and subtraction. In fact, 8.72% of all instructions in a typical processing unit are multiplications. Since multiplication dominates execution time, there is a need for high-speed multipliers. In the past, many novel ideas for multipliers have been proposed to achieve high performance. Nowadays, many finite impulse response (FIR) filter designs aimed at low area cost, high speed or reduced power consumption are being developed. We can observe that the hardware cost of these FIR filters increases with area. This observation leads us to design a low area-cost FIR filter with the advantages of reduced power consumption and moderate speed performance. To reduce the hardware cost, the hardware area should be optimized.

In DSP there are essentially two types of filter, IIR and FIR. The impulse response of an IIR filter has infinite duration, whereas that of an FIR filter has finite duration. The FIR filter requires no feedback path, so it has no recursion and is therefore non-recursive. FIR filter specifications include the maximum tolerable stop-band ripple, the pass-band ripple, and the pass-band and stop-band edge frequencies. Computing the coefficients of an FIR filter requires a considerable amount of calculation, so it is generally performed with computer-aided design tools such as the filter design and analysis tool of MATLAB. For real-time applications such as filtering, combinational multipliers are used because they are fast.

Most of the hardware complexity is due to the multipliers, since filters require a large number of multiplications, leading to excessive area, delay and power consumption even when implemented as a full-custom integrated circuit. The problem is therefore how to reduce the hardware complexity of a multiplier. The main concern is reducing the multipliers in the FIR filter, whose major drawback is the need for a higher order. A higher order imposes more hardware requirements, arithmetic operations, area usage and power consumption when designing the filter. Accordingly, minimizing these parameters is a significant objective in digital filter design. It is desirable to find efficient algorithms that require as few arithmetic operations as possible, as this reduces the area and minimizes the device size and energy consumption.

To remove the redundant computation, which leads to more efficient calculations, the techniques chosen are CSD and MB. These are used to optimize the area of a high-pass FIR filter. In the CSD structure the filter coefficients are fixed, and the multiplier area is reduced. MB is essentially a bit-serial computational operation that forms an inner product of a pair of vectors in a single direct step; the advantage of MB is its efficiency of mechanization. Multipliers consume the largest share of area in an FIR filter design.
The product of two numbers has twice the bit width of the multiplied numbers. We can truncate the product bits to the required precision to reduce the area cost. Conventional multipliers are replaced here by a modified Booth multiplier. Modified Booth is twice as fast as the Booth algorithm. It produces only half the number of partial products (PPs) compared with ordinary binary multiplication. The modified Booth encoding (MBE) scheme is identified as the most efficient Booth encoding and decoding scheme. The truncation error for a modified Booth multiplication is not more than 1 ulp (unit in the last place, or unit of least precision), so there is no need for error compensation circuits.

Previous designs used the transposed structure to realize the FIR filter. Transposed structures are good for cross-coefficient sharing, and they become faster as the filter order increases, but the area of the delay elements is larger. It is therefore better to use the direct form structure when designing a low area-cost FIR filter. In this brief, we present a new low area-cost FIR filter design in VLSI using a modified Booth encoding (MBE) scheme. Direct form is selected for the FIR filter realization.

This brief is organized as follows. The design of the FIR filter is given in Section II. The proposed design is described in Section III. The modified Booth multiplier is described in Section IV. Section V discusses the experimental results and comparisons. Finally, the conclusion is given in Section VI.

DESIGN OF FIR FILTER

Generally, an FIR filter can be expressed as

$$y[n] = \sum_{i=0}^{M} a_i \, x[n-i]$$

where M represents the filter order, y[n] is the output signal and a_i represents the set of filter coefficients. If x[n] is the applied input signal, the x[n-i] terms are referred to as taps or tapped delay lines. Symmetric or anti-symmetric coefficients can be considered for a linear-phase FIR filter.
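The direct-form sum above can be sketched in a few lines of Python. This is only an illustrative software model, not the Verilog design of the paper; the function name `fir_direct_form` and the zero-initialised delay line are assumptions made for the example.

```python
from collections import deque


def fir_direct_form(samples, coeffs):
    """Direct-form FIR: y[n] = sum over i of a_i * x[n - i]."""
    # The delay line holds x[n], x[n-1], ..., x[n-M]; it starts at zero,
    # like flip-flops after reset (an assumption of this sketch).
    delay_line = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for x in samples:
        delay_line.appendleft(x)  # newest sample enters, oldest drops out
        outputs.append(sum(a * d for a, d in zip(coeffs, delay_line)))
    return outputs


# Feeding a unit impulse returns the coefficients themselves.
print(fir_direct_form([1, 0, 0, 0], [3, 2, 1]))
```

Each output needs one multiplication per tap, which is why the multiplier dominates the cost of the filter.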

The implementation of an FIR filter requires three basic building blocks: multiplication, addition and signal delay. Designing an FIR filter consists of four stages:

i. Choose a suitable filter order.
ii. Find the coefficients for the corresponding filter order.
iii. Realize the filter using a suitable structure.
iv. Optimize the area of the realized filter to the maximum extent.

Fig 1: Direct form of FIR filter structure

The number of multiply-accumulate (MAC) operations required increases linearly with the filter order. Therefore, most designs use a minimum filter order. Actually, slightly increasing the filter order can reduce the total area. Next, the filter coefficients corresponding to the selected filter order must be found. Direct form or transposed form can be used for the realization of the FIR filter. Optimizing the area cost of the FIR filter design to the maximum extent is the last stage of the filter design.

Proposed Methodology Block Diagram
Fig 2: Practical FIR filter

Flow Chart
Fig 3: Practical FIR filter

PROPOSED DESIGN

A system's performance is determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. A modified Booth multiplier is therefore suggested, since it saves more area and is faster than other conventional multipliers. The proposed new low area-cost FIR filter using a modified Booth multiplier is shown in Fig. 2. In a direct form filter, a new data sample and the corresponding filter coefficient can be applied to the multiplier's inputs at each clock cycle. x[n] is given as the input signal. D flip-flops (D-FFs) are used as the delay elements. The modified Booth multiplier block multiplies the input signal by the set of filter coefficients corresponding to the selected filter order, and then provides the output signal y[n].
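Stage ii of the design flow, finding the coefficients, is done in MATLAB in this work. As a rough illustration of what a windowing design does, the sketch below builds a Hamming-windowed high-pass filter by spectral inversion. The function name `hamming_highpass`, the tap count and the normalised cutoff are assumptions chosen for the example; the paper does not state its cutoff, so the sketch is not expected to reproduce the paper's coefficients.

```python
import math


def hamming_highpass(num_taps, cutoff):
    """Windowed-sinc high-pass; cutoff is a fraction of the sampling rate (0 < cutoff < 0.5)."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("use an odd tap count of at least 3")
    mid = (num_taps - 1) // 2
    lowpass = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response (sinc), centred on the middle tap.
        ideal = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        # Hamming window.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        lowpass.append(ideal * window)
    gain = sum(lowpass)
    lowpass = [c / gain for c in lowpass]  # unity gain at DC
    # Spectral inversion: unit impulse at the centre tap minus the low-pass response.
    return [(1.0 if n == mid else 0.0) - c for n, c in enumerate(lowpass)]


# Seven taps (filter order 6) with an assumed cutoff of a quarter of the sampling rate.
coeffs = hamming_highpass(7, 0.25)
```

The resulting coefficients are symmetric, which gives linear phase, and they sum to zero, so the filter blocks DC as a high-pass filter should.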

Realization
Fig 4: Realization filter

FLOATING POINT FORMAT

Floating point representation works well for numbers with a large dynamic range, depending on the number of bits. The IEEE 754 standard is almost exclusively used across computing platforms and hardware designs that support floating point arithmetic. In this standard a normalized floating point number x is stored in three parts: the sign s, the excess (biased) exponent e, and the significand or mantissa m. The value of the number in terms of these parts is

$$x = (-1)^{s} \times 1.m \times 2^{\,e - \text{bias}}$$

where the bias is 1023 for double precision. The format is written with the significand having an implicit integer bit of value 1 (except for special data, see the exponent encoding). With the 52 bits of the fraction significand appearing in the memory format, the total precision is therefore 53 bits. The bits are laid out as follows: 1 sign bit, 11 exponent bits and 52 fraction bits.

From the MATLAB command window the real filter coefficients are 0.0038, -0.035, -0.2278, 0.610, 0.0037, -0.0331, -0.2291. These filter coefficients are converted to double precision floating point numbers. The converted coefficients are 0.003775018138711, -0.033541428579110, -0.227792539932163, 0.618026583319437, -0.229088808118480, 0.033110787270075, 0.003737449345339.

The Q n,m format of an N-bit number places n bits to the left and m bits to the right of the binary point. For signed numbers, the MSB is used for the sign and has negative weight. A two's complement fixed point number in Q n,m format is written as

$$b = b_{n-1} b_{n-2} b_{n-3} \ldots b_2 b_1 b_0 \,.\, b_{-1} \ldots b_{-m}$$

with the equivalent floating point value

$$-b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + \cdots + b_1 2^{1} + b_0 + b_{-1} 2^{-1} + \cdots + b_{-m} 2^{-m}$$

A floating point number is converted to Q n,m fixed point format simply by bringing m fractional bits of the number into the integer part and then dropping the rest of the bits, with or without rounding.
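A minimal sketch of this conversion is given below, mirroring MATLAB's `round` (round half away from zero) and `fix` (truncate toward zero). The helper names `to_fixed` and `to_float` are hypothetical, and the choice of m = 15 fractional bits is an assumption inferred from the size of the integer coefficients quoted in the paper, which does not state m explicitly.

```python
import math


def to_fixed(value, frac_bits, rounding=True):
    """Scale a float by 2**frac_bits and return the integer Q-format code."""
    scaled = value * (1 << frac_bits)
    if rounding:
        # Round half away from zero, as MATLAB's round does.
        magnitude = int(math.floor(abs(scaled) + 0.5))
        return magnitude if scaled >= 0 else -magnitude
    return int(scaled)  # truncate toward zero, like MATLAB's fix


def to_float(fixed, frac_bits):
    """Recover the real value by restoring the implied binary point."""
    return fixed / (1 << frac_bits)


M = 15  # assumed number of fractional bits
print(to_fixed(0.003775018138711, M))    # first coefficient
print(to_fixed(-0.227792539932163, M))   # third coefficient
```

With rounding enabled, the quantization error of each coefficient is at most half of one least significant bit, i.e. 2^-(m+1).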
This conversion translates a floating point number to an integer with an implied decimal point. The implied decimal point needs to be remembered by the designer for reference in further processing of the number in different calculations:

num_fixed = round(num_float * 2^m)
or
num_fixed = fix(num_float * 2^m)

The double precision floating point coefficients are converted into fixed point format. The resulting coefficients, as decimal integers, are 124, -1099, -7464, 20252, -7507, -1085, 123. These decimal numbers are converted to binary and then to hexadecimal.

CSD ALGORITHM

The CSD code is a ternary number system with the digit set {1̄, 0, 1}, where 1̄ stands for −1. Given a constant, the corresponding CSD representation is unique. The CSD representation of a number can be computed recursively using the string property, and it has two main properties: (1) the number of nonzero digits is minimal; (2) no two consecutive digits are both nonzero, that is, two nonzero digits are never adjacent. The first property implies a minimal Hamming weight, which leads to a reduction in the number of additions in arithmetic operations. The second property provides the uniqueness. If this second property is relaxed, the representation is called the minimal signed digit (MSD) representation, which has as many nonzero digits as the CSD representation but provides multiple representations for a constant. CSD reduces the number of partial products that must be calculated, which allows fast, low-power and low-area multiplier structures for DSP applications or self-timed circuits.

From a practical point of view, the traditional approaches that generate the CSD representation all produce the code recursively from the least significant bit (LSB) to the most significant bit (MSB). The CSD representation of an integer is a signed and unique digit representation that contains no adjacent nonzero digits. Given an n-digit unsigned binary number X = {x_0, x_1, ..., x_{n-1}} expressed as

$$X = \sum_{i=0}^{n-1} x_i 2^{i}, \quad x_i \in \{0, 1\} \qquad (1)$$

the (n+1)-digit CSD representation Y = {y_0, y_1, ..., y_n} of X is given by

$$\sum_{i=0}^{n-1} x_i 2^{i} = \sum_{i=0}^{n} y_i 2^{i}, \quad y_i \in \{\bar{1}, 0, 1\} \qquad (2)$$

The condition that all nonzero digits in a CSD number are separated by zeros implies that

$$y_i \cdot y_{i+1} = 0, \quad 0 \le i \le n-1 \qquad (3)$$

From this property, the probability that a digit of an n-digit CSD number has a nonzero value is given by

$$P(|y_i| = 1) = \frac{1}{3} + \frac{1}{9n}\left[1 - \left(-\frac{1}{2}\right)^{n}\right] \qquad (4)$$

As n becomes large, this probability tends to 1/3, whereas it is 1/2 in a binary code. Using this property, the number of additions and subtractions in multipliers is reduced to a minimum and, as a result, an overall speed-up can be achieved. Encoding 2 is preferable since it satisfies the following relation:

$$y_i = y_i^{d} - y_i^{s} \qquad (5)$$

where y_i^s represents the sign bit and y_i^d the data bit. This encoding also allows an additional valid representation of 0 when y_i^s = 1 and y_i^d = 1, which is useful in some arithmetic implementations. This encoding is used throughout the paper.
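The LSB-to-MSB recoding described above can be sketched as follows. This is the standard non-adjacent-form recoding, shown as an illustration rather than as the exact procedure used by the authors; the names `to_csd` and `csd_multiply` are hypothetical.

```python
def to_csd(value):
    """Return the CSD digits of a non-negative integer, LSB first, each in {-1, 0, 1}."""
    if value < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits = []
    while value:
        if value & 1:
            # value mod 4 == 1 -> digit +1, value mod 4 == 3 -> digit -1.
            digit = 2 - (value & 3)
            value -= digit  # leaves a multiple of 4, so the next digit is 0
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by a CSD-coded constant using only shifts, adds and subtracts."""
    return sum(d * (x << i) for i, d in enumerate(digits))


print(to_csd(124))                    # 124 = 128 - 4
print(csd_multiply(5, to_csd(123)))   # 5 * 123 with three nonzero digits
```

Because every nonzero digit costs one adder or subtractor in a constant multiplier, fewer nonzero digits translate directly into less hardware.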
CSD representation of the coefficients (1̄ denotes −1):

Coefficient     Binary form            CSD form
h(0) = 124      0000 0000 0111 1100    0000 0000 1000 01̄00
h(1) = 1099     0000 0100 0100 1011    0000 0100 0101 01̄01̄
h(2) = 7464     0001 1101 0010 1000    0010 01̄01 0010 1000
h(3) = 20252    0100 1111 0001 1011    0101 0001̄ 0010 01̄01̄
h(4) = 7507     0001 1101 0101 0011    0010 01̄01 0101 0101̄
h(5) = 1085     0000 0100 0011 1101    0000 0100 0100 01̄01
h(6) = 123      0000 0000 0111 1011    0000 0000 1000 01̄01̄

The filter output is computed as

xn(0)*h(0) + xn(1)*h(1) + xn(2)*h(2) + xn(3)*h(3) + xn(4)*h(2) + xn(5)*h(1) + xn(6)*h(0)

MODIFIED BOOTH MULTIPLIER

The modified radix-4 Booth algorithm is used for fast multiplication. The salient feature of this algorithm is that only n/2 clock cycles are needed for an n-bit multiplication, compared with n clock cycles in Booth's algorithm. This type of multiplier operates faster than an array multiplier for longer operands because its computation time is proportional to the logarithm of the word length of the operands. Booth multiplication is a technique that allows smaller, faster multiplication circuits by recoding the numbers that are multiplied.

The modified Booth multiplier consists of the Booth algorithm, including a Booth encoder and a Booth decoder, a Wallace tree compressor (WTC) and a carry look-ahead adder (CLA). The architecture of the modified Booth multiplier is shown in Fig. 5. Multiplicand X and multiplier Y are the external inputs to the Booth algorithm. Usually, a multiplication includes generating the PPs, adding the generated PPs until only two rows remain, and then computing the final multiplication result by adding the last two rows.
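The n/2 step count quoted above follows from recoding the multiplier into n/2 signed radix-4 digits, each taken from an overlapping group of three bits. The sketch below models this recoding in software; the function names are hypothetical, and the arithmetic is done with Python integers rather than with the encoder and decoder logic of the actual hardware.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a two's-complement multiplier of even width into bits/2 digits in {-2, ..., 2}."""
    if bits % 2:
        raise ValueError("width must be even")
    y = multiplier & ((1 << bits) - 1)  # raw two's-complement bit pattern
    y <<= 1                             # append Y(-1) = 0 below the LSB
    digits = []
    for i in range(0, bits, 2):
        triplet = (y >> i) & 0b111      # bits Y(i+1), Y(i), Y(i-1)
        b_hi = (triplet >> 2) & 1
        b_mid = (triplet >> 1) & 1
        b_lo = triplet & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits                       # least significant digit first


def booth_multiply(multiplicand, multiplier, bits):
    """Sum one partial product per radix-4 digit, each weighted by 4**k."""
    digits = booth_radix4_digits(multiplier, bits)
    return sum(d * multiplicand * (1 << (2 * k)) for k, d in enumerate(digits))


# 8-bit operands 10011001 (-103) and 01100110 (102) in two's complement.
print(booth_radix4_digits(0b01100110, 8))
print(booth_multiply(-103, 102, 8))
```

An 8-bit multiplier yields four digits, so four partial-product rows are generated instead of eight, and each row is just 0, ±X or ±2X, which needs only a shift and an optional negation.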

The Booth decoder generates the PPs from the encoded signals and the multiplicand bits.

Fig 5: Architecture of modified Booth multiplier

The multiplicand bits are divided into overlapping combinations of two bits each after appending a zero at the LSB of the multiplicand X. X-1 represents the appended zero term. When two adjacent groups are considered, the MSB of the group on the right overlaps the LSB of the group on the left. The grouping of the multiplicand bits is shown in Fig. 3. The 8-bit multiplicand is represented as X7 X6 X5 X4 X3 X2 X1 X0. If the first three-bit combination selected is X1 X0 X-1, then the next three-bit combination will be X3 X2 X1, and so on.

The grouping of the multiplier bits is shown in Fig. 4. Multiplier Y is divided into overlapping combinations of three bits each after appending a zero at the LSB of multiplier Y. Y-1 is the appended zero bit. When two adjacent 3-bit combinations are considered, the MSB of the group on the right overlaps the LSB of the group on the left. The 8-bit multiplier is represented as Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0. If the first three-bit combination selected is Y1 Y0 Y-1, then the next three-bit combination will be Y3 Y2 Y1, and so on.

Each 3-bit combination of the multiplier bits is given to a Booth encoder as shown in Fig. 4. The Booth encoder generates the encoded signals for each 3-bit combination of the multiplier Y. The logic diagram of the Booth encoder is shown in Fig. 6. From the truth table given in Table I, the encoded signals of any 3-bit combination of the multiplier input can be found. These encoded signals, along with each 2-bit combination of the multiplicand bits, are then given to a Booth decoder.

Fig. 6: Logic diagram of Booth encoder

The number of PPs generated by the modified Booth multiplication is exactly half the number of PPs generated by binary multiplication. Each step is slightly more complex than in the simple multiplier, but is almost as fast as the basic multiplier stage that it replaces. For an 8 × 8 multiplication, the number of PPs generated in a binary multiplication is 64.
Therefore, only 32 PPs will be produced by the modified Booth multiplier. An example of modified Booth multiplication is given in Fig. 5. Let the two 8-bit numbers be 10011001 and 01100110. Each 3-bit combination of the multiplier 01100110, starting from the LSB, is multiplied with each 2-bit combination of the multiplicand 10011001. Therefore, a total of 32 PPs are generated. So the 64 PPs generated in binary multiplication are reduced to 32 PPs in modified Booth multiplication, and hence the area cost of the filter design is reduced.

The PPs generated by the Booth decoder are then given to a Wallace tree structure. A Wallace tree accelerates multiplication by compressing the partial product bits into fewer rows. The Wallace tree structure can be built using compressors, full adders and various other techniques. WTC is a technique used to increase the speed of the partial product addition. A WTC, shown in Fig. 8, consists of a set of full adders (FAs). Sometimes the FA at the LSB is replaced by a half adder (HA). The HA adds two input bits to produce one sum bit and one carry bit. Each FA adds three input bits at a time to produce one sum bit and one carry bit.
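The row compression performed by the full adders can be modelled on whole words, as in the sketch below. It is an illustration under simplifying assumptions: the rows are non-negative integers (plain shift-and-add partial products of an unsigned example), the grouping of rows is a straightforward three-at-a-time pass rather than an optimised Wallace wiring, and the names `carry_save_add` and `wallace_reduce` are hypothetical.

```python
def carry_save_add(a, b, c):
    """3:2 compression of three rows: a row of full adders with no carry propagation."""
    sum_bits = a ^ b ^ c                             # per-bit sum outputs
    carry_bits = ((a & b) | (a & c) | (b & c)) << 1  # per-bit carries, moved one place up
    return sum_bits, carry_bits


def wallace_reduce(rows):
    """Compress partial-product rows level by level until at most two rows remain."""
    rows = list(rows)
    while len(rows) > 2:
        next_rows = []
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that do not fill a group of three pass through to the next level.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    return rows


# Unsigned shift-and-add partial products of 153 * 102.
partial_products = [153 << i for i in range(8) if (102 >> i) & 1]
final_rows = wallace_reduce(partial_products)
print(final_rows, sum(final_rows))
```

Only the final addition of the two remaining rows needs a carry-propagating adder, which is the role of the final adder stage in the multiplier.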

Therefore, the PPs are added in parallel using the WTC until two sequences of outputs are generated: one is a sequence of sum bits and the other is a sequence of carry bits. A WTC saves considerable area since it produces only two outputs. Since the addition of the PPs is done in parallel, the operation of the WTC is also fast. Replacing the full adders and half adders with different compressors speeds up summation in general and multiplication in particular.

Fig 7: Logic diagram of Booth decoder
Example of modified Booth multiplication
Wallace tree compressor

Finally, these sequences of sum bits and carry bits are given to a CLA. The CLA provides another speed boost to the system, as CLAs are among the fastest adders. A CLA consists of a set of full adders. A CLA cell, shown in Fig. 8, is identical to the half adder except that it has an additional input, C_in, so that a carry from a previous addition may be passed along. Furthermore, instead of a carry out, C_out, propagate (P) and generate (G) signals are produced:

$$S_i = A_i \oplus B_i \oplus C_i \qquad (2)$$
$$P_i = A_i \oplus B_i \qquad (3)$$
$$G_i = A_i \cdot B_i \qquad (4)$$
$$C_{i+1} = G_i + P_i \cdot C_i \qquad (5)$$

Fig 8: Carry look-ahead adder

The CLA calculates the carry signals in advance, based on the input signals. The carry generate and propagate signals depend only on the input bits. The carry bits can therefore be computed in parallel with the sum bits, which increases the speed of the adder compared with a ripple-style adder. The CLA is used to avoid the rippling carry present in a ripple carry adder (RCA), because the rippling carry produces an unnecessary delay in the circuit. The CLA uses the concepts of generating and propagating the carry, and it produces the final output, which is the output of the FIR filter.

The modified Booth algorithm is twice as fast as Booth's algorithm and is extensively used for high-speed multiplier circuits. The drawback of the MBE scheme is that, as the number of stages increases, the area and power consumption also increase.

RESULTS AND COMPARISONS

We implemented three FIR filters for comparison with the previous design approaches.
One FIR filter is designed using an older version of the truncated multiplier, one using faithfully rounded truncated multiple constant multiplication/accumulation (MCMAT), and one using the modified Booth multiplier. ModelSim is the software used for simulation, and Xilinx 12.2 software is used as the synthesis tool. After logic synthesis, all the designed systems are implemented on the Xilinx Spartan II FPGA. The simulation results obtained for the three FIR filters are shown in the following section.

A detailed comparative study is carried out to analyze how much better the designed low area-cost FIR filter using the modified Booth multiplier is than the conventional existing FIR filter designs. The comparison is made in terms of area, delay, power consumption and memory usage. A comparison of the design summaries obtained from the Xilinx software for the three FIR filters is shown in Table II. The area consumption of the FIR filters is taken from the area report, which is available as part of the synthesis report when implementing on the Spartan II FPGA. The number of slices used, out of the 1728 slices available in the Spartan II FPGA, is taken for the comparison.

The power comparison is also made with the help of the power report provided by the Xilinx 12.2 software. The power consumption is given in milliwatts (mW). The speed comparison is made using the timing report obtained in the synthesis report, which contains a detailed report on the input-to-output gate delay. When all three designs are compared, our proposed FIR filter using the modified Booth multiplier has a lower area cost, i.e. it is more area efficient, than the other FIR filters.

Simulation result of FIR filter
Fig 9: RTL schematic of FIR filter
Fig 10: Simulated output of FIR filter
Table 1: Area utilization summary
Fig 11: Power estimator of FIR filter

Simulation result of FIR filter using CSD
Fig 12: RTL schematic of FIR with CSD
Fig 13: Simulated output of FIR with CSD
Table 2: Area utilization summary
Fig 14: Power estimator of FIR filter with CSD algorithm

Simulation result of FIR filter using modified Booth multiplier
Fig 15: RTL schematic of FIR filter with MBM
Fig 16: Output of FIR filter with MBM
Table 3: Synthesis report of FIR filter with MBM
Fig 17: Power estimator of FIR filter with MBM

Comparison of Results of the Above Three Techniques

Table 4: Comparison of results in terms of power, area and delay

Parameter        FIR       FIR with CSD   FIR with MBM
Area (slices)    128       1918           1757
Area (LUTs)      188       3659           3381
Delay (ns)       18.14     41.68          7.58
Power (mW)       305       286            244
Power × delay    5532.7    11920.48       1849.52

Comparison of Design Summaries

Our new FIR filter is also more efficient in terms of power consumption. Even though the delay of our proposed design is lower than that of the previous designs, it is still a moderately large value. However, we focus on a low area-cost FIR filter design with moderate speed performance for mobile applications, where area and power are the important design considerations. The memory usage of both previous FIR filters remains the same, but the memory usage of our new area-efficient FIR filter is increased.

CONCLUSION

FIR filters are extensively used in digital signal processing and can be implemented using programmable digital processors. Digital signal processing has become increasingly popular over the years with the advancement of VLSI technology, and the high-speed realization of FIR filters with low power consumption has become much more demanding. In this project, a practical high-pass FIR filter is designed using the Hamming window, and its frequency response and coefficients are obtained using MATLAB. After obtaining the response, the FIR filter is realized and implemented in the VLSI domain. The direct form architecture consists of adders, multipliers and delay elements. In VLSI, normal multiplication of two numbers consumes more power, so the input is not multiplied directly with the coefficients.

Instead, the CSD, MBM and DA algorithms are used for the multiplication process, and lower power consumption is obtained. The CSD, MBM and DA algorithms are applied separately to the multiplication process, and the techniques are compared in terms of power. From this comparison it is concluded that the DA-based algorithm is the best technique for reducing power consumption, because LUTs are used in the DA algorithm. The simulation and synthesis results are analyzed using Xilinx ISE 12.2.

Author Profile:

Etcherla Srinivasarao is presently pursuing his M.Tech in VLSI system design in the Electronics and Communication Engineering Department, AITAM, Tekkali. His areas of interest are low-power VLSI system design and digital filter optimization. He has attended one national-level workshop. He holds membership in GSM IEEE. The author may be reached at srnu1050@gmail.com

V. Lokesh Raju is presently working as Associate Professor in the Electronics and Communication Engineering Department, AITAM, Tekkali. He has 12 years of experience in teaching and research. He is now also doing research on reconfigurable antennas. He has published more than 10 research papers in national and international journals and conferences.

L. Rambabu is presently working as Assistant Professor in the Electronics and Communication Engineering Department, AITAM, Tekkali. He has 9 years of experience in teaching. He has published 5 research papers in national and international journals and conferences.