DA based Efficient Parallel Digital FIR Filter Implementation for DDC and ERT Applications


E. Chitra 1, T. Vigneswaran 2
1 Asst. Prof., Dept. of Electronics and Communication Engineering, SRM University, Chennai, INDIA
2 Professor, Dept. of Electronics and Communication Engineering, VIT University, Chennai, INDIA

Abstract
This paper discusses the FPGA implementation of finite impulse response (FIR) filters, using as examples their application in Digital Down-Converters (DDCs) for software radio and in Electrical Resistance Tomography (ERT). The implementation is based on distributed arithmetic (DA), which substitutes multiply-and-accumulate operations with a series of look-up-table (LUT) accesses. Distributed arithmetic provides a multiplication-free method for calculating inner products of fixed-point data, based on table lookups of pre-calculated partial products. Implementation results are provided to demonstrate the high speed and low power of the proposed architecture. The proposed DDC is implemented in VHDL and verified via simulation. The proposed method offers average reductions of 3% in the number of LUTs, 42% in occupied slices and 38% in the number of gates needed for the low pass FIR filter implementation. The proposed DA based FIR filter can also be used in an electrical resistance tomography (ERT) system, where it is the time delay of the filter that affects the real-time performance of the conventional system. The proposed design shows a 14% reduction in delay compared to the conventional logic based DA architecture. Though there is a power trade-off, there is a significant improvement in the area and delay parameters.

Keywords: Digital down converters, Distributed arithmetic, LUT, Software radio, Finite impulse response and Electrical resistance tomography system.

I. Introduction
Finite impulse response (FIR) digital filters are common components in many digital signal processing (DSP) systems and are used to perform signal preconditioning, anti-aliasing, band selection, decimation/interpolation, low-pass filtering, and video convolution functions [1-3]. In FIR filter applications, arithmetic elements for operations such as addition, multiplication and delay (storage) are commonly required. Digital signal processing algorithms rely heavily on the efficient computation of inner products, and very efficient methods have been developed for implementing digital filters in FPGAs or custom ICs.

Digital filtering is the main task in IF processing. The computational complexity of the FIR filters used in the IF processing block is dominated by the number of adders (subtractors) employed in the multipliers. SDR technology is predicted to replace many of the traditional methods of implementing transmitters and receivers while offering a wide range of advantages including adaptability, reconfigurability, and multifunctionality encompassing modes of operation, radio frequency bands, air interfaces, and waveforms [4]. Research in this field is mainly directed towards improving the architecture and the computational efficiency of SDR systems. The most computationally intensive part of an SDR receiver is the channelizer, since it operates at the highest sampling rate [5]. The key functional units in a digital filter are the delay, the adder, and the multiplier, of which the multiplier dominates the hardware complexity; the complexity of the multiplier is in turn dominated by the number of adders (subtractors) employed in the coefficient multipliers.

The contributions of this paper can be summarized as follows: an efficient scheme using a DA based implementation of FIR filters in DDC and ERT is proposed. By employing this technique, it is shown that the delay, area and power consumption of the filters can be minimized.
This paper is organized as follows: Section II gives a brief background on DA and parallel FIR filters. In Section III, the DDC example system and the FIR filters for ERT are explained. The DA implementation of FIR filters is discussed in Section IV. In Section V, the multiplexer based DA scheme is presented. The results are illustrated in Section VI. Section VII provides our conclusions.

ISSN: 0975-4024, Vol 7 No 2, Apr-May 2015, p. 727

II. Background study

A. Distributed arithmetic
Distributed arithmetic is a multiplication-free method applicable to fixed-point data, and is based on table lookups of pre-calculated partial products [6]. Distributed Arithmetic (DA) [7] is often preferred since it eliminates the need for hardware multipliers and is capable of implementing large filters with very high throughput. DA filters also achieve these advantages while retaining full precision, unlike filters using reduced sums and differences of powers of two. Fig. 1 illustrates the basic concept of DA. DA provides multiplier-free multiplication by using bit-serial computation and storing all possible combination sums of the filter weights in a LUT. Distributed arithmetic is a possible candidate for low power applications because it allows costly multiplies to be replaced with shifts and table lookups [6].

The battery lifetime of portable electronics has become a major design concern as more functionality is incorporated into these devices. The shrinking power budget of modern portable devices therefore requires the use of low-power circuits for signal processing applications. The signal processing functions employed in these devices include finite-impulse response (FIR) filters, discrete cosine transforms (DCTs), and discrete Fourier transforms (DFTs). The common feature of these functions is that they are all based on the inner product. Digital signal processing (DSP) implementations typically use multiply-and-accumulate (MAC) units to calculate these operations, and the computation time increases linearly as the length of the input vector grows.

Fig. 1 Basic concept of distributed arithmetic

B. Parallel FIR filters
An FIR filter can be mathematically expressed by equation (1) [8]:

$$y[n] = \sum_{i=0}^{N-1} h[i]\, x[n-i] \qquad (1)$$

where x represents the input signal, h the filter coefficients, y the output signal, y[n] is the current output sample, and N is the number of taps of the filter.
This is a convolution of the filter coefficients with the input signal. In the sequential implementation, a set of multiply-and-accumulate (MAC) operations is performed for each sample of the input signal: the delayed input samples are multiplied by the coefficients and the results are summed to generate the output signal. Parallel implementations have two main architectures. The first consists of unrolling the MAC loop, so that several delayed versions of the input signal enter a fully parallel multiplier block, followed by a summation block. The other consists of a multiplier block which takes the same input signal and delivers each output to an input of a delayed summation block. Fig. 2 shows the basic block diagram of parallel FIR filtering.

Fig. 2 Block diagram of parallel FIR filtering
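The three structures described above can be compared in software. The following is only a behavioural sketch in Python, not the VHDL used in the paper; the function names and the example coefficients are hypothetical and chosen purely for illustration. It models the sequential MAC loop of equation (1), the unrolled form (delayed inputs feeding a parallel multiplier block and a summation block), and the second parallel form (one input feeding all multipliers, with a delayed summation chain).

```python
def fir_sequential(h, x):
    """Equation (1): y[n] = sum_i h[i] * x[n - i], taking x[k] = 0 for k < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i in range(len(h)):          # one multiply-and-accumulate per tap
            if n - i >= 0:
                acc += h[i] * x[n - i]
        y.append(acc)
    return y


def fir_unrolled(h, x):
    """Unrolled MAC loop: a delay line feeds all multipliers, then one summation."""
    delay_line = [0] * len(h)
    y = []
    for sample in x:
        delay_line = [sample] + delay_line[:-1]              # shift in the new sample
        products = [c * d for c, d in zip(h, delay_line)]    # parallel multiplier block
        y.append(sum(products))                              # summation block
    return y


def fir_transposed(h, x):
    """Same input to every multiplier; products enter a delayed summation chain."""
    state = [0] * (len(h) - 1)           # registers between the adders
    y = []
    for sample in x:
        products = [c * sample for c in h]
        y.append(products[0] + (state[0] if state else 0))
        new_state = []
        for k in range(len(h) - 1):
            carried = state[k + 1] if k + 1 < len(state) else 0
            new_state.append(products[k + 1] + carried)
        state = new_state
    return y


# Hypothetical integer coefficients and input, for illustration only.
h = [1, 2, 3, 4]
x = [1, 0, 0, 0, 2, -1]
print(fir_sequential(h, x))
print(fir_unrolled(h, x))
print(fir_transposed(h, x))
```

All three functions produce the same output sequence; they differ only in how the multiplications and additions are arranged, which is what determines the hardware cost and the critical path in an FPGA realization.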

III. Applications

A. Digital down converter
Software radio receivers [9] require mixing, filtering and downsampling of received signals so that data can be processed at a suitable rate. Part of this process can be achieved in FPGAs using a Digital Down-Converter (DDC). As well as mixing the incoming real signal from the ADC to extract the complex signal, a DDC must filter the complex signal to reject the image components introduced by the mixing process, and then downsample. For maximum software radio flexibility, the ADC, mixer and filters should sample as quickly as possible. Hence, if the DDC is implemented on an FPGA, fully parallel techniques can be used to reach the required sampling rates. The low pass filter coefficients for the DDC specifications used in this paper are calculated using MATLAB, with a sampling frequency of 2 MHz, a cutoff frequency of 4 MHz and an attenuation of 6 dB, using a Kaiser window. The phase and magnitude responses of the 4-tap and 8-tap filters are shown in Fig. 3.

Fig. 3 FIR filter responses for DDC: (a) 4-tap low pass FIR filter magnitude response, (b) 4-tap low pass FIR filter phase response, (c) 8-tap low pass FIR filter magnitude response, (d) 8-tap low pass FIR filter phase response
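The DDC chain described in this subsection (mix, low-pass filter, downsample) can be sketched behaviourally as follows. This is a floating-point Python illustration under stated assumptions, not the paper's VHDL design or its MATLAB script: the helper names, the sampling rate, the cutoff, the Kaiser parameter beta and the decimation factor are all hypothetical. The coefficients come from a generic Kaiser-windowed sinc design.

```python
import cmath
import math


def bessel_i0(x, terms=25):
    """Modified Bessel function I0(x) from its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_lowpass(num_taps, cutoff_hz, fs_hz, beta):
    """Kaiser-windowed sinc low-pass coefficients, scaled to unity DC gain."""
    fc = cutoff_hz / fs_hz                       # cutoff in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        r = t / mid if mid > 0 else 0.0
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    gain = sum(taps)
    return [c / gain for c in taps]


def ddc(samples, fs_hz, f_lo_hz, taps, decimation):
    """Mix a real input to baseband, FIR-filter the complex signal, downsample."""
    mixed = [s * cmath.exp(-2j * math.pi * f_lo_hz * n / fs_hz)
             for n, s in enumerate(samples)]
    filtered = []
    for n in range(len(mixed)):
        acc = 0j
        for i, c in enumerate(taps):             # direct-form FIR on complex data
            if n - i >= 0:
                acc += c * mixed[n - i]
        filtered.append(acc)
    return filtered[::decimation]                # keep every decimation-th sample


# Hypothetical parameters: a tone at the mixer frequency should land at DC.
fs = 1000.0
f_lo = 250.0
taps = kaiser_lowpass(8, 100.0, fs, 5.0)
signal = [math.cos(2.0 * math.pi * f_lo * n / fs) for n in range(64)]
baseband = ddc(signal, fs, f_lo, taps, 4)
print(abs(baseband[-1]))
```

With these illustrative values the mixer produces a DC term of amplitude 0.5 plus an image at half the sampling rate. The 8-tap symmetric filter rejects the image, so the magnitude of the settled output is approximately 0.5.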

B. FIR filters for Electrical resistance tomography system
ERT achieves visual detection through a boundary sensor array, which is used to obtain the real-time distribution of the sensing field. Because a sinusoidal signal is used as the injected current, demodulation and low-pass filtering are needed in the data acquisition system, and these were traditionally implemented with analog devices. This not only complicates the structure but also weakens the real-time performance [10]. The time delay of the analog filter and demodulation is the main problem affecting the data acquisition speed. With the development of integrated circuits, digital technology has become the main method for signal processing. Nowadays the digital FIR filter is widely used in electronic instruments, because it can solve the problem caused by the time delay while giving a good dynamic response. Fig. 4 describes the magnitude and phase response of the low pass FIR filter used in ERT. Hence, in this system, the FIR filter and the demodulation can also be implemented digitally in an FPGA. For this, the simulation is done using a Spartan 3 FPGA device.

Fig. 4 FIR filter response for ERT system: (a) magnitude response of low pass FIR filter, (b) phase response of low pass FIR filter

IV. Distributed arithmetic based filtering scheme
Distributed Arithmetic was first brought up by Croisier [11], was extended to cover signed data systems by Liu, and was then introduced into FPGA design to save MAC blocks as FPGA technology developed. Fig. 5 illustrates the concept of distributed arithmetic. If h[n] is the filter coefficient and x[n] is the input sequence to be processed, the N-length FIR filter can be described as:

$$y = \langle h, x \rangle = \sum_{n=0}^{N-1} h[n]\, x[n] \qquad (2)$$

Distributed Arithmetic is introduced into the design of FIR filters as follows. In the two's complement system, x[n] can be described as:

$$x[n] = -2^{B} x_B[n] + \sum_{b=0}^{B-1} 2^{b} x_b[n] \qquad (3)$$

where $x_b[n]$ denotes bit b of x[n] and $x_B[n]$ is the sign bit. Substituting eq. (3) into eq. (2) yields:

$$y = -2^{B} \sum_{n=0}^{N-1} h[n]\, x_B[n] + \sum_{n=0}^{N-1} h[n] \sum_{b=0}^{B-1} 2^{b} x_b[n] \qquad (4)$$

The double sum in (4) can be changed into another form by exchanging the order of summation:

$$\sum_{n=0}^{N-1} h[n] \sum_{b=0}^{B-1} 2^{b} x_b[n] = \sum_{b=0}^{B-1} 2^{b} \sum_{n=0}^{N-1} h[n]\, x_b[n] \qquad (5)$$

Substituting (5) into (4) yields the final form of Distributed Arithmetic:

$$y = -2^{B} \sum_{n=0}^{N-1} h[n]\, x_B[n] + \sum_{b=0}^{B-1} 2^{b} \sum_{n=0}^{N-1} h[n]\, x_b[n] \qquad (6)$$

The values of $\sum_{n=0}^{N-1} h[n]\, x_b[n]$ can be precomputed and stored in a LUT unit, and the relevant value is then called out according to the input data, which saves MAC blocks. The weighted sum of $\sum_{n=0}^{N-1} h[n]\, x_b[n]$ is then calculated through shift registers, and the result is $\sum_{b=0}^{B-1} 2^{b} \sum_{n=0}^{N-1} h[n]\, x_b[n]$. In a signed system the sign bit must be taken into consideration, so $-2^{B} \sum_{n=0}^{N-1} h[n]\, x_B[n]$ is also added. As a result, the final form of Distributed Arithmetic is defined as (6), and the implementation can be achieved on an FPGA through LUT units.

V. Proposed DA based filtering scheme using multiplexer
Fig. 5 shows the proposed multiplexer based DA filtering scheme. The basic LUT-DA scheme on an FPGA consists of three main components: the input registers, the 4-input LUT unit and the shifter/accumulator unit. Additionally, it requires a control unit to manage the filter operation, and an adder tree unit to add the partial filter results. Applying this approach to (4), the 4-input LUT unit is not accessed directly; instead a 2-input LUT is used, chosen by a multiplexer select. The selected 2-input LUT represents all the possible sum combinations of the filter coefficients. Though there is a power trade-off, this implies about a 5% reduction in the number of LUTs used, with increased speed. To evaluate the performance of the proposed scheme, 4-tap and 8-tap low pass FIR filters for DDC are implemented in VHDL and synthesis is carried out in XILINX ISE 8.1i.

Fig. 5 Multiplexer based DA filtering scheme

VI. Results and discussion
The simulation has been done using ModelSim 6.4, and the XILINX Integrated Software Environment (ISE) is used for synthesis and implementation of the designs on a Spartan-3 device. The power analysis has been done using the XILINX XPower tool. The filter coefficients for the DDC low pass filter application are calculated using MATLAB.
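Returning to the distributed-arithmetic form (6) of Section IV, the bit-serial LUT evaluation can be checked with a small Python model. This is a sketch under stated assumptions rather than the paper's architecture. The function names, integer coefficients and 4-bit word length are hypothetical. The split variant is the common LUT-partitioning technique, in which small tables are combined by an adder; it is not claimed to reproduce the multiplexer arrangement of Fig. 5.

```python
def build_lut(h):
    """LUT entry `addr` holds the sum of h[n] over every set bit n of addr."""
    n_taps = len(h)
    return [sum(h[n] for n in range(n_taps) if (addr >> n) & 1)
            for addr in range(1 << n_taps)]


def da_inner_product(h, x, word_bits):
    """Evaluate y = sum_n h[n] * x[n] by equation (6).

    x holds two's-complement integers of `word_bits` bits, so the sign bit
    index is B = word_bits - 1 and carries weight -2**B.
    """
    lut = build_lut(h)
    sign_bit = word_bits - 1
    mask = (1 << word_bits) - 1
    words = [v & mask for v in x]                # two's-complement bit patterns
    acc = 0
    for b in range(word_bits):
        addr = 0
        for n, w in enumerate(words):            # bit b of every input forms the address
            addr |= ((w >> b) & 1) << n
        partial = lut[addr] << b                 # weight the table output by 2**b
        acc += -partial if b == sign_bit else partial
    return acc


def da_inner_product_split(h, x, word_bits, group=2):
    """Same result using one small LUT per group of taps, summed by an adder."""
    total = 0
    for start in range(0, len(h), group):
        total += da_inner_product(h[start:start + group],
                                  x[start:start + group], word_bits)
    return total


# Hypothetical 4-tap example with 4-bit signed inputs (range -8..7).
h = [3, -1, 4, 2]
x = [5, -7, 2, -8]
direct = sum(c * v for c, v in zip(h, x))
print(direct, da_inner_product(h, x, 4), da_inner_product_split(h, x, 4))
```

For four taps, the single table has 16 entries, whereas two 2-input tables hold 4 entries each. The price is one extra addition per bit plane, which is the usual area versus logic trade-off of LUT partitioning.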
The device utilization of the proposed DA architecture can be readily understood from the results in Table I.

1) Table I shows the XILINX device utilization for the 4-tap, 8-tap, 16-tap and 32-tap FIR implementations. It is observed that the proposed gate based architecture yields 3% fewer LUTs, 45% lower slice utilization and 4% fewer gates.

2) Fig. 6 represents the delay comparison for the 4-tap, 8-tap, 16-tap and 32-tap filters designed using the conventional DA and the proposed gate based DA method. The proposed method outperforms the conventional one with a 15% speed improvement.

Compared with the traditional algorithm, the distributed algorithm can greatly reduce the size of the hardware circuit, and it makes it easy to apply pipelining and so improve the operating speed of the circuit. The key factor affecting the data acquisition rate of the conventional ERT system is the time delay of the filter, which is reduced using the proposed logic, as shown in Figure 6. Compared with the analog filter, the time delay is also greatly reduced by using the digital filter. For the ERT system, the injected current has a frequency of 5 kHz and the sample frequency is 9 kHz. Hence, the cut-off frequency of the low-pass FIR filter would be 1 kHz, which entirely meets the needs of the data acquisition system. The filter should also have a good frequency response, good cut-off capability and improved performance. The FPGA implementation of the proposed DA based FIR filter uses a Spartan 3 device, and the power consumption results are shown in Figure 7. The proposed method can readily be applied to higher order filters.

Table I Device utilization results for FIR filter (XILINX FPGA XC3S200-4FT256)

Filter | Row | Number of LUTs | Occupied slices | Gates
4-tap low pass FIR filter | implementation | 267 | 19 | 213
4-tap low pass FIR filter | implementation | 225 | 169 | 1817
8-tap low pass FIR filter | implementation | 358 | 248 | 2853
8-tap low pass FIR filter | implementation | 319 | 21 | 2552
16-tap low pass FIR filter | implementation | 443 | 335 | 3541
16-tap low pass FIR filter | implementation | 43 | 33 | 3312
32-tap low pass FIR filter | implementation | 535 | 41 | 4161
32-tap low pass FIR filter | implementation | 489 | 378 | 3997

Fig. 6 Delay results for low pass FIR filter (XILINX FPGA XC3S200-4FT256)

Fig. 7 Power results for low pass FIR filter (XILINX FPGA XC3S200-4FT256)

VII. Conclusion
In this paper, we presented an efficient DA based scheme for implementing FIR filters in DDC and ERT systems. The device utilization of the proposed architecture is relatively low since it uses a split technique with multiplexer select logic. Our method is implemented for up to 32 taps and can be extended further. A high speed and low area implementation is achieved. The test results indicate that the filter designed using the proposed distributed arithmetic works stably at high speed and can save almost 4 percent of the hardware resources. The delay improvement is very useful for ERT systems. Moreover, the filter is easy to port to other applications by modifying the order parameter and other parameters, and it therefore has great practical value in digital signal processing.

References
[1] S. N. Merchant and B. V. Rao, Distributed arithmetic architecture for image coding, Proc. IEEE Int. Conf. TENCON 89, 1989.
[2] H. Q. Cao and W. Li, VLSI implementation of vector quantization using distributed arithmetic, Proc. IEEE Int. Symp. Circuits Syst., 1996.
[3] S. A. White, Applications of Distributed Arithmetic to Digital Signal Processing: A Tutorial Review, IEEE ASSP Magazine, pp. 4-19, 1989.
[4] W. H. W. Tuttlebee, Software Defined Radio: Enabling Technologies, New York, Wiley, 2002.
[5] J. Mitola, Software Radio Architecture, New York, Wiley, 2000.
[6] New, A distributed arithmetic approach to designing scalable DSP chips, EDN, pp. 107-114, 1995.
[7] W. P. Burleson and L. L. Scharf, A VLSI Design Method for Distributed Arithmetic, VLSI Sig. Proc., Vol. 2, pp. 235-252, 1991.
[8] Cheng and K. K. Parhi, Further complexity reduction of parallel FIR filters, Proc. IEEE Int. Symp. Circuits Syst., Kobe, Japan, pp. 1835-1838, 2005.
[9] K. S. Yeung and S. C. Chan, The design and multiplier-less realization of software radio receivers with reduced system delay, IEEE Trans. Circuits Syst. I, vol. 51, no. 12, pp. 2444-2459, 2004.
[10] Dickin and M. Wang, Electrical Resistance Tomography for Process Applications, Measurement Science and Technology, vol. 7, pp. 247-260, January 1996.
[11] Uwe Meyer-Baese, Digital signal processing with FPGA, Beijing: Tsinghua University Press, 5, 51, 2006.