IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 51, NO. 1, JANUARY 2004, p. 21

Power-Efficient FIR Filter Architecture Design for Wireless Embedded System

Shyh-Feng Lin, Student Member, IEEE, Sheng-Chieh Huang, Feng-Sung Yang, Chung-Wei Ku, and Liang-Gee Chen, Fellow, IEEE

Abstract: This paper presents a novel approach for implementing power-efficient finite-impulse response (FIR) filters that consume less power than traditional FIR filter implementations in wireless embedded systems. The proposed schemes can be adopted in the direct-form FIR filter and achieve a large reduction in power consumption. By combining the proposed methods, namely balanced-modular techniques with retiming and a separated processing data-flow scheme with a modified canonical signed digit (CSD) representation, experimental results show that the proposed scheme reduces the power consumption of the original direct-form structure by 76% with slight area overhead.

Index Terms: Canonical signed digit (CSD), direct form, embedded, finite-impulse response (FIR), power-efficient, retiming, wireless.

Manuscript received February 20, 2002; revised July 11, 2003. This paper was recommended by K. Parhi. S.-F. Lin, S.-C. Huang, and L.-G. Chen are with the Department of Electrical Engineering, Graduate Institute of Electronics Engineering, R344, National Taiwan University, Taipei 106, Taiwan, R.O.C. (e-mail: lgchen@video.ee.ntu.edu.tw). F.-S. Yang is with the IC Design Group, 8TECH, Inc., Taipei 106, Taiwan, R.O.C. C.-W. Ku is with the DSP Group, VIVOTEK Inc., Taipei 106, Taiwan, R.O.C. Digital Object Identifier 10.1109/TCSII.2003.821513

I. INTRODUCTION

In several wireless hand-held systems, finite-impulse response (FIR) filters are indispensable parts of various image/video communication applications, where they reduce noise and enhance specific features. Given a specification, a dedicated filter is designed to fit the application with the least redundancy. However, previous dedicated filter architectures still have some drawbacks. The overhead of subexpression sharing [1], [2] is complicated routing that resembles a chaotic adder tree. To keep the timing correct, substructure sharing makes the number of registers grow rapidly. Therefore, this approach is difficult to implement in hardware. In addition, the folded architecture [3], [4] cannot exploit the advantage of fixed coefficients. Hence, the folded architecture loses its benefit in chip area and power consumption. The direct form and the transposed form [5], [6] usually represent the filter coefficients in canonical signed digit (CSD) form to decrease the number of nonzero digits of the constant multipliers. At the same time, FIRGEN [5] and Laskowski [6] contributed to the elimination of the MSB sign-extension redundancy. However, the disadvantage is that the structural symmetry of the linear-phase frequency response cannot be applied to transposed-form filter designs.

In this paper, we provide a solution to the problems described above by designing an FIR filter based on an architecture with modular design. The routing scheme is not very complicated, and it still keeps the symmetric and multiplierless benefits. In addition, adding the proposed separated sign processing with the modified CSD representation gives excellent results both in balancing the critical-path delay and in suppressing circuit transitions.

Fig. 1. Retimed direct form architecture.

Fig. 2. Symmetrical retimed linear-phase direct form architecture with 12 taps.

II. PROPOSED ARCHITECTURE

In this paper, the direct form of a dedicated FIR filter with CSD coefficient representation is considered. There are four steps to reduce the power consumption.

A. Symmetrical Retimed Direct Form Architecture

Retiming can shorten the critical path as pipelining does, but without increasing the latency of the circuit.
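As a side illustration of the CSD coefficient representation assumed throughout this section, the following minimal sketch converts an integer coefficient to CSD digits and applies it as a multiplierless shift-and-add. The function names `to_csd` and `csd_multiply` and the example coefficient 119 are hypothetical and are not taken from the paper; the sketch only shows why CSD reduces the number of nonzero digits.

```python
def to_csd(value):
    """Return the CSD digits of an integer, least significant digit first.

    Each digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while value != 0:
        if value & 1:
            # value mod 4 == 1 -> digit +1, value mod 4 == 3 -> digit -1,
            # so that the remainder becomes divisible by 4.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by a CSD-coded constant using only shifts and adds/subtracts."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit:
            acc += digit * (x << shift)
    return acc


coeff = 119                                   # hypothetical coefficient, binary 1110111
csd = to_csd(coeff)                           # digits for 128 - 8 - 1
binary_nonzero = bin(coeff).count("1")        # nonzero digits in plain binary
csd_nonzero = sum(1 for d in csd if d != 0)   # nonzero digits in CSD
product = csd_multiply(5, csd)                # same result as 5 * 119
```

In this example the plain binary form has six nonzero digits while the CSD form has three, so the constant multiplier needs fewer adders.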
If the phase of the filter is linear, the symmetrical architecture can be used to reduce the number of multiplier operations. Comparing Figs. 1 and 2, the number of multipliers is halved after adopting the symmetrical architecture. The symmetrical retimed direct form architecture (SRDFA) has advantages in speed and area, and it is the basic model from which the proposed architecture is developed.

B. Balanced Modular Architecture (BMA)

Modules are balanced by the number of nonzero coefficient digits rather than by the number of coefficients, because the multipliers in different stages are not identical. A carry-save adder tree of the same depth is used in every module. Since the Wallace tree uses a 3:2 compression ratio, the numbers of bits in each bit plane are 9, 6, 4, 3, and 2, as shown in Fig. 3. An example is illustrated in Fig. 4, where the formula contains nine nonzero coefficient digits. The maximum number of partial products to be summed is 9, and the corresponding depth of the carry-save adder tree is 4. The resulting filter structure is shown in Fig. 5.

Fig. 3. Example of carry-save adder tree.

Fig. 4. Summation of partial products.

C. Separated Signed Processing Architecture

The 2's complement number representation used in VLSI design causes a large amount of power consumption when the circuit switches frequently between positive and negative values. For example, in a 10-b 2's complement representation, 0 is 0 000 000 000 but -1 is 1 111 111 111. These many transitions consume a large amount of power. The separated signed processing architecture (SSPA) separates the negative digits of the coefficients from the positive digits. Two accumulating paths, one for each sign, are used, and their results are stored and finally merged. To avoid transitions between positive and negative values caused by the input data, the filter input must be biased to a positive number instead of using the sign-magnitude representation. These biases are removed at the last stage of the accumulating path. As a result, as shown in Fig. 6, this design processes the biased input signal X in two different datapaths, one for each sign, without any control. Finally, the positive part and the negative part from the datapaths and the compensation bias are summed to obtain the final result.

D. Modification to the CSD (MCSD) Representation

Separated signed processing produces unbalanced modules. Although positive and negative digits occur with the same probability, this holds only on average. A modification of the CSD representation is proposed to solve this problem. The concept is to modify the CSD representation so that the positive and negative parts are balanced while the number of nonzero digits stays the same as before. For example, if the number of positive digits is much smaller than that of negative digits, then 10(-1), where (-1) denotes the digit -1, should be changed into 011 to increase the number of positive digits while decreasing the number of negative digits. The structure obtained with the modified CSD coefficients, shown in Fig. 7, evidently has higher hardware utilization than the one in Fig. 6.

III. COMPARISONS AND DISCUSSIONS

This section presents an example, a pulse-shaping FIR filter for IS-95 WCDMA. The example uses the ideal floating-point coefficients of a 33-tap IS-95 WCDMA FIR filter for third-generation cellular phones. According to our power analysis, the SRDFA needs just 47% of the power consumption of the original direct-form architecture. Applying BMA reduces the power to 64% of the original power. By combining SSPA with the MCSD representation, the power consumption can be reduced to 78% of the original one. If the four schemes are adopted together, the power consumption decreases to 24% of that of the original direct-form architecture. The simulation results are shown in Table I and Figs. 8 and 9.
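One possible reading of the percentages quoted above, which the text does not state explicitly, is that each figure is the power remaining relative to the design before that scheme was applied. Under that assumption the three factors multiply to roughly the combined figure reported for the four schemes. The short check below uses only the numbers quoted in this section; the dictionary keys and the function name `combined_fraction` are illustrative.

```python
# Assumption (not stated in the paper): each value is the fraction of power
# remaining after applying that scheme, relative to the preceding design,
# so the fractions multiply.
remaining = {"SRDFA": 0.47, "BMA": 0.64, "SSPA+MCSD": 0.78}


def combined_fraction(factors):
    """Multiply per-scheme remaining-power fractions into one overall fraction."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


combined = combined_fraction(remaining.values())
print(f"combined remaining power: {combined:.3f}")  # close to the reported 24%
```

If the percentages were instead all measured against the original direct-form design, they would not combine this way, so the result is a consistency check rather than a derivation.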
Compared with the linear-phase direct form architecture for IS-95 WCDMA filters, the modularization clearly decreases the transition count, as shown in Table II. When the FIR filter is fed with a sequence of randomly generated data, the result is similar. For the IS-95 WCDMA pulse-shaping filter, adopting the proposed architecture reduces the number of circuit transitions to 71.4%.
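A transition count of the kind discussed here can be approximated in software as the number of bit flips between consecutive words on a bus. The sketch below assumes 10-bit 2's complement words, as in the example of Section II-C. The names `to_word` and `transition_count` are hypothetical, and this is not the measurement procedure used for Table II.

```python
WIDTH = 10                    # word length assumed from the 10-b example
MASK = (1 << WIDTH) - 1


def to_word(value):
    """Encode a signed integer as a WIDTH-bit 2's complement word."""
    return value & MASK


def transition_count(values):
    """Count bit flips between consecutive WIDTH-bit 2's complement words."""
    words = [to_word(v) for v in values]
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


zero_to_minus_one = transition_count([0, -1])   # every bit flips
zero_to_plus_one = transition_count([0, 1])     # only the LSB flips
print(zero_to_minus_one, zero_to_plus_one)
```

A step from 0 to -1 flips all ten bits while a step from 0 to 1 flips one, which is the asymmetry that motivates keeping each accumulation path single-signed.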

Fig. 5. Balanced modular FIR filter architecture.

Fig. 6. Architecture with 4-level pipeline of pulse-shaping filter for IS-95 WCDMA.

Fig. 7. The 3-level pipeline architecture of pulse-shaping filter for IS-95 WCDMA after adopting MCSD.

Fig. 8. Power comparison of the proposed four schemes.

Fig. 9. Area comparison of the proposed four schemes.

TABLE I. COMPARISON RESULTS OF THE PROPOSED FOUR SCHEMES

TABLE II. THE NUMBER OF CIRCUIT TRANSITIONS OF THREE CASES

IV. CONCLUSION

In this paper, a low-power architecture for dedicated linear-phase FIR filters is proposed. Four schemes are suggested: the retimed structure, the balanced modular architecture, the separated signed processing data flow, and the modification of the CSD representation. According to the experimental results, the proposed signal processing schemes reduce circuit transitions in the accumulation path by about 10% to 30%, achieving the maximum efficiency of the hardware components. The proposed schemes not only address the linear-phase FIR filter but can also improve the nonlinear-phase FIR filter.

REFERENCES

[1] G. Wacey and D. R. Bull, "POFGEN: A design automation system for VLSI digital filters with invariant transfer function," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 1, 1993, pp. 631-634.
[2] M. Abo-Zahhad and S. M. Ahmed, "Filter designer: A complete design and synthesis program for lumped, wave-digital, FIR and IIR filters," in Proc. 13th National Radio Science Conf., Cairo, Egypt, Mar. 9 21, 1996, pp. C24.1-C24.15.
[3] V. Verma and C. Chien, "A VHDL based functional compiler for optimum architecture generation of FIR filters," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 4, 1996, pp. 564-567.
[4] W. Wilhelm and T. G. Noll, "A new mapping technique for automated design of highly efficient multiplexed FIR digital filters," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 4, 1997, pp. 2252-2255.

[5] R. Jain, P. T. Yang, and T. Yoshino, "FIRGEN: A computer-aided design system for high performance FIR filter integrated circuits," IEEE Trans. Signal Processing, vol. 39, pp. 1655-1668, July 1991.
[6] J. Laskowski and H. Samueli, "A 150-MHz 43-tap half-band FIR digital filter in 1.2-μm CMOS generated by silicon compiler," in Proc. Custom Integrated Circuits Conf., 1992, pp. 11.4.1-11.4.4.
[7] T. Yamazaki, Y. Kondo, S. Igota, and S. Iwase, "FASTOOL - an FIR filter compiler based on the automatic design of the multi-input-adder," in Proc. IEICE Trans. Fund., vol. E78-A, Dec. 1995, pp. 1699-1705.
[8] R. I. Hartley, "Subexpression sharing in filters using canonic signed digit multipliers," IEEE Trans. Circuits Syst. II, vol. 43, pp. 677-688, Oct. 1996.
[9] R. Pasko, P. Schaumont, V. Derudder, and D. Durackova, "Optimization method for broadband modem FIR filter design using common subexpression elimination," in Proc. Int. Symp. System Synthesis, 1997, pp. 100-106.
[10] S. Sugawa, H. Shimamoto, S. Hosotani, Y. Imamura, T. Takagaki, H. Ijiri, K. Okada, and T. Sumi, "An area efficient hardware sharing filter generator for integration of multiple video format conversions," in IEEE Int. Conf. Consumer Electronics Tech. Dig. Papers, 1997, pp. 414-415.
[11] M. Potkonjak, M. B. Srivastava, and A. P. Chandrakasan, "Multiple constant multiplications: Efficient and versatile framework and algorithms for exploring common subexpression elimination," IEEE Trans. Computer-Aided Design, vol. 16, pp. 151-165, Feb. 1996.
[12] M. Potkonjak, M. B. Srivastava, and A. Chandrakasan, "Efficient substitution of multiple constant multiplications by shifts and additions using iterative pairwise matching," in Proc. 31st ACM/IEEE Design Automation Conf., 1994, pp. 189-194.
[13] S. F. Lin, S. C. Huang, F. S. Yang, C. W. Ku, and L. G. Chen, "An efficient linear-phase FIR filter architecture design for wireless embedded system," in Proc. IEEE Workshop Signal Processing System (SiPS), Antwerp, Belgium, Sept. 2001.
[14] K. Azadet and C. J. Nicol, "Low-power equalizer architectures for high-speed modems," IEEE Commun. Mag., vol. 36, pp. 118-126, Oct. 1998.