DESIGN AND IMPLEMENTATION OF FFT ARCHITECTURE FOR REAL-VALUED SIGNALS BASED ON RADIX-2^3 ALGORITHM

1 Pradnya Zode, 2 A.Y. Deshmukh and 3 Abhilesh S. Thor
1,3 Assistant Professor, Yeshwantrao Chavan College of Engineering
2 Professor, Department of Electronics Engg., G.H. Raisoni College of Engineering, Nagpur
(E-mail: 1 pradnya.u@rediff.com, 2 aydeshmukh@gmail.com, 3 abhileshthor@yahoo.com)

Abstract- A new FFT architecture for real-valued signals is proposed using the radix-2^3 algorithm. It is based on modifying the flow graph of the FFT algorithm so that it has both real and complex datapaths. Redundant operations in the flow graph are replaced by the imaginary parts. Using the folding technique, an RFFT architecture with any level of parallelism can be derived. This RFFT architecture has lower hardware complexity than the radix-2 and radix-2^2 algorithms in terms of adders, multipliers and delays. An N-point 2-parallel radix-2^3 architecture requires (log_8 N - 1) complex multipliers, 2 log_2 N adders and 3N/2 - 2 delays. RFFT is used in real-time applications and in portable devices, for which low power consumption is the main requirement. Accordingly, the carry propagate adder, which has the lowest power consumption of the adders compared, and a CSD multiplier are selected for the proposed architecture.

Keywords: FFT, Parallel Processing, Pipelining, Real Signals, radix-2^3, Folding

1. Introduction

The fast Fourier transform (FFT) is one of the most widely used algorithms in digital signal processing (DSP) [1], [2]. Interest in computing the FFT of real-valued signals (RFFT) has increased, since most physical signals are real, and RFFT is an important algorithm in many real-time applications. Hardware complexity can be reduced in asymmetric digital subscriber line (ADSL) systems [3] by using RFFT. FFT also plays an important role in applications such as spectral analysis and filtering [4], where it is used to analyze spectral components.
RFFT is vital for analyzing signals such as electroencephalography (EEG) and electrocardiography (ECG) recordings. RFFT is also used in various portable devices, where low power consumption is required. FFT is also used to estimate power spectral density, which can reveal whether a signal is normal or indicates a problem.

This paper describes the design of an RFFT architecture. The flow graph is modified and the redundant part is replaced by the imaginary part in order to reduce complexity. Using the folding technique, an RFFT architecture with any level of parallelism can be derived. Because the imaginary part is injected into the butterfly structure, the architecture has both real and complex datapaths. RFFT is used in real-time applications and in portable devices, for which low power consumption is the main requirement, so the carry propagate adder, which has the lowest power consumption of the adders compared, and a CSD multiplier are selected for the architecture.

The paper is organized as follows. Section 2 describes previous work related to RFFT. Section 3 describes the proposed architecture for a 16-point radix-2^3 DIF RFFT. Section 4 describes the FPGA implementation of the adder and the CSD multiplier. The results are compared and analyzed in Section 5, and concluding remarks are given in Section 6.

2. Previous Work

Various algorithms for computing the RFFT have been presented in the past, but they lack a regular geometry, which is very important for deriving a pipelined architecture. The first pipelined architecture for real-valued signals was designed in [5], but it is restricted to radix-2 and 4-parallel RFFT architectures, and it has only real datapaths. A design that derives FFT architectures for both real and complex inputs is presented in [6]. However, even after the redundant operations are removed, that architecture still calculates the redundant samples, and its hardware is not fully utilized.
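The redundancy that these real-valued architectures try to exploit comes from the conjugate symmetry of the DFT of a real sequence, X[N-k] = X*[k]. The following is a minimal software sketch of that property, not a model of any of the cited hardware designs. The helper names `dft` and `redundant_bins` and the test samples are illustrative assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT: X[k] = sum over n of x[n] * exp(-2j*pi*n*k/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def redundant_bins(n_points):
    """Output indices k > N/2; for real input each equals conj(X[N-k])."""
    return list(range(n_points // 2 + 1, n_points))


# Arbitrary real-valued test samples (hypothetical data, N = 8).
x = [0.5, -1.0, 2.0, 0.25, 3.0, -0.75, 1.5, 0.0]
X = dft(x)
N = len(x)

# Check X[N-k] == conj(X[k]) for k = 1 .. N-1, up to rounding error.
symmetric = all(abs(X[N - k] - X[k].conjugate()) < 1e-9 for k in range(1, N))
print("conjugate symmetric:", symmetric)
print("redundant output indices:", redundant_bins(N))
```

For N = 8 the redundant indices are 5, 6 and 7, that is (N/2) - 1 outputs, which is the count of calculations a real-valued architecture can skip.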
In [7] a pipelined RFFT architecture is derived, but it has higher hardware complexity than the proposed architecture, since it needs more adders, multipliers and delays than the radix-2^3 algorithm in [8].

IJREAS, Vol. 02, Issue 02, July 2014. ISSN (Print): 2249-9210, ISSN (Online): 2348-1862
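The comparison with earlier designs rests on counts of complex multipliers, adders and delay elements. As a rough sketch, the closed-form counts that this paper reports (in the abstract and in Table III) can be evaluated for a given N. The function names and the dictionary layout are assumptions made for illustration, and the formulas are taken from the paper as printed rather than derived independently.

```python
def exact_log(n_points, base):
    """Return the integer p with base**p == n_points, or raise ValueError."""
    power, value = 0, 1
    while value < n_points:
        value *= base
        power += 1
    if value != n_points:
        raise ValueError(f"{n_points} is not a power of {base}")
    return power


def radix2_counts(n_points):
    """Radix-2 architecture counts as tabulated in the paper."""
    return {
        "complex_multipliers": 2 * (exact_log(n_points, 4) - 1),
        "adders": 4 * exact_log(n_points, 2),
        "delays": 2 * n_points,
    }


def radix4_counts(n_points):
    """Radix-2^2 architecture counts as tabulated in the paper."""
    return {
        "complex_multipliers": exact_log(n_points, 4) - 1,
        "adders": 4 * exact_log(n_points, 2) - 2,
        "delays": 2 * (n_points - 2),
    }


def radix8_counts(n_points):
    """Proposed 2-parallel radix-2^3 architecture counts."""
    return {
        "complex_multipliers": exact_log(n_points, 8) - 1,
        "adders": 2 * exact_log(n_points, 2),
        "delays": 3 * n_points // 2 - 2,
    }


for name, fn in [("radix-2", radix2_counts),
                 ("radix-2^2", radix4_counts),
                 ("radix-2^3", radix8_counts)]:
    print(name, fn(64))
```

With N = 64 this reproduces the example column values of Table III. Other sizes must be exact powers of the base used in each formula, otherwise `exact_log` raises an error.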

3. Proposed Work

The N-point discrete Fourier transform (DFT) of a sequence x[n] is defined as

X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk}, \quad k = 0, 1, \ldots, N-1

where W_N = e^{-j(2\pi/N)}.

In an RFFT the inputs are real. If x[n] is real, the output X[k] is conjugate symmetric:

X[N-k] = X^*[k]    (1)

Due to this, (N/2)-1 output calculations can be removed, as they are redundant. The proposed work involves the following two steps.

A. Modified butterfly structure

Redundant samples can be found using the approach in [5] and then removed. However, removing them leads to an irregular geometry, so efficient pipelining cannot be done. To obtain a regular geometry, the redundant operations are replaced by the imaginary part, after which efficient pipelining is possible. The modified flow graph of the 16-point radix-2^3 DIF RFFT is shown in Fig. 1.

Fig. 1. Modified flow graph of 16-point RFFT, radix-2^3 DIF

Since the flow graph has both complex and real parts, it has both complex and real datapaths. Two butterfly structures are used to handle this. The first is straightforward: it takes two real inputs and consists of a real adder and a real subtractor. This butterfly structure is shown in Fig. 2.

Fig. 2. Butterfly structure I for proposed architecture [8]

Fig. 3. Butterfly structure II for proposed architecture [8]

B. Folding

Using the folding technique in [9], a pipelined architecture can be derived from the data flow graph (DFG). Folding also leads to an optimized datapath [10]. The nodes of the DFG can be implemented by butterfly structures I and II. The proposed 2-parallel architecture is shown in Fig. 4.

Fig. 4. Proposed work

4. FPGA IMPLEMENTATION

In this section, various adders are simulated on a Spartan-6 XC6SLX16 device in the CSG324 package, and the adder with the lowest power consumption is selected. The CSD multiplier is also simulated on Spartan-6.

A. Adder

RFFT is generally used in real-time applications and in portable devices such as ECG and EEG equipment. Low power and small area are the main requirements for such devices, so the adder is selected accordingly.
Carry propagate [11], carry skip [12] and carry look-ahead [13] adders are therefore simulated on a Spartan-6 XC6SLX16 device in the CSG324 package. Table I shows their performance in terms of area, power and delay. Among these, the carry propagate adder suits best, as it has the lowest power consumption. Its RTL view is shown in Fig. 5 and its output waveform in Fig. 6.
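For readers unfamiliar with the term, the behaviour of a carry propagate adder can be sketched at bit level in software. This sketch assumes the simplest ripple-carry form, in which the carry passes through one full adder per bit. The paper does not state which internal structure was synthesized, so the function names and the 16-bit default width are illustrative assumptions only.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(a, b, width=16):
    """Add two unsigned width-bit integers by propagating the carry
    from the least significant bit to the most significant bit.
    Returns (sum modulo 2**width, final carry out)."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= bit << i
    return result, carry


total, carry_out = ripple_carry_add(0x1234, 0x0FF0)
print(hex(total), carry_out)
```

The serial carry chain is why this structure is small but slow, which is consistent in direction with the area and delay figures reported in Table I.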

TABLE I

Adder (16-bit)      Power (mW)   Delay (ns)   Area (slices)
Carry propagate     30           15.243       25 / 9112
Carry look-ahead    42           13.567       59
Carry skip          37           14.767       44

Fig. 5. RTL view of carry propagate adder

Fig. 6. Output waveform of carry propagate adder

B. Multiplier

In a normal multiplier, multiplication is done by shifting to produce partial products and then adding all the partial products. Multiplying two N-bit numbers generates an N x N array of partial-product bits, and (N-1) adders are required if two-input N-bit adders are used. The number of hardware components and the time required for multiplication therefore increase. A multiplier that generates fewer partial products is needed, which in turn reduces the time and power consumed by multiplication. The CSD multiplier suits this requirement best.

The proposed CSD multiplier reduces the number of partial products by using the redundancy of the canonical signed digit (CSD) code [14] and provides an efficient way of multiplication, as in [15]. Because the number of partial products is reduced, it is more hardware efficient and requires less time and less power than a normal multiplier.

The constant by which the multiplicand is multiplied is divided into pairs of bits, and the CSD multiplier has a multiplexer that is controlled by each pair of bits. Depending on the input pair, the multiplexer output is the input data, the inverse of the input data, or all zeros. Shifts are applied according to the same pair of bits. Further circuitry receives and combines these outputs, and additional shifts are applied to generate the final output.

A normal multiplier and a CSD multiplier [16] (8 bit x 8 bit) are simulated on a Spartan-6 XC6SLX16 device in the CSG324 package, and their performance in terms of area, power and delay is shown in Table II.

TABLE II

Multiplier   Power (mW)   Delay (ns)   Area (slices)
Normal       35           11.7         116 / 9112
CSD          32           10.5         88

Clearly, the CSD multiplier has lower delay and power consumption, so it is selected for the proposed architecture. Its output waveform is shown in Fig. 7 and its RTL view in Fig. 8.

Fig. 7. Output waveform of CSD

Fig. 8. RTL view of CSD
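The idea behind CSD multiplication can be illustrated with a short software sketch, offered under stated assumptions rather than as the circuit simulated in this paper. The constant is recoded into digits -1, 0 and +1 with no two adjacent nonzero digits, and the product is formed by adding or subtracting shifted copies of the multiplicand. The function names `to_csd` and `csd_multiply` are hypothetical, and the multiplexer-based pairing of bits described in the text is not modelled.

```python
def to_csd(value):
    """Recode a non-negative integer into canonical signed digits.
    Returns digits in {-1, 0, +1}, least significant first."""
    digits = []
    while value != 0:
        if value & 1:
            # value % 4 == 1 gives +1, value % 4 == 3 gives -1
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(multiplicand, constant):
    """Multiply by a constant using one shifted add or subtract
    per nonzero CSD digit of the constant."""
    product = 0
    for shift, digit in enumerate(to_csd(constant)):
        if digit == 1:
            product += multiplicand << shift
        elif digit == -1:
            product -= multiplicand << shift
    return product


print(to_csd(7))            # 7 = 8 - 1
print(csd_multiply(13, 7))
```

For the constant 7, plain binary (111) needs three partial products, while the CSD form 8 - 1 needs two, which is the kind of reduction in partial products the section describes.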

C. Simulation result of FFT architecture

Fig. 9 shows the inputs to the FFT and Fig. 10 shows the outputs of the FFT.

Fig. 9. Inputs to FFT

Fig. 10. Outputs of FFT

5. Comparison and Analysis

Table III compares the hardware complexity, in terms of adders, multipliers and delays, of the proposed architecture with previous architectures for an N-point FFT, with N = 64 as an example. It is observed that the numbers of adders, multipliers and delays for the radix-2^3 architecture are lower than for the other architectures, so the radix-2^3 architecture requires less hardware than previous designs.

TABLE III (example values for N = 64)

                      Radix-2 [8]            Radix-2^2 [8]          Proposed radix-2^3 (2-parallel)
Complex multipliers   2(log_4 N - 1) = 4     (log_4 N - 1) = 2      (log_8 N - 1) = 1
Adders                4 log_2 N = 24         4 log_2 N - 2 = 22     2 log_2 N = 12
Delays                2N = 128               2(N - 2) = 124         3N/2 - 2 = 94

6. Conclusion

An efficient architecture for computing the RFFT has been proposed in this paper. The datapaths are optimized by folding. The architecture also has fewer adders, multipliers and delays than previous architectures. It has low power with respect to the adder, since a carry propagate adder is used, and it is relatively fast, since a CSD multiplier is used.

References

[1] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, 2nd ed. Englewood Cliffs, NJ, USA: Prentice Hall, 1998.
[2] H. Chi and Z. Lai, "A cost-effective memory-based real-valued FFT and Hermitian symmetric IFFT processor for DMT-based wire-line transmission systems," in Proc. ISCAS, May 2005, vol. 6, pp. 6006-6009.
[3] W. Ko, J. Kim, Y. Park, T. Koh, and D. Youn, "An efficient DMT modem for the G.LITE ADSL transceiver," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 11, no. 6, pp. 997-1005, Dec. 2003.
[4] S. He and M. Torkelson, "Designing pipeline FFT processor for OFDM (de)modulation," in Proc. Int. Symp. Signals Syst., Oct. 1998, pp. 257-262.
[5] M. Garrido, K. K. Parhi, and J. Grajal, "A pipelined FFT architecture for real-valued signals," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 12, pp. 2634-2643, Dec. 2009.
[6] M. Ayinala, M. Brown, and K. K. Parhi, "Pipelined parallel FFT architectures via folding transformation," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 6, pp. 1068-1081, Jun. 2012.
[7] M. Ayinala and K. K. Parhi, "Parallel-pipelined radix-2^2 FFT architecture for real valued signals," in Proc. Asilomar Conf. Signals, Syst. Comput., Nov. 2010, pp. 1274-1278.
[8] M. Ayinala and K. K. Parhi, "FFT architectures for real-valued signals based on radix-2^3 and radix-2^4 algorithms," IEEE Trans. Circuits Syst. I, Reg. Papers, 2013.
[9] K. K. Parhi, C. Y. Wang, and A. P. Brown, "Synthesis of control circuits in folded pipelined DSP architectures," IEEE J. Solid-State Circuits, vol. 27, no. 1, pp. 29-43, 1992.
[10] M. Garrido, J. Grajal, M. A. Sánchez, and O. Gustafsson, "Pipelined radix-2^k feedforward FFT architectures," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, Jan. 2013.
[11] V. G. Oklobdzija, "Design and analysis of fast carry-propagate adder under non-equal input signal arrival profile," IEEE, 1994.
[12] Cha Min and E. E. Swartzlander, "Modified carry skip adder for reducing first block delay," IEEE, 2000.
[13] R. W. Doran, "Variants of an improved carry look-ahead adder," IEEE Transactions, 1988.
[14] J. O. Coleman and A. Yurdakul, "Fractions in the canonical-signed-digit number system," 2001 Conference on Information Sciences and Systems, The Johns Hopkins University, March 21-23, 2001.
[15] M. A. Soderstrand, "CSD multipliers for FPGA DSP applications," Circuits and Systems, 2003. ISCAS '03. Proceedings of the 2003 International Symposium on, vol. 5.
[16] P. Saha, A. Banerjee, I. Banerjee, and A. Dandapat, "High speed low power floating point design based on CSD (canonical sign digit)," VDAT 2010.