Implementation of a High Speed Four Transmitter Space-Time Encoder using Field Programmable Gate Array and Parallel Digital Signal Processors


Peter J. Green and Desmond P. Taylor
Department of Electrical and Computer Engineering
University of Canterbury
Christchurch, New Zealand
peter.green@canterbury.ac.nz, taylor@elec.canterbury.ac.nz

Abstract

This paper describes the concept, architecture, development and demonstration of a high performance, 4 transmitter, real-time space-time encoder designed for research into transmitter diversity and multiple input multiple output (MIMO) wireless systems. It is implemented on a Xilinx Virtex 2 Pro Field Programmable Gate Array (FPGA) with parallel processing on multiple digital signal processors (DSPs). The system is software defined to allow for flexibility in the choice of transmit modulation formats, data rates and space-time coding schemes. Hardware, firmware and software aspects of the space-time encoder system to meet design requirements are discussed. The testing and demonstration of the system running the Alamouti space-time coding scheme is covered.

The current implementation is an enhancement to an existing Smart Antenna Software RAdio Test System (SASRATS) platform [3, 4] designed to test and verify various space-time architectures and algorithms. Of significant interest is the real-time testing of the space-time (ST) coding schemes developed by Alamouti [1] and others mentioned in [2]. Space-time coding schemes are necessary to support the high data rates of future wireless mobile and local area network standards. The primary objective is to increase system capacity and performance through the use of multiple antennas, spatial multiplexing and space-time coding.

A requirement for real-time space-time coding experiments is that all transmitters must be synchronised. Each transmitter must output data from the space-time encoder algorithm at precisely the same time.
Our original transmitters, developed in 2000 [3], were designed for beamforming experiments and not for synchronised space-time encoding operations. Our goal now is to achieve synchronised transmit symbol rates of greater than 1 Mbaud per transmitter, with pulse shaping, from 4 transmitters. Another requirement is that the transmitter characteristics must be software defined to allow for flexibility in the choice of modulation formats, data rates and space-time coding schemes.

To meet the desired TX data rates and programmability objectives, the Xilinx Virtex 2 Pro FPGA, Motorola DSPs and an Analog Devices quadrature digital upconverter integrated circuit (IC) were selected. The IC has an integrated direct digital synthesizer, 14-bit digital-to-analog converter and quadrature modulator. It operates at a clock frequency of 200 MHz and is programmed to output a 70 MHz intermediate frequency (IF) signal. This is then upconverted to 915 MHz by a separate SASRATS analog radio frequency (RF) upconverter unit. The required symbol rate is programmed into the upconverter IC and the device generates an output clock (PDCLK) signal at twice the symbol rate. Once enabled for quadrature modulation, the device requests 14-bit in-phase (I) and quadrature-phase (Q) signals. The I and Q signals must be presented sequentially and continuously to the upconverter IC and are clocked on the rising edge of PDCLK. After an I/Q pair is received, the 70 MHz modulated signal is produced.

The complete system implementation consists of a master Xilinx Virtex 2 Pro FPGA, 4 slave DSPs and 4 sets of the upconverter boards, as shown in Figure 1. The four DSPs handle the computation overheads of pulse shaping for each transmitter. The FPGA performs random number generation, mapping of data into the desired digital modulation format, and space-time encoding. Data is transferred to the 4 slaves in parallel through four 16-bit ports configured on the Xilinx FPGA board.
The data on the FPGA output ports are distributed to each slave DSP through their respective Port A, triggered by interrupt-driven Direct Memory Access (DMA) transfers. On each slave, the finite impulse response (FIR) filtering is carried out by the enhanced filter co-processor (EFCOP). The filtered data is then sent to Port B. Port B is a 16-bit port and has sufficient resolution to output the 14-bit data required by the TX boards. Timing is controlled by the PDCLK signals from the TX boards to the master FPGA. The master then sends interrupt requests (IRQ) to the slave DSPs at the appropriate time instant for transfer and FIR processing of data.

Figure 1. SASRATS 4 transmitter space-time encoder (block diagram: Xilinx Virtex 2 Pro FPGA, four slave DSP modules, PDCLK signals, TX 1 to TX 4)

Proceedings of the Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06) 0-7695-2500-8/05 $20.00 (c) 2005 IEEE

1 Pulse shaping of transmitter symbols

One requirement of any digital transmitter is the need for pulse shaping filters. These shape the transmitted spectrum to meet out-of-band emission requirements and ensure that, at the receiver, the received signal is sampled at an optimal point in the pulse interval to maximize the probability of an accurate decision. The symbol pulses must not interfere with one another at the optimal sampling point. A rectangular pulse can be used but is not ideal as it takes infinite bandwidth. Raised cosine pulse shaping is normally used between transmitter and receiver to conserve bandwidth and to ensure no intersymbol interference at the sampling points. The filters are implemented using finite impulse response (FIR) filters on the DSP. However, to ensure that the raised cosine frequency characteristics are met, the filter must oversample the data by at least a factor of 2 (samples per symbol).

The sequential data input format of the upconverter IC and the minimum pulse shaping requirement (2X oversampling) imply that for a transmit symbol rate of 1 Mbaud, new data must be calculated and presented to the upconverter IC at 4 MSPS per transmitter (1 Mbaud x 2 samples per symbol x 2 sequential words, I and Q). This requires a fast digital processing platform. To provide a more accurate spectral shape, it is also desirable to oversample by a factor greater than 2. This requirement will increase the FIR filter length significantly. The design allows for filter lengths of up to 512 taps.
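The pulse shaping described above can be illustrated with a short sketch. It is not the DSP co-processor code used in the system; it is a plain Python model that assumes a roll-off factor of 0.35 and a 33 tap filter (both chosen only for illustration), inserts one zero between symbols for 2X oversampling, and filters the result with a raised cosine FIR filter. The function names are hypothetical.

```python
import math

def raised_cosine_taps(num_taps, sps, beta):
    """Raised-cosine impulse response sampled at sps samples per symbol."""
    centre = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = (n - centre) / sps  # time measured in symbol periods
        if t == 0.0:
            h = 1.0
        else:
            sinc = math.sin(math.pi * t) / (math.pi * t)
            denom = 1.0 - (2.0 * beta * t) ** 2
            if abs(denom) < 1e-12:
                # Limit of the expression at t = +/- 1/(2*beta)
                h = (math.pi / 4.0) * sinc
            else:
                h = sinc * math.cos(math.pi * beta * t) / denom
        taps.append(h)
    return taps

def zero_stuff(symbols, sps):
    """Insert sps - 1 zeros after every symbol (oversampling by sps)."""
    out = []
    for s in symbols:
        out.append(s)
        out.extend([0.0] * (sps - 1))
    return out

def fir_filter(samples, taps):
    """Sample-by-sample FIR filter, one output per input sample."""
    history = [0.0] * len(taps)
    out = []
    for x in samples:
        history = [x] + history[:-1]
        out.append(sum(h * v for h, v in zip(taps, history)))
    return out

# Illustrative values only (assumptions, not taken from the paper).
symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
taps = raised_cosine_taps(num_taps=33, sps=2, beta=0.35)
padded = zero_stuff(symbols, 2) + [0.0] * 32  # extra zeros flush the filter
shaped = fir_filter(padded, taps)
delay = (len(taps) - 1) // 2
# Sampling every second output, offset by the filter delay, returns the symbols.
recovered = [shaped[delay + 2 * k] for k in range(len(symbols))]
```

Sampling the shaped signal at the symbol instants returns the original symbols, which is the zero intersymbol interference property mentioned in the text. In the real system the I and Q words of each sample are filtered separately and sent sequentially.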
The high speed pulse shaping function can be implemented directly on the FPGA platform but is limited by the number of 18 x 18 bit multipliers available in the various versions of the Virtex 2 Pro. At the time of writing, the largest version (2VP100) of the Virtex 2 Pro has 444 18 x 18 bit multipliers. Our Virtex 2 Pro (2VP30) is limited to 136 multipliers. Using the 2VP30 with 8 parallel (I and Q) processing paths for 4 transmitters leaves only 17 multipliers per path. This gives a 16 tap FIR filter per path. Another option is to have the I and Q data time-multiplexed on 4 processing paths to give a 33 tap FIR filter per path. Adding clever reuse of multipliers and coefficients in a symmetrical coefficient FIR filter design can give the equivalent of a 66 tap FIR filter. This is still below the goal of a 512 tap FIR filter.

For this reason, we implement the pulse shaping function on the four DSPs. The DSP has an enhanced filter co-processor (EFCOP) which can be configured for FIR filtering. The EFCOP has 12K-word data and 12K-word coefficient memory banks and can easily implement a 512 tap FIR filter at the required rate. The DSP is configured to perform a DMA transfer from the FPGA to the EFCOP through Port A on the negative edge of the IRQ signal, and it outputs the filtered sample from the EFCOP to Port B. The data on Port B is read by the TX boards on the rising edge of the IRQ.

However, there is a disadvantage to this approach. A factor that limits the data rate when using the DSP is the processing speed of the interrupt service routine. The DSP does not respond instantly. There is a significant time delay of 50 ns between the detection of the negative edge of the IRQ signal and the first execution of the required instructions, as the DSP takes clock cycles to set up the stack and other registers before it runs the interrupt service routine. It takes a further 60 ns to process and output data onto Port B. Another 10 ns guard time is added to ensure that data is stable on the rising edge of PDCLK, bringing the total time to 120 ns. Thus a period of 480 ns (2.083 MHz) is needed to send an I/Q sequential pair to the transmitter. If the DSP outputs 2 samples per symbol, then the achievable output symbol rate from the transmitter is just above 1 Mbaud.

If pulse shaping were done by the FPGA, a much higher output symbol rate could be achieved, as the FPGA approach does not incur any IRQ and DMA interrupt overheads. All processing is done on dedicated multipliers. The limitation of the FPGA is the small number of filter taps. Thus for applications requiring very high speed but short filter lengths, the FPGA approach is recommended. For these applications, the hardware design allows the FPGA to bypass the DSPs and connect directly to the upconverters.
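The timing figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the three delays given in the text. The step from 120 ns to 480 ns is not spelled out in the paper; the sketch assumes that the 120 ns must fit into the half PDCLK period between the falling IRQ edge and the next rising edge, so that one PDCLK period is 240 ns and a sequential I/Q pair takes two periods. That interpretation, and the function name, are assumptions.

```python
IRQ_LATENCY_NS = 50  # falling IRQ edge to first ISR instruction (from the text)
PROCESS_NS = 60      # processing and output onto Port B (from the text)
GUARD_NS = 10        # guard time before the rising PDCLK edge (from the text)

def timing_budget(samples_per_symbol=2):
    """Return the timing chain from one output word to the symbol rate."""
    word_window_ns = IRQ_LATENCY_NS + PROCESS_NS + GUARD_NS  # 120 ns
    # Assumption: the window is half a PDCLK period (falling to rising edge).
    pdclk_period_ns = 2 * word_window_ns                     # 240 ns
    # I and Q words are sent one after the other, one per PDCLK period.
    iq_pair_ns = 2 * pdclk_period_ns                         # 480 ns
    symbol_ns = samples_per_symbol * iq_pair_ns              # 960 ns at 2X
    return {
        "word_window_ns": word_window_ns,
        "iq_pair_ns": iq_pair_ns,
        "iq_pair_rate_mhz": 1000.0 / iq_pair_ns,
        "symbol_rate_mbaud": 1000.0 / symbol_ns,
    }

budget = timing_budget()
```

With 2 samples per symbol this gives an I/Q pair rate of about 2.083 MHz and a symbol rate of about 1.04 Mbaud, in line with the "just above 1 Mbaud" stated in the text.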

2 Alamouti encoding and its implementation on FPGA

In a classical one-transmitter system, symbols s0, s1, s2, ... are transmitted at times t, t + T, t + 2T, ... respectively. In a two transmitter Alamouti encoder scheme, however, the symbols s0 and s1 are transmitted simultaneously from transmitters TX0 and TX1 at time instant t. At time instant t + T, the symbols -s1* and s0* are transmitted simultaneously from the two transmitters, where * represents the complex conjugate. Table 1 shows the transmitted symbols for a 6 symbol, 2 transmit Alamouti scheme.

Table 1. Example of the 2 transmitter Alamouti scheme over 6 symbols

Time instant   t      t + T   t + 2T   t + 3T   t + 4T   t + 5T
TX0            s0     -s1*    s2       -s3*     s4       -s5*
TX1            s1     s0*     s3       s2*      s5       s4*

The complete design is implemented using schematic entry on the Xilinx Integrated System Environment (ISE) Foundation design tool.
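As an illustration of the symbol ordering in Table 1, the following sketch encodes a list of complex symbols into the two transmit streams. It is a software model only, not the schematic design described in this paper, and the function name is hypothetical.

```python
def alamouti_encode(symbols):
    """Map symbols s0, s1, s2, ... onto two transmit streams.

    For every input pair (s0, s1):
      TX0 sends s0 then -conj(s1)
      TX1 sends s1 then  conj(s0)
    """
    if len(symbols) % 2 != 0:
        raise ValueError("Alamouti encoding needs an even number of symbols")
    tx0, tx1 = [], []
    for i in range(0, len(symbols), 2):
        s0, s1 = symbols[i], symbols[i + 1]
        tx0.extend([s0, -s1.conjugate()])
        tx1.extend([s1, s0.conjugate()])
    return tx0, tx1

# Four unnormalised QPSK symbols, used only as an example input.
tx0, tx1 = alamouti_encode([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j])
```

Within each pair of time slots the two transmitted sequences are orthogonal, which is the property that lets a receiver separate the two symbols with simple linear combining.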
ISE has a large library of functional blocks such as adders, multipliers, registers, memory and logic for schematic entry. VHDL code can also be integrated as a block with other schematic components if desired. This approach allows hardware designers to quickly use FPGA technology to implement hardware designs without mastering VHDL. The ISE tool then translates the design into the firmware that is needed to program the Virtex 2 Pro. The ISE tool also incorporates the Xilinx Core Generator intellectual property (IP) modules, with functions such as FIR filters, which can be embedded into a schematic design to shorten design cycle time.

The system architecture to implement a real time, continuously operating 2 transmit Alamouti scheme is shown in Figure 2. It consists of the quadrature phase shift keying (QPSK) random generator block, the clock generators (CR8CE), the register latch (FD4CE), the Look Up Table (Alamouti V3) block with tri-state output buffers (T) and the transmit enable controller (ILD).

Figure 2. Schematic of a 2 transmitter space-time encoder implemented on Xilinx ISE

In the random generator implementation, the symbols are QPSK symbols where each symbol represents 2 bits of data. First, a pseudo random sequence generator is designed to generate the random bits. Figure 3 shows the implementation of a 24-bit maximal shift random generator, which consists of a concatenation of an 8-bit (SR8RLE) and a 16-bit (SR16RLE) programmable shift register taken from the Xilinx library. A feedback signal is derived from specific tap points in the shift registers via a three-input XOR gate. The registers must initially be reset and loaded with a preset 24-bit value which acts as the seed of the random number generator. Once loaded and enabled, the generator outputs a random bit at the rising edge of each clock cycle. To generate 2 parallel paths for a 2 transmit Alamouti scheme, 4 bits of data are tapped out of the random generator shift registers to generate 2 QPSK symbols. The 4 bits are stored in a register (FD4CE) in Figure 2 for further processing. A fresh set of 4 bits is latched after every 4 clock cycles.

Figure 3. QPSK Random Number Generator (24 bit maximal shift random number generator)

The clock generator circuitry ensures that all signals are clocked and latched at the correct instant in time. The clock input is derived from the PDCLK signal from the transmitter board. Note that the I and Q signals must be presented in sequence at the rising edge of PDCLK and repeated indefinitely until TX ENABLE is disabled. The PDCLK is first inverted to drive the system and then inverted again to output as an IRQ signal to the DSPs. Two binary ripple counters (CR8CE), one positive and the other negative edge triggered, are used to generate the various clock signals. The clock signals generated by the clock circuitry are shown in Figure 4.

Figure 4. Clock signals (traces: PDCLK, CLK0, CLK1, CLK2, CLK3, TX1 DATA, TX2 DATA)

CLK3 enables the random generator over 4 PDCLK cycles to allow a fresh set of 4 bits to be generated. CLK0, CLK1, CLK2 and the four latched data bits are address lines to the Look Up Table (LUT) circuit block. CLK0 indicates an I data output when 0 and a Q data output when 1. CLK1 remains 0 or 1 for the duration of one I/Q pair (i.e. one timeslot). CLK2 remains 1 or 0 for the duration of 2 I/Q pairs (2 timeslots). Two of the latched bits determine the QPSK symbol to be output to transmitter TX1 and the other two set the symbol data for TX2. The four latched bits remain unchanged over 4 timeslots, where each timeslot consists of a sequential I and Q data pair (a requirement of the upconverter IC). Although only 2 timeslots are required for Alamouti, the extra two are added to meet the requirements of the pulse shaping filters (2X oversampling) in the co-processor. Thus the original Alamouti symbol sequence from TX1 is changed from s0, -s1* to s0, 0, -s1*, 0. Similarly, the symbol sequence from TX2 is changed from s1, s0* to s1, 0, s0*, 0. Therefore, to maintain the original symbol rate, PDCLK must be increased by a factor of 2.

The Alamouti LUT block consists of 2 banks (BANK 1 and BANK 2), each of size 2 X 16 X 16, as shown in Figure 5. There are 2 sets of 16-bit words per bank (for TX1 and TX2) and there are 16 words in each bank. Each word has a unique address. The outputs of the two banks are multiplexed using 32 two-input multiplexers (M2_1) under the control of CLK2. When CLK2 is 0, the outputs to TX1 and TX2 come from BANK 1 in the first and second timeslots. During the third and fourth timeslots, when CLK2 is 1, the outputs to TX1 and TX2 come from BANK 2. The circuitry of each bank is shown in Figure 6. Each bank is made up of 16 cells. Each cell is made up of two 4-bit LUTs and two multiplexers (M4_1E).
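The front end described above (shift register generator, 4-bit latch and clock-derived LUT address) can be modelled in software as follows. This is a sketch under stated assumptions, not the schematic design: the paper does not give the feedback tap points, so the 24-bit example uses the tap set (24, 23, 22, 17), a commonly tabulated choice that needs a four-input XOR rather than the three-input gate in the schematic; the four data bits are simply taken from the low end of the register; and the address bit order (CLK2, CLK1, two data bits, CLK0) is read off the address column of Figure 7. All function names are hypothetical.

```python
def lfsr_states(seed, width, taps):
    """Fibonacci LFSR. taps are 1-based bit positions XORed into the feedback."""
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & mask
        yield state

def lfsr_period(seed, width, taps):
    """Count the clocks needed to return to the seed (small widths only)."""
    start = seed & ((1 << width) - 1)
    count = 1
    gen = lfsr_states(seed, width, taps)
    while next(gen) != start:
        count += 1
    return count

def lut_address(pdclk_count, b1, b0):
    """5-bit LUT address from the PDCLK count and one symbol's two data bits."""
    clk0 = pdclk_count & 1          # 0 = I word, 1 = Q word
    clk1 = (pdclk_count >> 1) & 1   # toggles every I/Q pair (timeslot)
    clk2 = (pdclk_count >> 2) & 1   # toggles every 2 timeslots (bank select)
    return (clk2 << 4) | (clk1 << 3) | (b1 << 2) | (b0 << 1) | clk0

def block_addresses(b1, b0):
    """Addresses visited during one 4-timeslot block (8 PDCLK cycles)."""
    return [lut_address(n, b1, b0) for n in range(8)]

# 24-bit generator with an assumed tap set and an arbitrary non-zero seed.
TAPS_24 = (24, 23, 22, 17)
gen = lfsr_states(seed=0xACE1F5, width=24, taps=TAPS_24)
for _ in range(4):                  # clock the generator 4 times, then latch
    state = next(gen)
latched = state & 0xF               # assumption: lowest 4 bits are tapped
tx1_bits, tx2_bits = (latched >> 2) & 0x3, latched & 0x3
```

For a small register the maximal-length property is easy to confirm: `lfsr_period(1, 4, (4, 3))` visits all 15 non-zero states before repeating. The same check for the 24-bit register would take 2^24 - 1 clocks and is left out of the sketch.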
Each cell has two outputs, each representing one bit of a 16-bit word for TX1 and TX2. The outputs of each LUT in a cell are sent to two multiplexers (M4_1E), which select the correct data to send to the transmitters TX1 and TX2. The multiplexer is needed because in the Alamouti 2 TX scheme, the second symbol at the first transmitter is the negative conjugate of the first symbol of the second transmitter. Similarly, the second symbol at the second transmitter is the conjugate of the first symbol of the first transmitter. The switching of the multiplexer outputs is controlled by CLK1.

Figure 5. Structure of 2 banks of Look Up Tables with multiplexed outputs

Figure 6. Internal structure of one bank

The values prestored in the LUTs for the Alamouti 2 transmit QPSK scheme with the 2X pulse shaping filter oversampling requirement are shown in the tables of Figure 7. Other data formatting options and space-time codes can be configured by programming the LUTs with the appropriate data and properly setting the control lines.

The overall design is then verified on a simulator platform from Mentor Graphics called ModelSim XE-III. The simulator enables the verification of the HDL source code and of the functional and timing models generated by the ISE Foundation software.
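The LUT contents of Figure 7 can be regenerated with a short sketch, which may help when programming other modulation formats. It rests on several assumptions that are not stated in the paper: the 14-bit words are two's complement with a full scale of 2^13 - 1 (this reproduces the listed bit patterns for +0.707 and -0.707), the two data bits select the signs of I and Q, and the labels "A" and "B" denote the two LUTs of a cell, with A holding the conjugate and B the negative conjugate in BANK 2. The names are hypothetical.

```python
AMP = 0.707  # QPSK component amplitude used in Figure 7

# Assumed mapping of the two data bits to (I, Q) signs, as read from Figure 7.
QPSK = {
    0b00: (-AMP, -AMP),
    0b01: (-AMP, +AMP),
    0b10: (+AMP, -AMP),
    0b11: (+AMP, +AMP),
}

def to_word14(x):
    """14-bit two's-complement word; full scale assumed to be 2**13 - 1."""
    return round(x * 8191) & 0x3FFF

def lut_entry(lut, addr):
    """Word stored at a 5-bit address (CLK2, CLK1, two data bits, CLK0)."""
    clk0 = addr & 1
    bits = (addr >> 1) & 0x3
    clk1 = (addr >> 3) & 1
    clk2 = (addr >> 4) & 1
    if clk1:
        return 0                     # zero-stuffed timeslot for 2X oversampling
    i, q = QPSK[bits]
    if clk2:                         # BANK 2 holds the conjugated symbols
        q = -q                       # complex conjugate
        if lut == "B":
            i, q = -i, -q            # negative conjugate
    return to_word14(q if clk0 else i)

table_a = [format(lut_entry("A", a), "014b") for a in range(32)]
table_b = [format(lut_entry("B", a), "014b") for a in range(32)]
```

For example, address 10001 (BANK 2, data bits 00, Q word) gives 01011010011111 in LUT A and 10100101100001 in LUT B, which matches the corresponding rows of Figure 7.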

Figure 7. LUT data for 2 transmitter QPSK modulated Alamouti scheme

Address bits, left to right: CLK(2), CLK(1), two latched data bits, CLK(0).

TX 1, BANK 1. Address A: 00 -> 07 (TX 1 SLOT 1)
Address    I/Q   Value    14-bit word
0 0 0 0 0  I     -0.707   10100101100001
0 0 0 0 1  Q     -0.707   10100101100001
0 0 0 1 0  I     -0.707   10100101100001
0 0 0 1 1  Q      0.707   01011010011111
0 0 1 0 0  I      0.707   01011010011111
0 0 1 0 1  Q     -0.707   10100101100001
0 0 1 1 0  I      0.707   01011010011111
0 0 1 1 1  Q      0.707   01011010011111
Address A: 08 -> 0F (TX 2 SLOT 2): all I and Q entries are 0 (00000000000000).

TX 2, BANK 1. Address B: 00 -> 07 (TX 2 SLOT 1)
Address    I/Q   Value    14-bit word
0 0 0 0 0  I     -0.707   10100101100001
0 0 0 0 1  Q     -0.707   10100101100001
0 0 0 1 0  I     -0.707   10100101100001
0 0 0 1 1  Q      0.707   01011010011111
0 0 1 0 0  I      0.707   01011010011111
0 0 1 0 1  Q     -0.707   10100101100001
0 0 1 1 0  I      0.707   01011010011111
0 0 1 1 1  Q      0.707   01011010011111
Address B: 08 -> 0F (TX 1 SLOT 2): all I and Q entries are 0 (00000000000000).

TX 1, BANK 2. Address A: 10 -> 17 (TX 2 SLOT 3, conjugate *)
Address    I/Q   Value    14-bit word
1 0 0 0 0  I     -0.707   10100101100001
1 0 0 0 1  Q      0.707   01011010011111
1 0 0 1 0  I     -0.707   10100101100001
1 0 0 1 1  Q     -0.707   10100101100001
1 0 1 0 0  I      0.707   01011010011111
1 0 1 0 1  Q      0.707   01011010011111
1 0 1 1 0  I      0.707   01011010011111
1 0 1 1 1  Q     -0.707   10100101100001
Address A: 18 -> 1F (TX 1 SLOT 4): all I and Q entries are 0 (00000000000000).

TX 2, BANK 2. Address B: 10 -> 17 (TX 1 SLOT 3, negative conjugate (-)*)
Address    I/Q   Value    14-bit word
1 0 0 0 0  I      0.707   01011010011111
1 0 0 0 1  Q     -0.707   10100101100001
1 0 0 1 0  I      0.707   01011010011111
1 0 0 1 1  Q      0.707   01011010011111
1 0 1 0 0  I     -0.707   10100101100001
1 0 1 0 1  Q     -0.707   10100101100001
1 0 1 1 0  I     -0.707   10100101100001
1 0 1 1 1  Q      0.707   01011010011111
Address B: 18 -> 1F (TX 2 SLOT 4): all I and Q entries are 0 (00000000000000).

Figure 8. Flowchart of the slave software
Initialize interrupts, DMA, EFCOP and filter coefficients.
TX enabled? (n: wait; y: continue)
Interrupt on IRQA? (y) DMA controller transfers I data from FPGA via Port A to all EFCOPs; EFCOP performs FIR filtering of I data; EFCOP outputs filtered I data to Port B.
Interrupt on IRQA? (y) DMA controller transfers Q data from FPGA via Port A to all EFCOPs; EFCOP performs FIR filtering of Q data; EFCOP outputs filtered Q data to Port B.

2.1 Software algorithm for DSP slaves

When data is valid at the outputs of TX0 and TX1, the FPGA sends an IRQ signal to the DSPs.
The DSPs respond on the falling edge of the IRQ signal and perform DMA transfers from the TX0 and TX1 outputs to the enhanced filter co-processor (EFCOP) on the respective DSPs to perform pulse shaping of the symbol, as shown in the flowchart of Figure 8. On completion of the FIR computation, the filtered I sample is sent to Port B. This filtered sample is loaded into the upconverter on the rising edge of the PDCLK signal. On the next falling edge of the IRQ signal, the Q sample is processed in a similar manner. When both I and Q samples are loaded into the upconverter, the I/Q data modulates the 70 MHz intermediate frequency. This process continues indefinitely until the TX Enable control line is disabled. Note that there is a period of latency from the moment the TX Enable control line is enabled to the first valid data from the transmitter. This latency period depends on the length of the FIR filter programmed into the EFCOP. A Data Valid bit in the EFCOP register is monitored at start up.

2.2 Testing and system performance

On testing the system, it was found that the PDCLKs of the four upconverters synchronize at different phases of the PDCLK waveform. In the initial design, the four upconverters ran from a common 10 MHz clock. Each upconverter IC has an internal digital phase lock loop (DPLL) circuit that synthesizes a 200 MHz internal clock from the 10 MHz reference source. However, it was found that each upconverter IC locks to 200 MHz at a slightly different time and thus it is impossible to get the PDCLK signals from all 4 units to align precisely. To resolve this problem and maintain perfect phase alignment among the four ICs in this specific application, we chose to bypass the internal DPLL and run a common 200 MHz reference source to all units. This is achieved in the final design by using a CDC111 IC from Texas Instruments, which can produce up to 9 synchronized 200 MHz differential outputs from a common 200 MHz clock source. Four differential outputs are used to drive the four upconverters.
The space time encoder is fully commissioned and the set-up is shown in Figures 9 and 10. The encoder is housed in a separate chassis to minimize interference between the digital encoder circuitry and the analog RF upconverters. The four modulator boards and the four DSP slaves are stacked one above the other, primarily to minimize interconnect lengths for high speed data transmission among boards and also to conserve space in the chassis.

The 4 transmit system has been tested using BPSK and QPSK modulation and optimized to a symbol rate of 1.5 MBaud per transmitter using 2X oversampling and FIR filtering on the DSP. A 2 transmitter QPSK Alamouti encoder scheme has been fully tested and a 4 transmitter orthogonal space time code will be implemented in the near future. The system is limited by the interrupt and program processing speed of the DSP slaves. However, the system has been tested to operate at symbol rates of up to 5 MBaud using a direct connection between the FPGA and the upconverters, but with limited pulse shaping. Symbol rates are limited not by FPGA speed but by the 10 MHz bandwidth of the surface acoustic wave (SAW) filters used in the analog upconverters. The SAW filters are used in the analog upconverter circuitry to limit bandwidth and control unwanted spurious emissions in the radio spectrum at 915 MHz.

Figure 9. Space time encoder hardware for 4 antenna system

Figure 10. Complete SASRATS transmit platform with space time encoder

3 Conclusions

We have described the design, development and successful implementation of a 4 transmitter space time encoder based on a Xilinx Virtex 2 Pro FPGA board, DSPs for pulse shaping and Analog Devices modulator boards, which together are capable of carrying out space time coding algorithms for up to 4 transmitters. The encoder is fully operational and a 2 transmit Alamouti scheme has been implemented and tested. The system is fully software programmable and allows changes to be easily made by changing the firmware on the FPGA or the software in the DSP slaves.

References

[1] S. Alamouti. Space block coding: A simple transmitter diversity technique for wireless communications. IEEE J. Select. Areas Communication, 16:1451-1458, Oct. 1998.
[2] D. Gesbert et al. From theory to practice: An overview of MIMO space-time coded wireless systems. IEEE Journal on Selected Areas in Communications, 21:281-302, Apr. 2003.
[3] P. Green and D. Taylor. Smart antenna software radio test system. Proceedings of the First IEEE International Workshop on Electronic Design, Test and Applications, 1:68-72, Jan. 2002.
[4] P. Green and D. Taylor. Experimental verification of spacetime algorithms using the smart antenna software radio test system (SASRATS) platform. Personal, Indoor and Mobile Radio Communications, 2004. PIMRC 2004. 15th IEEE International Symposium on, 4:2539-2544, 2004.

Proceedings of the Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06) 0-7695-2500-8/05 $20.00 2005 IEEE
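As an illustration of the 2 transmitter QPSK Alamouti scheme reported above, the sketch below applies the textbook Alamouti mapping of [1] to one symbol pair and converts each I and Q value into a 14-bit two's complement word. The function names, the scaling by 2^13 and the truncation toward zero are assumptions made for this example. They reproduce the words listed for 0.707 and -0.707 in Figure 7, but the assignment of symbols to the encoder's transmitters, slots and memory banks is not taken from the design.

```python
def alamouti_encode(s1, s2):
    """Textbook Alamouti mapping for one pair of complex symbols.

    Returns [(tx_a, tx_b) for slot 1, (tx_a, tx_b) for slot 2].
    """
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def to_word14(x):
    """Convert a value in (-1, 1) to a 14-bit two's complement bit string.

    Assumption: 13 fractional bits, truncation toward zero.
    """
    n = int(x * 8192)                  # int() truncates toward zero
    return format(n & 0x3FFF, "014b")  # wrap negatives into 14 bits


if __name__ == "__main__":
    s1 = complex(0.707, 0.707)    # two example QPSK symbols
    s2 = complex(0.707, -0.707)
    for slot, (a, b) in enumerate(alamouti_encode(s1, s2), start=1):
        print("slot", slot)
        for name, sym in (("A", a), ("B", b)):
            print("  TX", name, "I", to_word14(sym.real), "Q", to_word14(sym.imag))
```

In the second slot the first transmitter sends the negated conjugate of s2 and the second sends the conjugate of s1, which is what makes the two transmitted streams orthogonal over the symbol pair.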