Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen


GIGA seminar 11.1.2010
Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems
Janne Janhunen
janne.janhunen@ee.oulu.fi

Outline
  Introduction
  Benefits and Challenges of Programmability
  System Model
  Nonlinear Detector Algorithms
  Programmable Platforms and Architectures
  Results
  Conclusions

Introduction
  MIMO combined with OFDM (MIMO-OFDM) has been introduced in 3GPP LTE and WiMAX and proposed for LTE-A.
  High data rate requirements make real-time implementation challenging.
  A software defined radio (SDR) is a radio communication system whose components are implemented in software on a computing device.
  Algorithm study and development
    K-best list sphere detection (LSD) algorithm
    Layered ORthogonal lattice Detector (LORD)
    Selective Spanning with Fast Enumeration (SSFE)
  Programmable platforms
    Digital signal processors (DSP) such as the TMS320C6711 (floating point), TMS320C55x (fixed point) and TMS320C6455 (fixed point)
    System-on-a-chip devices such as Sandbridge's SB3011 and SB3500, which employ multithreading and multiple cores
    Application-specific instruction-set processor (ASIP) based on the transport triggered architecture (TTA)

Benefits and Challenges of Programmability
  Programmability = reuse of hardware
  In a multi-standard world, a programmable platform can exploit the silicon more efficiently than a pure hardware implementation.
  In addition, software design and time-to-market are faster than in hardware design.
  To improve performance, a programmable core can be accelerated with fine-grained accelerators.
  However, programmability increases power consumption (possibly up to 20-50x compared to a corresponding hardware accelerator) and computational overhead:
    Instruction fetch/decode
    Caches
    Registers
    Control

  Power consumption per operation (Silvén 2008):
    Platform                            Power consumption/operation
    Hardware accelerator, 90 nm CMOS    ~5-10 pJ
    Embedded processor                  ~125-500 pJ
    General purpose processor           ~10-20 nJ

  [Figure: Embedded processor energy consumption breakdown. Dally et al. 2008]

System Model
  The MIMO-OFDM system model requirements are based on the 3G LTE standard.
  The received signal can be described with the equation

    $$ y_s = H_s x_s + \eta_s, \quad s = 1, 2, \ldots, S, $$

  where $S$ is the number of subcarriers, $x_s$ is the transmitted signal, $\eta_s$ is the Gaussian noise vector and $H_s$ is the channel matrix.
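The per-subcarrier model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_subcarrier`, the 2x2 antenna setup, the QPSK alphabet, the randomly drawn channel and the noise level are all hypothetical choices, not values from the slides.

```python
import random

def simulate_subcarrier(H, x, noise_std, rng):
    """Return y = H x + eta for one subcarrier (lists of complex numbers)."""
    y = []
    for row in H:
        # Complex Gaussian noise sample; noise_std is the standard deviation
        # per real dimension (an assumption made for this sketch).
        eta = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        y.append(sum(h * xi for h, xi in zip(row, x)) + eta)
    return y

rng = random.Random(0)
S = 4                                   # illustrative number of subcarriers
QPSK = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]

received = []
for s in range(S):
    # Hypothetical random 2x2 channel matrix H_s and transmitted vector x_s.
    H_s = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
           for _ in range(2)]
    x_s = [rng.choice(QPSK) for _ in range(2)]
    received.append(simulate_subcarrier(H_s, x_s, 0.1, rng))
```

Each subcarrier is handled independently, which is why the detector can be run once per subcarrier index s.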

Nonlinear Detector Algorithms and Simulations

Nonlinear Detector Algorithms
  All algorithms are based on a tree search.
  Example: 2x2 antenna system, 16-QAM, real-valued system model.

  K-best, K=4
    + Fixed computational complexity
    + Fixed throughput
    + Small amount of control
    - Wasted partial Euclidean distance (PED) computation
    - A large list size quickly increases the computational complexity
    - Expensive sorter operation
    - Limited possibility to parallelize the tree search between levels

  Layered ORthogonal lattice Detector (LORD)
    + Fixed computational complexity
    + Fixed throughput
    + A rather simple slicing operation chooses the closest constellation points
    + Supports parallel tree search, also inside the tree
    + Achieves the maximum a posteriori (MAP) solution in the 2x2 antenna case
    - Computational complexity becomes high with higher-order modulation
    - One tree search is required per transmit antenna

  Selective Spanning with Fast Enumeration (SSFE), m=[2 1 2 2]
    + Fixed computational complexity
    + Fixed throughput
    + A rather simple slicing operation replaces the expensive sorting
    + No unnecessary PED computation
    - In the typical case, a high number of nodes (constellation points) is required on the top level of the tree
    - The final list size might be high
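The K-best search can be sketched as follows. This is a minimal illustration, not the implementation from the slides: it assumes the channel has already been QR-decomposed, so the detector works on an upper triangular matrix `R` and a rotated received vector `z`. The function name `k_best` and all sample values are hypothetical. In the real-valued 2x2 16-QAM model, each of the four tree levels takes a value from the 4-PAM alphabet {-3, -1, 1, 3}.

```python
def k_best(R, z, alphabet, K):
    """Breadth-first K-best tree search on an upper triangular system z ~ R x.

    Returns up to K candidates as (PED, symbol_vector), best first.
    """
    n = len(z)
    candidates = [(0.0, [])]            # (accumulated PED, symbols of lower levels)
    for i in range(n - 1, -1, -1):      # start from the last row of R
        expanded = []
        for ped, tail in candidates:
            # tail[0] belongs to level i + 1, tail[1] to level i + 2, and so on.
            interference = sum(R[i][j] * tail[j - i - 1] for j in range(i + 1, n))
            for s in alphabet:
                e = z[i] - interference - R[i][i] * s
                expanded.append((ped + e * e, [s] + tail))
        # The sorter: every child PED is computed, but only the K best survive.
        expanded.sort(key=lambda c: c[0])
        candidates = expanded[:K]
    return candidates

PAM4 = [-3, -1, 1, 3]
# Hypothetical upper triangular matrix and noise-free received vector.
R = [[2.0, 0.5, -0.25, 1.0],
     [0.0, 1.5, 0.5, -0.5],
     [0.0, 0.0, 1.0, 0.25],
     [0.0, 0.0, 0.0, 2.0]]
x_true = [1, -3, 3, -1]
z = [sum(R[i][j] * x_true[j] for j in range(4)) for i in range(4)]

result = k_best(R, z, PAM4, K=4)
best_ped, best_x = result[0]
```

The sketch makes two of the listed drawbacks visible: PEDs are computed for all children of every surviving node even though most are discarded, and the sort at each level is the step that has to finish before the next level can start.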

Simulation
  [Figure: simulation results]


Platforms and Architectures
  Digital signal processors (DSP)
    TMS320C6711 (floating point VLIW (Very Long Instruction Word))
    TMS320C6455 (fixed-point VLIW)
    TMS320C55x (fixed-point, low-power processor)
  System-on-a-chip (SoC)
    Sandbridge SB3500 (multithreading and multiple cores, resembles VLIW)
  Application-specific instruction-set processor (ASIP)
    Transport Triggered Architecture (TTA)
  References: VLIW (Fisher 1983), TTA (Corporaal 1991)

Transport Triggered Architecture
  TTA resembles a VLIW architecture.
  A TTA instruction word consists of multiple moves -> one for each bus.
  Each move determines the data transport on the corresponding bus.
  Very fine-grained control
  Allows optimizations that are not available in conventional processors, e.g. data moves between functional units without using registers.
  The finite state machine (FSM) of a hardware accelerator is replaced by the transport program in TTA.
  About the same number of control bits is required as in FSM-based data path control.
  Depending on the design, it is possible to achieve the same energy efficiency with TTA as with an ASIC.

  Example (a TTA instruction word consists of multiple slots):
    Conventional operation:  add R0, R1 -> R2
    TTA moves:               R0 -> adder.operand
                             R1 -> adder.trigger
                             adder.result -> R2
                             adder.result -> mul.operand

Transport Triggered Architecture
  The bypass network of the processor is exposed to the programmer/compiler.
  Software has complete control over the internal transports.
  Operations are side effects of data transports: there is only one instruction, MOVE!
  Writing data into the triggering port of a functional unit starts the computation.
  The latencies of functional units are visible to the programmer/compiler.
  TCE (TTA Codesign Environment): a C compiler is available in the toolset.
  Mapping TTA onto a platform: FPGA and ASIC
  [Figure: ADD functional unit with operand and trigger inputs, a result output and a known latency]
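The idea that operations are side effects of moves can be illustrated with a toy model. This is a sketch under simplifying assumptions, not how TCE or a real TTA processor works internally: it executes one move at a time, ignores buses and functional unit latencies, and the names `FunctionUnit` and `run_moves` are hypothetical.

```python
import operator

class FunctionUnit:
    """Toy functional unit: writing the trigger port starts the operation."""

    def __init__(self, op):
        self.op = op
        self.operand = 0
        self.result = 0

    def write(self, port, value):
        if port == "operand":
            self.operand = value                        # plain store
        elif port == "trigger":
            self.result = self.op(self.operand, value)  # side effect of the move
        else:
            raise ValueError(f"cannot write port {port!r}")

def run_moves(moves, regs, units):
    """Execute (source, destination) moves in order; latency is ignored."""
    for src, dst in moves:
        if "." in src:
            unit, port = src.split(".")
            if port != "result":
                raise ValueError(f"cannot read port {port!r}")
            value = units[unit].result
        else:
            value = regs[src]
        if "." in dst:
            unit, port = dst.split(".")
            units[unit].write(port, value)
        else:
            regs[dst] = value
    return regs

units = {"adder": FunctionUnit(operator.add), "mul": FunctionUnit(operator.mul)}
regs = {"R0": 3, "R1": 4, "R2": 0, "R3": 5, "R4": 0}
program = [
    ("R0", "adder.operand"),
    ("R1", "adder.trigger"),         # starts the addition
    ("adder.result", "R2"),
    ("adder.result", "mul.operand"), # bypass: no register in between
    ("R3", "mul.trigger"),           # starts the multiplication
    ("mul.result", "R4"),
]
run_moves(program, regs, units)
```

The fourth move forwards the adder result directly to the multiplier, which is the register-free transport between functional units that the slides describe.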


OSEd - Operation Set Editor
  With OSEd it is possible to add, simulate and delete operation definitions.

Results (1)
  Four implementations of the K-best list sphere detector, K=16

    Platform                                                   Clock frequency (MHz)   Throughput (Mbps)
    TMS320C6455                                                1200                    1.8
    Sandblaster 3500                                           1800 (3x600)            3.4
    Sandblaster 3500 + instruction set extension for sorter    1800 (3x600)            32.0
    ASIP based on TTA                                          280                     7.6

Results (2)
  A design example of the K-best-LORD algorithm
  TTA assembly is hand coded -> tight scheduling, all the function units are kept busy.
  2x2 16-QAM system: 35 clock cycles per tree search for LORD and two searches per symbol vector, so 70 clock cycles are required per symbol vector.

    FU                                  # of FUs   Latency (cc)
    MUL                                 4          1
    SLICER                              4          1
    ADD/SUB                             8          1
    SORTER                              1          2
    REG BANKS (8x16bit, 1024 bits)      8          1

    Clock rate [MHz]   Decoding rate [Mbps]
    100                11.4
    300                34.4
    500                57.1
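The decoding rates in the table follow from the cycle count. The sketch below assumes that one symbol vector carries 8 bits (two transmit antennas, 4 bits per 16-QAM symbol) and that a new symbol vector is completed every 70 clock cycles. The function name `decoding_rate_mbps` is illustrative.

```python
def decoding_rate_mbps(clock_mhz, cycles_per_vector=70, bits_per_vector=8):
    """Decoding rate in Mbps for a detector needing a fixed cycle count."""
    vectors_per_us = clock_mhz / cycles_per_vector   # MHz -> vectors per microsecond
    return vectors_per_us * bits_per_vector          # bits per microsecond = Mbps

rates = {f: decoding_rate_mbps(f) for f in (100, 300, 500)}
```

Under these assumptions the formula gives about 11.4 Mbps at 100 MHz and 57.1 Mbps at 500 MHz, matching the table. At 300 MHz it gives about 34.3 Mbps, slightly below the 34.4 Mbps reported on the slide.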

Conclusion
  MIMO combined with OFDM (MIMO-OFDM) provides an opportunity for higher data rates, but real-time implementation has to be pushed to the edge.
  Digital signal processors require (fine-grained) accelerators to achieve the expected data rates.
  Because multiple (wireless communication) standards have to be supported, programmable platforms are of interest.
  Software defined radio is an old concept. However, only now have the techniques become mature enough to start meeting the expectations that have been built on it.

Thank you!