VA04D 16 State DVB-S2/DVB-S2X Viterbi Decoder. Small World Communications.


16 State DVB-S2/DVB-S2X Viterbi Decoder
Preliminary Product Specification

Features
16 state (memory m = 4, constraint length 5) tail-biting Viterbi decoder
Rate 1/5 (inputs can be punctured for higher rates)
Optional external or standard DVB-S2/DVB-S2X code polynomials
Data length K from 4 to 32 bits
Up to 382 MHz internal clock
Up to 46 Mbit/s decoding speed (K = 16)
6-bit received signed-magnitude data
1315 6-input LUTs
Asynchronous-logic-free design
Free simulation software
Available as VHDL core for Xilinx FPGAs under SignOnce IP License. ASIC, Altera, Lattice and Microsemi cores available on request.

Introduction
The VA04D is a 16 state tail-biting error control decoder using the maximum likelihood Viterbi algorithm. The decoder is designed to decode the DVB-S2 [1] or DVB-S2X [2] standard rate 1/5 tail-biting convolutional code. External code inputs and input data puncturing allow other 16 state tail-biting codes to be decoded, with data length K from 4 to 32 bits.

To reduce complexity with little performance degradation, the VA04D uses only a single Viterbi decoder with 16 add-compare-select (ACS) circuits working in parallel. A single external K×30 synchronous RAM is used for the input data.

The input data is read for 2L+K clock cycles, where K is the data length and L = 32 is the window training length. The decoder inputs the data in reverse order modulo K so as to minimise the decoder delay. The last L+K path decisions are stored in memory, where a traceback is performed, taking an additional L+K clock cycles. The last K bits of the traceback are output as the decoded data. A pipeline delay of D = 5 clock cycles gives a total decoding time of 3L+2K+D = 101+2K clock cycles.

Figure 1 shows the schematic symbol for the decoder.
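The cycle counts in the Introduction can be turned into a quick budget check. The following is a minimal sketch, not part of the core or its simulation software: the function names are hypothetical, L = 32 and D = 5 are the values quoted above, and the read start address (K - 1 + L) mod K is taken from the Decoder Operation section of this specification.

```python
# Illustrative sketch only; function names are hypothetical, not part of the VA04D core.
L = 32  # window training length stated in the text
D = 5   # pipeline delay in clock cycles stated in the text


def check_length(K):
    """Reject data lengths outside the supported range of 4 to 32 bits."""
    if not 4 <= K <= 32:
        raise ValueError("K must be between 4 and 32")


def decode_cycles(K):
    """Total decoding time in clock cycles: 3L + 2K + D."""
    check_length(K)
    return 3 * L + 2 * K + D


def decoding_speed_mbps(clock_mhz, K):
    """Decoded bits per microsecond (Mbit/s) for an internal clock given in MHz."""
    return clock_mhz * K / decode_cycles(K)


def read_addresses(K):
    """RAM read addresses over the 2L + K input cycles, counting down modulo K.

    Assumes the start address (K - 1 + L) mod K given under Decoder Operation.
    """
    check_length(K)
    start = (K - 1 + L) % K
    return [(start - t) % K for t in range(2 * L + K)]


if __name__ == "__main__":
    print(decode_cycles(16))               # 3*32 + 2*16 + 5 = 133
    print(decoding_speed_mbps(300.0, 16))  # throughput at a 300 MHz clock
    print(read_addresses(16)[:5])          # first few read addresses
```

With this addressing, the read that follows the L training symbols lands on address K - 1, which is where the reversed main sequence begins.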
The VHDL core can be used with Xilinx Integrated Software Environment (ISE) or Vivado software to implement the core in Xilinx FPGAs. Table 1 shows the performance achieved with various Xilinx parts. Tcp is the minimum clock period over recommended operating conditions. These performance figures may change due to device utilisation and configuration.

Figure 1: VA04D schematic symbol. Pins shown: R0I[5:0], R1I[5:0], R2I[5:0], R3I[5:0], R4I[5:0], START, CLK, K[5:0], G0I[1:3], G1I[1:3], G2I[1:3], G3I[1:3], G4I[1:3], GS, RST, RR, RA[4:0], XD, XDR, XDA[4:0], BUSY, FINISH.

Signal Descriptions
BUSY        Decoder Busy
CLK         System Clock
FINISH      Decoder Finish
G0I to G4I  External Code
GS          External Code Select
            0 = DVB-S2/S2X polynomials
            1 = Use G0I to G4I
K           Data Length (4 to 32)
R0I to R4I  Received Data
RA          Received Data Address
RR          Received Data Ready
RST         Synchronous Reset
START       Decoder Start
XD          Decoded Data Output
XDA         Decoded Data Address
XDR         Decoded Data Ready

Code
Figure 2 gives a block diagram of a rate 1/5 16 state (m = 4) non-systematic encoder. X is the data input and Y0 to Y4 are the coded outputs.

Figure 2: 16 state non-systematic convolutional encoder. The data input X feeds four delay elements D holding the state $(s_0, s_1, s_2, s_3)$; each output Y0 to Y4 is formed using the coefficients $g_i^1$, $g_i^2$, $g_i^3$.

The code polynomial coefficients are GiI[j] = $g_i^j \in \{0, 1\}$, $0 \le i \le 4$, $1 \le j \le 3$.

Table 1: Performance of Xilinx parts.
                           Data Rate (Mbit/s)
Xilinx Part    Tcp (ns)    K=8     K=16    K=32
XC5VLX30-1     4.572       14.9    26.3    42.4
XC5VLX30-2     3.914       17.4    30.7    49.5
XC5VLX30-3     3.480       19.6    34.5    55.7
XC6VLX75T-1    3.876       17.6    31.0    50.0
XC6VLX75T-2    3.424       19.9    35.1    56.6
XC6VLX75T-3    3.093       22.1    38.8    62.7
XC7Z010-1      5.554       12.3    21.6    34.9
XC7Z010-2      4.592       14.8    26.1    42.2
XC7Z010-3      4.103       16.6    29.3    47.2
XC7A35T-1      5.476       12.4    21.9    35.4
XC7A35T-2      4.496       15.2    26.7    43.1
XC7A35T-3      3.999       17.0    30.0    48.4
XC7K70T-1      3.502       19.5    34.3    55.3
XC7K70T-2      2.825       24.2    42.5    68.6
XC7K70T-3      2.612       26.1    46.0    74.2

The encoder polynomials are defined as

$g_i(D) = 1 + g_i^1 D + g_i^2 D^2 + g_i^3 D^3 + D^4$    (1)

where D is the delay operator and + indicates modulo 2 (exclusive OR) addition. It is usual practice to express the coefficients in octal notation, e.g., $g_4 = 31_8 = 11001_2$, which gives $g_4(D) = 1 + D + D^4$. This corresponds to G4I[1:3] = $100_2$.

The DVB-S2/DVB-S2X standard is selected when GS = 0. It has code polynomials $g_0 = 25_8$, $g_1 = 27_8$, $g_2 = 33_8$, $g_3 = 37_8$ and $g_4 = 31_8$. When GS = 1, the external code inputs G0I to G4I are selected.

Tail biting is achieved by initialising the encoder shift register (without transmitting any coded bits) with the last m = 4 bits of the K data bits so that $s_3 = x_{K-4}$, $s_2 = x_{K-3}$, $s_1 = x_{K-2}$ and $s_0 = x_{K-1}$, where $(s_3, s_2, s_1, s_0)$ is the encoder state and $x_0$ to $x_{K-1}$ is the length K input data. The K data bits are then input to produce the 5K coded bits. No tail bits are transmitted.

Viterbi Decoder
The Viterbi decoder is designed to efficiently decode short length tail-biting convolutional codes.
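For generating test vectors, the tail-biting encoder described in the Code section can be modelled in a few lines. This is a minimal sketch under stated assumptions, not vendor code: the names `taps` and `encode_tail_biting` are hypothetical, and the bit ordering follows the octal example above, where the most significant of the five bits is the coefficient of D^0.

```python
# Illustrative model of the rate 1/5, 16 state tail-biting encoder (names are hypothetical).
M = 4  # encoder memory
DVB_POLYS = (0o25, 0o27, 0o33, 0o37, 0o31)  # g0..g4 in octal, as listed in the text


def taps(g):
    """Coefficients of D^0..D^4 for a 5-bit polynomial; the MSB is the D^0 term."""
    return [(g >> (M - j)) & 1 for j in range(M + 1)]


def encode_tail_biting(x, polys=DVB_POLYS):
    """Return K tuples (Y0..Y4) for the K data bits in x, with no tail bits."""
    K = len(x)
    if not 4 <= K <= 32:
        raise ValueError("K must be between 4 and 32")
    # Initialise the state with the last m data bits: s0 = x[K-1], ..., s3 = x[K-4].
    state = [x[K - 1], x[K - 2], x[K - 3], x[K - 4]]
    coded = []
    for bit in x:
        window = [bit] + state  # window[j] is the bit multiplied by the D^j coefficient
        coded.append(tuple(
            sum(c & w for c, w in zip(taps(g), window)) & 1  # modulo 2 sum
            for g in polys
        ))
        state = [bit] + state[:-1]  # shift the register by one position
    return coded


if __name__ == "__main__":
    for symbol in encode_tail_biting([1, 0, 0, 0, 0, 0, 0, 0]):
        print(symbol)
```

Because the register starts in the state it will end in, cyclically shifting the data cyclically shifts the coded output, which makes a convenient self-check for any model of the code.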

Figure 3: BPSK and QPSK signal sets.

Theory of Operation
The Viterbi decoding algorithm [3] finds the most likely transmitted sequence given the received noisy sequence. For binary phase shift keying (BPSK) or quadrature phase shift keying (QPSK) modulation the received signal is described by

$R_k^i = A\left(\frac{1 - 2y_k^i}{\sqrt{m}} + n_k^i\right)$    (2)

where A is the signal amplitude, $y_k^i \in \{0, 1\}$, i = 0 to 4, correspond to the coded bits, m = 1 for BPSK or m = 2 for QPSK, and $n_k^i$ is a Gaussian distributed random variable with zero mean and normalised variance $\sigma^2$. Figure 3 shows the signal sets for BPSK and QPSK. We have

$\sigma^2 = \left(2mR\,\frac{E_b}{N_0}\right)^{-1}$    (3)

where $E_b/N_0$ is the energy per bit to single sided noise density ratio and R = K/N is the code rate (K is the number of information bits and N is the number of coded bits). Since a zero is transmitted as $+A/\sqrt{m}$ and a one is transmitted as $-A/\sqrt{m}$, the sign bit of a noiseless $R_k^i$ in two's complement notation is equal to $y_k^i$.

The value of A directly corresponds to the 6-bit signed-magnitude inputs. The 6-bit inputs have 63 quantisation regions with a central dead zone. The quantisation regions are labelled from -31 to +31. Due to quantisation and limiting effects the value of A should be adjusted according to the received signal to noise ratio. For example, for rate 1/5, we recommend that A = 10.7 be used. This value of A lies in quantisation region 11 (which has a range between 10.5 and 11.5).

Example 1: Rate 1/5 BPSK code operating at $E_b/N_0$ = 3.5 dB. From (3) we have $\sigma^2$ = 1.1167.

Decoder Operation
The optimum maximum likelihood decoder for a tail-biting convolutional code requires $2^m$ = 16 Viterbi decoders, one for each of the $2^m$ identical start and end states of the code. The sequence with the smallest state metric (SM) is then chosen as the decoded sequence. To reduce decoder complexity, a suboptimal algorithm is used.
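Returning briefly to equations (2) and (3) in Theory of Operation, the sketch below reproduces Example 1 and generates unquantised channel values for a test bench. It is only an illustration: the function names are hypothetical, and it assumes the reading of equation (2) in which the signal term is (1 - 2y)/sqrt(m), consistent with a zero being sent as +A/sqrt(m).

```python
import math
import random


def noise_variance(ebn0_db, rate, m):
    """Normalised noise variance from equation (3): 1 / (2 m R Eb/N0)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # convert dB to a linear ratio
    return 1.0 / (2.0 * m * rate * ebn0)


def received_value(y, amplitude, sigma2, m, rng):
    """One unquantised received value for coded bit y (assumed form of equation (2))."""
    signal = (1 - 2 * y) / math.sqrt(m)        # 0 maps to +1/sqrt(m), 1 to -1/sqrt(m)
    noise = rng.gauss(0.0, math.sqrt(sigma2))  # zero mean Gaussian noise
    return amplitude * (signal + noise)


if __name__ == "__main__":
    sigma2 = noise_variance(3.5, 1.0 / 5.0, 1)  # Example 1: rate 1/5 BPSK at 3.5 dB
    print(round(sigma2, 4))
    rng = random.Random(2017)  # fixed seed so a run is repeatable
    print([received_value(y, 10.7, sigma2, 1, rng) for y in (0, 1, 1, 0, 1)])
```

The recommended amplitude A = 10.7 is passed in as a parameter so that other values can be tried at different signal to noise ratios.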
The input data is first input for L training symbols, followed by K symbols (the main sequence) and then L post-training symbols. The training symbols ensure that the SMs are close to their correct values at the start of the main sequence. The post-training symbols ensure that reliable path decisions are available at the end of the main sequence. For a large enough L, there is little performance degradation compared to the optimal algorithm.

If the symbols are input in a forward sequence, the traceback operation will output the decoded bits in a reverse sequence. Outputting the decoded bits in a forward sequence would then require a small output memory and an additional delay of K clock cycles. To avoid this reversing step, we instead input the data in reverse sequence. The traceback will then output the data in a forward sequence. A reverse trellis is used, which is obtained by time reversing the code polynomials, for example 10111 becomes 11101.

If the main sequence is input in reverse order as $R_{K-1}$ down to $R_0$, then the traceback is output as $X_0$ to $X_{K-1}$. The L training and post-training input symbols are then added to the main sequence in groups of K symbols, with the first symbol having address (K - 1 + L) mod K = (L - 1) mod K. For example, for K = 16, the first symbol has address 31 mod 16 = 15.

Figure 4 illustrates the Viterbi decoder input timing for K = 16. After the START signal is sent, the decoder will read the received data at the CLK speed. It is assumed that the received data is stored in a synchronous read RAM of size K×30. The received data ready signal RR goes high to indicate the data to be read from the address given by RA[4:0]. The BUSY signal remains high during decoding. The START signal is ignored during decoding, except for the last decoded bit that is output.

Figure 4: Viterbi Decoder Input Timing (K = 16). Signals shown: CLK, START, RR, RA (counting down from 15 to 0), RxI ($R_{15}$ down to $R_0$) and BUSY.

Figure 5 illustrates the Viterbi decoder output timing for K = 16. The decoded output XD is output while XDR is high, with XDA[4:0] indicating the bit address. FINISH goes high for the last decoded bit.

Figure 5: Viterbi Decoder Output Timing (K = 16). Signals shown: CLK, XDR, XDA (counting up from 0 to 15), XD ($X_0$ to $X_{15}$), BUSY and FINISH.

Data Format
The decoder uses 6-bit signed-magnitude quantisation for R0I to R4I. Table 2 shows the 6-bit quantisation ranges. Note that 0 and 32 indicate the central dead zone and have the same range. Most analog to digital (A/D) converters do not have a central dead zone. For maximum performance, we recommend that 7-bit A/Ds are used with the output converted to 6 bits so that the appropriate ranges are obtained.

For input data quantised to less than 6 bits, the data should be mapped into the most significant bit positions of the input, with the next bit equal to 1 and the remaining least significant bits tied low. For example, for 3-bit received data R0T[2:0], where R0T[2] is the sign bit, we have R0I[5:3] = R0T[2:0] and R0I[2:0] = 4 in decimal (100 in binary).

Table 2: Quantisation for R0I to R4I.
Decimal    Binary    Range
31         011111    30.5 and above
30         011110    29.5 to 30.5
2          000010    1.5 to 2.5
1          000001    0.5 to 1.5
0          000000    -0.5 to 0.5
32         100000    -0.5 to 0.5
33         100001    -1.5 to -0.5
34         100010    -2.5 to -1.5
62         111110    -30.5 to -29.5
63         111111    -30.5 and below
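The data format rules can be expressed as two small helper functions. This is a sketch only, with hypothetical function names; in particular, the treatment of values that fall exactly on a region boundary (rounded away from zero here) and the choice of code 0 or 32 inside the dead zone (taken from the sign of the input) are assumptions that the text does not specify.

```python
def quantise_6bit(value):
    """Map a real received value to a 6-bit signed-magnitude code (0 to 63).

    Bit 5 is the sign (1 = negative) and bits 4:0 hold the magnitude, limited to 31.
    Boundary handling is an assumption: exact half values round away from zero.
    """
    magnitude = min(31, int(abs(value) + 0.5))
    sign = 1 if value < 0 else 0
    return (sign << 5) | magnitude


def expand_to_6bit(code, bits):
    """Place a narrower signed-magnitude code in the MSBs of a 6-bit input.

    The bit below the data is set to 1 and any remaining low bits are 0.
    """
    if not 1 <= bits <= 6:
        raise ValueError("bits must be between 1 and 6")
    if bits == 6:
        return code
    shift = 6 - bits
    return (code << shift) | (1 << (shift - 1))


if __name__ == "__main__":
    print(quantise_6bit(10.7))                        # region 11, as in the text
    print(format(expand_to_6bit(0b101, 3), "06b"))    # 3-bit data mapped to 6 bits
```

For 3-bit data this reproduces the example above, where R0I[2:0] is fixed at 100 in binary.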

Punctured Code Operation
Manual puncturing can be performed by forcing R0I[4:0] to R4I[4:0] low. For example, rate 2/3 can be obtained by puncturing a rate 1/2 code with puncturing patterns of 11 for R0I and 10 for R1I. That is, R0I is not punctured, R1I is forced low every other decoded bit and R2I to R4I are always punctured.

Other Inputs
The RST input, when high, synchronously forces all flip flops low. This is useful for VHDL simulations where flip flops are initially in an unknown state.

Decoder Speed
The decoding speed is given by

$f_d = \frac{F_d}{2 + (3L + D)/K}$    (4)

where $F_d$ is the internal clock speed, K is the data length, L = 32 is the training length and D = 5 is the pipeline delay. For example, if K = 16 and $F_d$ = 300 MHz, the decoding speed is 36.0 Mbit/s.

Simulation Software
Free software for simulating the Viterbi decoder in additive white Gaussian noise (AWGN) is available by sending an email to info@sworld.com.au with "va04dsim request" in the subject header. The software uses an exact functional simulation of the Viterbi decoder, including all quantisation and limiting effects.

Figure 6 shows the bit error rate (BER) and frame error rate (FER) performance obtained for the standard rate 1/5 16 state tail-biting convolutional code decoded by the Viterbi decoder for K = 16 and A = 10.7. No puncturing is performed.

Ordering Information
SW SOP (SignOnce Project License)
SW SOS (SignOnce Site License)
SW VHD (VHDL ASIC License)

All licenses are perpetual and include Xilinx VHDL cores, unlimited instantiations, free updates for one year and free lifetime support. SOP allows the core to be used for a specified project. SOS allows unlimited projects for a specified development site. VHD includes a VHDL core customised for your ASIC. Note that Small World Communications only provides software and does not provide the actual devices themselves. Please contact Small World Communications for a quote.
Figure 6: Standard 16 state, rate 1/5, K = 16 tail-biting convolutional code performance. BER and FER (0.1 down to 1e-006) are plotted against Eb/No (0 to 5 dB).

References
[1] ETSI, "Digital Video Broadcasting (DVB); Second generation framing structure, channel coding and modulation systems for broadcasting, interactive services, news gathering and other broadband satellite applications; Part 1: DVB-S2," ETSI EN 302 307-1 V1.4.1, Nov. 2014.
[2] ETSI, "Digital Video Broadcasting (DVB); Second generation framing structure, channel coding and modulation systems for broadcasting, interactive services, news gathering and other broadband satellite applications; Part 2: DVB-S2 Extensions (DVB-S2X)," ETSI EN 302 307-2 V1.1.1, Feb. 2015.
[3] A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. IT-13, pp. 260-269, Apr. 1967.

Small World Communications does not assume any liability arising out of the application or use of any product described or shown herein; nor does it convey any license under its copyrights or any rights of others. Small World Communications reserves the right to make changes, at any time, in order to improve performance, function or design and to supply the best product possible. Small World Communications will not assume responsibility for the use of any circuitry described herein. Small World Communications does not represent that devices shown or products described herein are free from patent infringement or from any other third party right. Small World Communications assumes no obligation to correct any errors contained herein or to advise any user of this text of any correction if such be made. Small World Communications will not assume any liability for the accuracy or correctness of any engineering or software support or assistance provided to a user.

© 2017 Small World Communications. All Rights Reserved. Xilinx and Vivado are registered trademarks of Xilinx, Inc. All XC prefix product designations are trademarks of Xilinx, Inc. All other trademarks and registered trademarks are the property of their respective owners.

Small World Communications, 6 First Avenue, Payneham South SA 5070, Australia.
info@sworld.com.au
ph. +61 8 8332 0319
fax +61 8 8332 3177
http://www.sworld.com.au

Version History
0.00  28 July 2017. Preliminary product specification.
0.01  7 August 2017. Deleted DCS input. Added GS and G0I to G5I inputs. Increased range of K from 8-16 bits to 4-32 bits.
1.00  18 August 2017. First release. Added decoder complexity and performance values. Updated channel performance figure. Decreased pipeline delay from D = 6 to D = 5. Corrected RA[4:0] start address.