c 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,


© 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. doi: http://dx.doi.org/10.1109/icc.2009.5199493

Stochastic Decoding of LDPC Codes over GF(q)

Gabi Sarkis, Shie Mannor and Warren J. Gross
Department of Electrical and Computer Engineering, McGill University, Montreal, Quebec, Canada H3A 2A7
Email: gabi.sarkis@mail.mcgill.ca, shie.mannor@mcgill.ca, warren.gross@mcgill.ca

Abstract: Nonbinary LDPC codes have been shown to outperform currently used codes for magnetic recording and several other channels. Currently proposed nonbinary decoder architectures have very high complexity for high-throughput implementations and sacrifice error-correction performance to maintain realizable complexity. In this paper, we present an alternative decoding algorithm based on stochastic computation that has a very simple implementation and minimal performance loss when compared to the sum-product algorithm. We demonstrate the performance of the algorithm when applied to a GF(16) code and provide details of the hardware resources required for an implementation.

I. INTRODUCTION

Low-Density Parity Check (LDPC) codes are linear block codes that can achieve performance close to the Shannon limit under iterative decoding. Binary LDPC codes have received much interest and are specified in recent wireless and wireline communications standards, for example digital video broadcast (DVB-S2), WiMAX wireless (IEEE 802.16e) and 10 gigabit Ethernet (IEEE 802.3an). Nonbinary LDPC codes defined over q-ary Galois fields (GF(q)) were introduced in [1] and were shown to perform better than equivalent bit-length binary codes for additive-white Gaussian noise (AWGN) channels. In [2], Song et al. showed that GF(q) LDPC codes significantly outperform binary LDPC and Reed-Solomon (RS) codes for the magnetic recording channel. Chen et al. [3] demonstrated that LDPC codes over GF(16) perform better than RS codes for general channels with bursts of noise, thus making GF(q) LDPC a candidate to replace RS coding in many storage systems. Djordjevic et al. concluded that nonbinary LDPC codes achieve lower BER than other codes while allowing for higher transmission rates when used with the fiber-optic channel [4].
LDPC codes over GF(q) are defined such that elements of the parity check matrix H are elements of GF(q). As in the binary case, these codes are decoded by the sum-product algorithm (SPA) applied to the Tanner graph representation of the parity-check matrix H. Unfortunately, the nonbinary values of H result in very high complexity for the check node updates in the graph, presenting a significant barrier to practical realization. The only hardware implementation in the literature is fully serial, consisting of only one variable node and one check node [5]. There have been a number of approaches to reduce the complexity of the check node update in the literature. MacKay et al. proposed using the fast Fourier transform (FFT) to convert convolution to multiplication in the check nodes [6]. Song et al. use the log-domain to replace multiplication with addition [2]. Declercq et al. introduced the extended min-sum (EMS) algorithm as an approximation to the SPA, computing likelihood values for only a subset of the field elements, thus reducing the number of computations performed [7]. While these approaches are simpler than a direct implementation of the SPA, there is a need to further reduce the complexity for practical decoder implementations.

Recently, a new approach to decoding binary LDPC codes based on stochastic computation ([8], [9]) was introduced in [10]. Stochastic decoders use random bit-streams to represent probability messages and result in simple node hardware and reduced wiring complexity. Subsequently, area-efficient fully-parallel high-throughput decoders with performance close to the SPA were demonstrated in field-programmable gate arrays (FPGAs) [11], [12]. We realized that the complexity benefits of stochastic decoding might be even greater for nonbinary LDPC codes and could result in a practical decoder implementation. In this paper, we present a generalization of stochastic decoding to LDPC codes over GF(q). The algorithm has significantly lower hardware complexity than other nonbinary decoding algorithms in the literature.

II. SUM-PRODUCT DECODING

A. Notation

Since most digital systems transmit data using $2^p$ symbols, the focus in current research is on codes defined over GF($2^p$). In this section, we describe the SPA for decoding LDPC codes over GF($2^p$). However, it should be noted that the SPA works on any field GF(q) with minor modifications to notation and channel likelihood calculations. The elements of GF($2^p$) can be represented as powers of the primitive element $\alpha$, or using polynomials; the latter form is used in this section, so that the polynomial $i(x) = \sum_{l=1}^{p} i_l x^{l-1}$, where $i_l$ are binary coefficients, represents an element of GF($2^p$).

The notation used in this section for representing internode messages is similar to that of [7]; namely, U and V represent messages heading in the direction of check and variable nodes respectively. The subscripts represent the source and destination nodes. For example, $U_{xy}$ is a message from node x to node y. All the messages are probability mass function (PMF) vectors indexed using GF($2^p$) elements. Fig. 1a shows this notation applied to a Tanner graph.

Fig. 1: Stochastic decoder graphs with $X$ and $X^{-1}$ denoting forward and inverse permutation operations. (a) message labels, (b) message propagation with EMs added to the decoder.

B. Algorithm

While nonbinary codes can also be decoded using SPA on Tanner graphs, the check node update is modified because the elements of H are nonbinary. Therefore the check constraint for a check node of degree $d_c$ is:

$$\sum_{k=1}^{d_c} h_k i_k(x) = 0, \tag{1}$$

where $h_k$ is the element of H with indices corresponding to the check and variable nodes of interest. This is different from the binary case where the check constraint is $\sum_{k=1}^{d_c} i_k(x) = 0$. To accommodate this change, Davey et al. [1] assigned values from H as labels to the edges connecting variable and check nodes and integrated the multiplication into the check node functionality. Declercq et al. [7] introduced a third node type called the permutation node, which connects variable and check nodes and performs multiplication as shown in Fig. 1a, therefore reverting the check node constraint to $\sum_{k=1}^{d_c} j_k(x) = 0$. While the two approaches are functionally equivalent, the one in [7] results in simpler equations and implementation since all check nodes of the same degree are identical.

The first step in the SPA is computing the channel likelihood vector $L_v[i(x)]$ for each variable node v, which is computed based on the channel model and modulation scheme. The outgoing message from variable node v to permutation node z is given by:

$$U_{vz} = L_v \times \prod_{p=1,\, p \neq z}^{d_v} V_{pv}, \tag{2}$$

where $\times$ is the term-by-term product of vectors and $d_v$ is the variable node degree. Normalization is needed so that $\sum_{a \in GF(2^p)} U_{vz}[a] = 1$. Permutation nodes implement multiplication by an element from H when passing messages from the variable to check nodes, and multiplication by the inverse of an element from H in the other direction. As shown in [7], the multiplication and multiplication by inverse can be performed using cyclic shifts of the positions of the values in a message vector, except those values indexed by 0.

The parity check constraint does not include multiplication by elements of H anymore; therefore, the check node update equation is the convolution of incoming messages as shown in [7]:

$$V_{ct} = \circledast_{p=1,\, p \neq t}^{d_c} U_{pc}. \tag{3}$$

This convolution represents a significant computational challenge in implementing nonbinary LDPC decoders.

III. STOCHASTIC DECODING

A message in the SPA for LDPC codes over GF(q) is a vector containing the probabilities of each of the q possible symbols. Stochastic decoding uses streams of symbols chosen from GF(q) to represent these messages; the number of occurrences of a symbol in a stream divided by the total number of symbols observed in the stream gives the probability of that symbol. The advantage of utilizing such a method for message passing lies in the simple circuitry required to manipulate the stochastic streams to reflect likelihood changes, as presented in Section III-D. Stochastic decoding of binary LDPC codes results in simple hardware structures. The reader is referred to [8], [9], [10], [11], [12] for details on binary stochastic decoding algorithms and their implementation. Similar notation to the SPA is used when describing the stochastic decoding message updates, the difference being that messages are serial stochastic streams instead of vectors; thus, an index t is used to denote the location of a symbol within a stream and the stream name is overlined, e.g. $\overline{U}_{vp}(t)$.

A. Node Equations

Winstead et al. [13] presented a stochastic decoding algorithm that uses streams of integers instead of the conventional binary streams. In that work, an integer stream encodes the probabilities of the states in a trellis, leading to a demonstration of trellis decoding of a (16,11) Hamming code and a turbo product decoder built from the Hamming component decoders. However, that work did not interpret the integers as finite field symbols and did not utilize GF(q) arithmetic. In this section we present the node equations for a stochastic decoder for LDPC codes over GF(q). Taking the view that the nonbinary streams are composed of finite field elements, we present message update rules that are much simpler than those derived from a straightforward application of the rules in [13]. In particular, the trellis representation of the convolution in the check node reduces to Galois field addition. Section III-E demonstrates the performance of the stochastic algorithm when decoding a (256,128)-symbol LDPC code over GF(16).

Variable Node: A stochastic variable node of degree $d_v$ takes as input $d_v$ stochastic streams from permutation nodes in addition to one generated based on channel likelihood values. In [13], the output of a node is updated if its inputs satisfy some constraint; otherwise, the output remains unchanged from the previous iteration. To implement a variable node constraint on an output message stream at time t, we copy the input symbol to the output symbol if the input symbols on all the other incoming edges are equal at time t. For a stochastic variable node with output $\overline{U}_{vp}$ and inputs $\overline{V}_{iv}$, we propose the following update rule:

$$\overline{U}_{vp}(t) = \begin{cases} a & \text{if } \overline{V}_{iv}(t) = a,\ \forall i : i \neq p \\ \overline{U}_{vp}(t-1) & \text{otherwise} \end{cases} \tag{4}$$
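A minimal Python sketch of the update rule in equation (4) may help make it concrete. It assumes symbols are represented as integers in 0..q-1 and that the channel stream is passed in as one of the incoming edges; the function name `variable_node_update` and its arguments are illustrative and not taken from the paper.

```python
def variable_node_update(inputs, exclude, previous):
    """Stochastic variable-node rule of equation (4) for one output edge.

    inputs   -- symbols (ints in 0..q-1) present on every incoming edge at time t
                (assumption: the channel stream is included in this list)
    exclude  -- index of the edge p whose outgoing symbol is being computed
    previous -- the symbol sent on that edge at time t-1
    """
    # Collect the symbols on all edges other than the destination edge p.
    others = [s for k, s in enumerate(inputs) if k != exclude]
    # Copy the common symbol to the output only when all other inputs agree.
    if others and all(s == others[0] for s in others):
        return others[0]
    # Otherwise hold the previous output symbol.
    return previous
```

For example, with inputs `[3, 5, 5]` and output edge 0, the two remaining inputs agree on 5, so the output becomes 5; with inputs `[3, 5, 6]` the previous output symbol is held.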

Using equation (4) and assuming the inputs are independent, the PMF of the output is:

$$P[\overline{U}_{vp}(t) = c] = \prod_{i \neq p} P[\overline{V}_{iv}(t) = c] + \Big(1 - \sum_{a \in GF(q)} \prod_{i \neq p} P[\overline{V}_{iv}(t) = a]\Big) P[\overline{U}_{vp}(t-1) = c]. \tag{5}$$

As in [13], if the stochastic streams are assumed to be stationary, then $P[\overline{U}_{vp}(t) = c] = P[\overline{U}_{vp}(t-1) = c]$ and the PMF of $\overline{U}_{vp}(t)$ becomes:

$$P[\overline{U}_{vp}(t) = c] = \frac{\prod_{i \neq p} P[\overline{V}_{iv}(t) = c]}{\sum_{a \in GF(q)} \prod_{i \neq p} P[\overline{V}_{iv}(t) = a]}. \tag{6}$$

Equation (6) is identical to the normalized output of a sum-product variable node; therefore, equation (4) is a valid update rule for the stochastic variable node.

Permutation Node: The function of the permutation node is to remove multiplication by elements of H from the check node constraint. In the sum-product algorithm this is achieved by a cyclic shift of the message vector elements as in Section II-B. Here, we demonstrate that multiplying the stochastic stream from a variable to a check node by an element of H accomplishes the same result. Assuming a permutation node p which corresponds to $h = \alpha^{i}$, the permutation node output message in a SPA decoder is defined such that each element in the message vector is given by:

$$U_{pc}[a] = U_{vp}[a \cdot \alpha^{-i}], \quad \forall a \in GF(q).$$

When, in a stochastic decoder, the permutation node multiplies all elements of the input stream by h, the output PMF becomes:

$$P[\overline{U}_{pc}(t) = a] = P[\overline{U}_{vp}(t) = a \cdot \alpha^{-i}].$$

The SPA and stochastic output PMFs are identical, and since the multiplicative group of GF(q) is cyclic and multiplication is closed on GF(q), the stochastic permutation node operation is equivalent to that of the SPA. Similarly, it can be shown that for messages passed from check to variable nodes, the inverse permutation node operation is multiplication by $h^{-1}$. It should be noted that $h \neq 0$, since a value of 0 in H signifies the lack of a connection between a variable and a check node. Therefore, there are no permutation nodes with a multiplier h = 0.

Check Node: When deriving the stochastic update message for a check node, a degree-three node is considered and the result is generalized to a check node of any degree.
Let $U_{1c}$ and $U_{2c}$ be the node inputs, which are assumed to be independent, and $V_{cp}$ its output. From equation (3), the output of such a node when using the SPA is given as:

$$P[V_{cp} = z \mid U_{1c}, U_{2c}] = \sum_{x \oplus y = z} P[U_{1c} = x]\, P[U_{2c} = y], \tag{7}$$

where $\oplus$ is GF(q) addition.

In the stochastic node, we define the output as the GF(q) addition of the inputs, i.e. $\overline{V}_{cp}(t) = \overline{U}_{1c}(t) \oplus \overline{U}_{2c}(t)$. The PMF of the output is computed as:

$$P[\overline{V}_{cp}(t) = z] = P[\overline{U}_{1c}(t) \oplus \overline{U}_{2c}(t) = z] = \sum_{x \oplus y = z} P[\overline{U}_{1c}(t) = x]\, P[\overline{U}_{2c}(t) = y]. \tag{8}$$

The PMFs (7) and (8) are identical; therefore it is concluded that GF(q) addition is a valid update message for a degree-3 stochastic check node. Since the output of a check node can be computed recursively [7], the previous conclusion can be generalized to a check node of any degree, and the output messages for these nodes are given as:

$$\overline{V}_{cp}(t) = \sum_{i=1,\, i \neq p}^{d_c} \overline{U}_{ic}(t), \tag{9}$$

where the summation is GF(q) addition. It can be readily shown that the previous node equations reduce to the binary ones presented in [10] for GF(2).

B. Noise-Dependent Scaling and Edge-Memories

In binary stochastic decoding the switching activity can become very low, resulting in poor bit-error-rate performance. This phenomenon is called latch-up and is caused by cycles in the graph that cause the stochastic streams to become correlated, invalidating the independent stream assumption used to derive equations (4) and (9). Two solutions were proposed in [10]: noise-dependent scaling and edge memories. Both of these methods are used to improve the performance of the GF(q) decoder.

Noise-dependent scaling increases switching activity by scaling down the channel likelihood values. For example, when transmitting data using BPSK modulation over an AWGN channel, the scaled likelihood of each received bit $l'(i)$ is calculated by:

$$l'(i) = [l(i)]^{2\alpha\sigma_n^2 / Y},$$

where $l(i)$ is the unscaled bit likelihood, $\sigma_n^2$ is the noise variance, and the ratio $\alpha / Y$ is determined offline to yield the best performance in the SNR range of interest. Accordingly, the equation for computing the channel likelihood values becomes:

$$L[i(x)] = \prod_{k=1}^{p} [l(i_k)]^{2\alpha\sigma_n^2 / Y}. \tag{10}$$

Edge memories (EM) are finite depth buffers inserted between variable nodes and permutation nodes that randomly reorder symbols in the output streams of variable nodes; thus, they break correlation between streams without affecting the overall stream statistics. The EM contents are updated with the variable node output when the node update condition is satisfied, and remain intact otherwise. The output of the EM is that of the variable node in the first case, or a randomly selected symbol from its contents in the second. Due to the memory's finite length, older symbols are discarded when new ones are added. Figure 1b demonstrates the message passing mechanism and the location of edge memories within a stochastic decoder.

For complexity comparison, Table I provides the number of operations needed to compute a single check node output message in the FFT-SPA and Log-FFT-SPA algorithms as presented in [2]. It should be noted that the operations for the SPA are for real numbers and quantization will degrade the decoder performance, while those for the stochastic decoder are over a finite field GF($2^p$).

| Algorithm | Multiplication | Addition | LUT |
|---|---|---|---|
| FFT-SPA [2] | $2^p(d_c^2 + 4d_c)$ | $p2^{p+1}d_c + 2^p$ | 0 |
| Log-FFT-SPA [2] | 0 | $(p2^{p+1} + 2^{p+2})d_c$ | $p2^{p+1}d_c$ |
| Stoc. | $d_c - 1$ | $d_c - 1$ | 0 |
| Stoc.-LUT | 0 | $d_c - 1$ | $d_c - 1$ |

TABLE I: The number of operations needed by FFT-SPA, Log-FFT-SPA, and stochastic decoders to compute a single check node output message including the permutation node operations.

C. Algorithm Description

At the beginning of the algorithm the edge memories are initialized using scaled channel likelihood values as PMFs for their content distribution. The following steps describe the stochastic decoding algorithm for each decoding cycle.

1: Variable node messages are computed using equation (4), edge memories are updated where appropriate, and messages are sent from edge memories to permutation nodes.
2: Permutation nodes perform GF(q) multiplication on incoming messages and send the results to check nodes.
3: Check node messages are computed as in equation (9) and are sent to permutation nodes.
4: Permutation nodes perform GF(q) multiplication by inverse and send resulting messages to variable nodes.
5: Each variable node contains counters C[a] corresponding to GF(q) elements. These counters are incremented based on incoming messages and the channel message L(t). A variable node belief is defined as $\arg\max_a C[a]$.
6: Variable node beliefs are updated accordingly.
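The edge memory behaviour of Section III-B and the belief counters of step 5 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the class and function names are hypothetical, symbols are integers in 0..q-1, the initial contents are drawn from a PMF with `random.Random.choices`, and the counter is incremented once per observed symbol (the text does not spell out the exact increment rule).

```python
import random
from collections import Counter, deque


class EdgeMemory:
    """Finite-depth buffer between a variable node and a permutation node."""

    def __init__(self, depth, initial_symbols, rng=None):
        # deque(maxlen=depth) drops the oldest symbol when a new one is appended.
        self.buffer = deque(initial_symbols, maxlen=depth)
        self.rng = rng if rng is not None else random.Random()

    @classmethod
    def from_pmf(cls, depth, pmf, rng=None):
        """Fill the memory by sampling symbols 0..q-1 from a channel PMF."""
        rng = rng if rng is not None else random.Random()
        symbols = rng.choices(range(len(pmf)), weights=pmf, k=depth)
        return cls(depth, symbols, rng)

    def step(self, node_output, updated):
        """Return the symbol forwarded to the permutation node this cycle."""
        if updated:
            # Update condition satisfied: store and forward the node output.
            self.buffer.append(node_output)
            return node_output
        # Otherwise forward a randomly selected stored symbol; contents unchanged.
        return self.rng.choice(list(self.buffer))


def update_belief(counters, symbol):
    """Increment C[symbol] and return the current belief, arg max over C[a]."""
    counters[symbol] += 1
    return counters.most_common(1)[0][0]
```

A typical use would create one `EdgeMemory` per outgoing variable-node edge and one `Counter()` per variable node, calling `step` in step 1 and `update_belief` in step 5 of each decoding cycle.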
The streams are processed on a symbol-by-symbol basis, one symbol each cycle (steps 1-5), until the algorithm converges (the variable node beliefs satisfy the check constraints) or a maximum number of iterations is reached. As in the binary algorithm presented in [10], the processing is not packetized.

D. Implementation

While the stochastic decoding algorithm is defined for any finite field, the implementation presented in this section is limited to GF($2^p$) as these are the most utilized fields and they yield the simplest implementation. The polynomial representation of GF($2^p$) is used when implementing the algorithm. This choice greatly simplifies the circuitry needed to perform GF($2^p$) addition. All gate number estimates assume 2-input logic gates in a tree configuration.

Fig. 2: GF(8) stochastic elements. (a) $d_v = 2$ variable node, (b) $d_c = 4$ check node.

Variable Node: To implement the operation specified by equation (4), a GF($2^p$) equality check is needed. XNOR gates and an AND gate are used to perform the check and provide an enable (latch) signal to an edge memory as shown in Fig. 2a. To extend the circuit for a higher order field, more XNOR gates are used and connected to a larger AND gate. This accommodates the increase in the number of bits required to represent each GF($2^p$) symbol in the stochastic streams. For higher degree nodes, the number of inputs to each XNOR gate is increased. The total number of gates, without counters, required by a variable node is:

$$[p(d_v - 1)\,\text{XNOR} + (p - 1)\,\text{AND}]\, d_v. \tag{11}$$

Each variable node requires a maximum of $2^p$ counters to track occurrences of each symbol and determine the node belief. The size of the EMs associated with a variable node of degree $d_v$ is $d_v\, l\, p$ bits, where l is the EM length.

Permutation Node: Permutation nodes can be implemented using GF($2^p$) multipliers. For a particular code, the symbols arriving at a permutation node are always multiplied by the same element of H. As a result, the multiplier can be designed to multiply by a specific (constant) element of GF($2^p$) instead of a generic GF($2^p$) multiplier, significantly reducing circuit complexity.
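Because multiplication by a fixed element h is linear over GF(2), a constant multiplier reduces to a fixed pattern of XORs on the p bit lines. The sketch below illustrates this for GF(8), assuming the primitive polynomial $x^3 + x + 1$ (the paper does not state which polynomial is used); `gf_mul` and `constant_multiplier` are hypothetical helper names introduced only for this example.

```python
def gf_mul(a, b, p=3, poly=0b1011):
    """Multiply two GF(2^p) elements stored as p-bit ints (polynomial form).

    poly is the field polynomial including its x^p term; the default
    x^3 + x + 1 for GF(8) is an assumption made for this example.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current shifted copy of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & (1 << p):
            a ^= poly            # reduce modulo the field polynomial
    return result


def constant_multiplier(h, p=3, poly=0b1011):
    """Build a multiply-by-h function that uses only XORs of fixed columns."""
    # Column j is h * x^j; it lists which output bits input bit j feeds.
    columns = [gf_mul(h, 1 << j, p, poly) for j in range(p)]

    def multiply(symbol):
        out = 0
        for j in range(p):
            if (symbol >> j) & 1:
                out ^= columns[j]
        return out

    return multiply
```

With these assumptions, the elements 3 and 6 are inverses of each other in GF(8), so `constant_multiplier(3)` and `constant_multiplier(6)` play the roles of the forward and inverse permutation operations for an edge labelled h = 3.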
Alternatively, look-up tables (LUT) can be used since their size would not be large. The multiplication by inverse for messages passed in the other direction is implemented in a similar manner. If LUTs are used to implement multiplication, each node requires two LUTs: one for multiplication by h and one for multiplication by $h^{-1}$. An operation LUT contains $2^p - 1$ entries, each p bits wide.

Check Node: The outgoing messages from check nodes are GF($2^p$) summations of incoming messages. Since the GF($2^p$) symbols are represented using the polynomial form, this operation can be realized utilizing XOR operations between corresponding bit lines of messages. The circuit in Fig. 2b is an example of a degree 4 check node in GF(8). To implement a higher degree check node, the number of inputs to each XOR gate is increased to account for the extra incoming messages. Extending this circuit to higher order fields can be done by adding more XOR gates. The total number of gates required by a check node is:

$$[p(d_c - 1)\,\text{XOR}]\, d_c. \tag{12}$$
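The check node operation and the gate count of equation (12) can be illustrated in a few lines of Python. This is a sketch under the assumption that symbols are p-bit integers in polynomial form, so that GF($2^p$) addition is a bitwise XOR; the function names are illustrative. The software shortcut of XOR-ing the total with each input is not the gate structure of Fig. 2b; it only yields the same outputs, because every element of GF($2^p$) is its own additive inverse.

```python
from functools import reduce
from operator import xor


def check_node_outputs(inputs):
    """Equation (9) for every edge: XOR of the symbols on all other edges."""
    total = reduce(xor, inputs, 0)
    # In characteristic 2, removing an input from the sum equals XOR-ing it again.
    return [total ^ s for s in inputs]


def check_node_xor_gates(p, d_c):
    """Gate count of equation (12): p * (d_c - 1) XOR gates per output, d_c outputs."""
    return p * (d_c - 1) * d_c
```

For the degree 4 check node in GF(8) mentioned above (p = 3, d_c = 4), equation (12) evaluates to 3 * 3 * 4 = 36 XOR gates.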

Fig. 3: FER for a (256,128)-symbol (2,4)-regular LDPC code over GF(16). EM length = 50, $\alpha / Y = 0.5$. (Plot of frame error rate versus $E_b/N_0$ (dB) from 0 to 4 dB; curves: SP, Stochastic $DC_{max} = 10^6$, Stochastic $DC_{max} = 10^5$.)

Fig. 4: BER for a (256,128)-symbol (2,4)-regular LDPC code over GF(16). EM length = 50, $\alpha / Y = 0.5$. (Plot of bit error rate versus $E_b/N_0$ (dB) from 0 to 4 dB; curves: SP, Stochastic $DC_{max} = 10^6$, Stochastic $DC_{max} = 10^5$.)

| SNR (dB) | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 |
|---|---|---|---|---|---|
| $DC_{avg}$ ($DC_{max} = 10^6$) | 22599 | 8888 | 4243 | 2329 | 1433 |
| $DC_{avg}$ ($DC_{max} = 10^5$) | 17958 | 8511 | 4209 | 2326 | 1433 |

TABLE II: Average number of decoding cycles.

E. Performance

Figures 3 and 4 demonstrate the performance of the stochastic decoder compared to that of a SPA decoder when decoding a (256,128)-symbol LDPC code over GF(16) [14], when using an AWGN channel, BPSK, and random codewords. The SPA decoder has a maximum of 1000 iterations, while the stochastic decoder's maximum is $10^6$ decoding cycles (DC). The performance of the two decoders is very similar and the two decoders perform identically for higher SNR values. The change in the slope of the error rate graph was also observed in [14]. We note that the maximum number of decoding cycles is much greater than the average number of decoding cycles as shown in Table II, with $DC_{avg}$ determining the decoder throughput. Figures 3 and 4 demonstrate that, at higher SNRs, $DC_{max}$ can be reduced with a small performance loss. It should be noted that the number of iterations in the SPA decoder and decoding cycles in the stochastic decoder are not directly comparable. SPA iterations involve complex operations; for example, the node operations in EMS [15] involve sorting and iterating over incoming message elements, thus requiring many clock cycles. In a stochastic decoder, a decoding cycle is very simple and can be completed within a single clock cycle.
Also, due to the nature of stochastic computation, the proposed implementation lends itself to pipelining (due to the random order of the messages, the feedback loop in the graph is broken, allowing pipelining [12]), thus enabling clock rates faster than those possible with the SPA.

IV. CONCLUSION

In this paper we presented a stochastic decoding algorithm which we expect to enable practical high-throughput decoding of LDPC codes over GF($2^p$).

ACKNOWLEDGEMENT

The authors would like to thank Prof. D. Declercq from ENSEA for helpful discussions.

REFERENCES

[1] M. Davey and D. MacKay, "Low-density parity check codes over GF(q)," IEEE Commun. Lett., vol. 2, no. 6, pp. 165-167, 1998.
[2] H. Song and J. Cruz, "Reduced-complexity decoding of Q-ary LDPC codes for magnetic recording," IEEE Trans. Magn., vol. 39, no. 2, pp. 1081-1087, 2003.
[3] J. Chen, L. Wang, and Y. Li, "Performance comparison between nonbinary LDPC codes and Reed-Solomon codes over noise bursts channels," in Proc. International Conference on Communications, Circuits and Systems, L. Wang, Ed., vol. 1, 2005, pp. 1-4.
[4] I. Djordjevic and B. Vasic, "Nonbinary LDPC codes for optical communication systems," IEEE Photonics Technology Letters, vol. 17, no. 10, pp. 2224-2226, 2005.
[5] C. Spagnol, W. Marnane, and E. Popovici, "FPGA implementations of LDPC over GF($2^m$) decoders," in Proc. IEEE Workshop on Signal Processing Systems, W. Marnane, Ed., 2007, pp. 273-278.
[6] D. MacKay and M. Davey, "Evaluation of Gallager codes for short block length and high rate applications," in Codes, Systems and Graphical Models. Springer-Verlag, 2000, pp. 113-130.
[7] D. Declercq and M. Fossorier, "Decoding algorithms for nonbinary LDPC codes over GF(q)," IEEE Trans. Commun., vol. 55, no. 4, pp. 633-643, 2007.
[8] B. Gaines, Advances in Information Systems Science. Plenum, New York, 1969, ch. 2, pp. 37-172.
[9] V. Gaudet and A. Rapley, "Iterative decoding using stochastic computation," Electronics Letters, vol. 39, no. 3, pp. 299-301, Feb. 2003.
[10] S. Sharifi Tehrani, W. Gross, and S. Mannor, "Stochastic decoding of LDPC codes," IEEE Commun. Lett., vol. 10, no. 10, pp. 716-718, 2006.
[11] S. Sharifi Tehrani, S. Mannor, and W. J. Gross, "An area-efficient FPGA-based architecture for fully-parallel stochastic LDPC decoding," in Proc. IEEE Workshop on Signal Processing Systems, 17-19 Oct. 2007, pp. 255-260.
[12] S. Sharifi Tehrani, S. Mannor, and W. J. Gross, "Fully parallel stochastic LDPC decoders," IEEE Trans. Signal Process., vol. 56, no. 11, pp. 5692-5703, Nov. 2008.
[13] C. Winstead, V. Gaudet, A. Rapley, and C. Schlegel, "Stochastic iterative decoders," in Proc. International Symposium on Information Theory ISIT, 2005, pp. 1116-1120.
[14] C. Poulliat, M. Fossorier, and D. Declercq, "Design of regular (2, $d_c$)-LDPC codes over GF(q) using their binary images," IEEE Trans. Commun., vol. 56, no. 10, pp. 1626-1635, October 2008.
[15] A. Voicila, F. Verdier, D. Declercq, M. Fossorier, and P. Urard, "Architecture of a low-complexity non-binary LDPC decoder for high order fields," in Proc. International Symposium on Communications and Information Technologies ISCIT 07, F. Verdier, Ed., 2007, pp. 1201-1206.