Iterative Demodulation and Decoding of DPSK Modulated Turbo Codes over Rayleigh Fading Channels

Bin Zhao and Matthew C. Valenti
Dept. of Comp. Sci. & Elect. Eng., West Virginia University, Morgantown, WV 26506-6109, USA
bzhao@csee.wvu.edu, mvalenti@wvu.edu

Abstract

In this paper we propose a new method to implement turbo codes over fading channels by extending the idea of turbo DPSK [3] to include turbo outer codes. An analytical tool has been developed to estimate the performance of extended turbo DPSK. A simulation study shows that extended turbo DPSK performs worse than BPSK modulated turbo codes with coherent detection. The energy inefficiency of extended turbo DPSK results from a large energy gap between BPSK and DPSK modulation at low Es/No, where turbo codes operate. Increasing the rate of the turbo code actually reduces the loss in energy efficiency, because higher rate turbo codes work in a relatively higher Es/No region where the energy gap between DPSK and BPSK shrinks.

1. Introduction

Due to their remarkable performance in AWGN and flat fading channels with perfect channel estimation, turbo codes have been widely studied and applied to different communication systems. However, in noncoherent and partially coherent channels, a severe penalty in energy efficiency occurs [1]. For instance, in AWGN channels, turbo codes lose 3.5 dB in energy efficiency with differentially detected DPSK modulation, while with noncoherent FSK they lose 6-7 dB. Therefore, in the design of turbo-coded systems, coherent detection techniques are usually necessary to preserve this performance.

M-ary phase shift keying (MPSK) is a popular digital modulation technique suitable for coherent detection. The information signal is encoded in the phase of the carrier signal. Demodulation requires absolute phase information to achieve the best performance. However, channel distortion usually makes coherent detection much more complicated and costly.
Besides complexity, the disadvantages of coherent detection include sensitivity to acquisition time and phase tracking errors. Possible solutions include using a training sequence or using pilot symbol assisted modulation. The pilot symbol approach is considered to have higher bandwidth and energy efficiency than the training sequence approach. The fading rate of the channel is a critical factor in the design of the insertion rate and pattern for pilot symbol assisted modulation.

Another common technique to relieve the difficulty of coherent detection is to differentially encode the information before transmission. Coherent detection applied to differentially encoded PSK (DEPSK) results in roughly twice the bit error rate (BER) of PSK, because one error in phase detection affects both the current and the subsequent recovered information bit. A much simpler partially coherent alternative is differentially encoded, differentially detected PSK (DPSK), in which the information is recovered from the phase difference of adjacent symbols. Although differential detection is much easier to implement, DPSK suffers even more performance degradation than DEPSK. DPSK performance is observed to be asymptotically 2-3 dB worse than PSK at high signal-to-noise ratio (SNR) when no coding is used. However, the gap in energy efficiency between PSK and DPSK is much larger at low SNR, where most concatenated codes work, and it tends to increase as SNR decreases. This phenomenon gives an intuitive explanation for the severe performance degradation of turbo codes in noncoherent and partially coherent channels.

In the recent effort to achieve near-coherent detection performance in unknown or time-varying channels, where absolute coherent detection is difficult, two major techniques have been proposed: one uses pilot symbols in conjunction with coherent detection to estimate the channel [2], and the other uses per-survivor processing in conjunction with differential modulation [3][4]. In [2], pilot symbols are used to obtain initial channel estimates. After each iteration of turbo decoding, the decoded data is used along with the pilot symbols to refine the channel estimates. For channel estimation, a Wiener filter is used to obtain a minimum mean-square error estimate of the channel. Although the channel estimation error raises the error floor of the turbo code, simulation results indicate a performance penalty of less than 0.5 dB relative to ideal coherent detection at a normalized fade rate of 0.005.
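The differential encoding and the two detection options described above (DEPSK and DPSK) can be illustrated with a minimal sketch. It assumes binary signalling, a reference bit of 0, and a noiseless channel with an arbitrary constant phase rotation; the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def dpsk_encode(bits):
    """Differentially encode bits: b[k] = b[k-1] XOR a[k], with reference b[-1] = 0."""
    encoded = np.empty(len(bits), dtype=int)
    prev = 0
    for k, a in enumerate(bits):
        prev ^= int(a)
        encoded[k] = prev
    return encoded

def bpsk(bits):
    """Map bit 0 to +1 and bit 1 to -1."""
    return 1.0 - 2.0 * np.asarray(bits)

def depsk_decode(hard_bits):
    """DEPSK: coherent hard decisions followed by differential decoding."""
    hard_bits = np.asarray(hard_bits)
    prev = np.concatenate(([0], hard_bits[:-1]))
    return hard_bits ^ prev

def dpsk_detect(rx_with_ref):
    """DPSK: decide each bit from the phase difference of adjacent symbols.

    rx_with_ref must start with the received reference symbol.
    """
    phase_diff = rx_with_ref[1:] * np.conj(rx_with_ref[:-1])
    return (phase_diff.real < 0).astype(int)

bits = np.array([1, 0, 1, 1, 0, 0, 1])
enc = dpsk_encode(bits)

# One wrong coherent decision corrupts two recovered bits (error doubling in DEPSK).
corrupted = enc.copy()
corrupted[2] ^= 1
error_positions = np.flatnonzero(depsk_decode(corrupted) != bits)

# Differential detection needs no absolute phase: rotate everything by 2 radians.
rx = bpsk(np.concatenate(([0], enc))) * np.exp(1j * 2.0)
detected = dpsk_detect(rx)
```

In this toy run a single flipped symbol decision produces errors at two adjacent bit positions, which is the mechanism behind the roughly doubled BER of DEPSK, while the differentially detected bits survive the unknown phase rotation.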
In [3], a DPSK modulated convolutional code is modeled as a serially concatenated code and termed turbo DPSK. The proposed a posteriori probability (APP) DPSK demodulator incorporates per-survivor processing [4] and linear prediction for channel estimation within the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm. Simulation curves show that the proposed coded DPSK system performs asymptotically as well as coded coherent PSK in an AWGN channel, while it can marginally outperform coded coherent PSK in Rayleigh fading due to the time diversity introduced by the differential encoder.

The initial motivation for our paper is to find a good method to implement turbo codes in unknown fading channels. In Section 2, we describe the structure of extended turbo DPSK with turbo outer codes. In Section 3, we develop a simple analytical tool to predict the performance of extended turbo DPSK. Instead of using per-survivor processing, we assume perfect channel estimation in order to investigate the potential advantage of extending turbo DPSK with turbo outer codes. The performance reported here can be used as a benchmark for future research on this topic.

2. System Model

We use the same coding scheme as [3]; the major difference is that a (K=4) turbo code rather than a convolutional code serves as the outer code. Conceptually, we consider the DPSK inner encoder to be a recursive rate 1 convolutional code, which in turn can be represented by a two-state trellis. One critical component of [3] is the APP DPSK demodulator, which incorporates channel estimation into the MAP decoder by using linear prediction and per-survivor processing. A time window sliding over multiple adjacent stages of the DPSK trellis constructs a super trellis for this APP demodulator. In this paper, however, we assume perfect channel estimation, and thus the APP DPSK super trellis can be reduced to a simple two-state DPSK trellis without losing optimality.

There are two other important components in our system diagram, shown in Figure 1: the turbo interleaver and the channel interleaver.

[Figure 1: System model for serial concatenation of turbo code and DPSK modulation. Blocks shown: turbo encoder (two RSC encoders, turbo interleaver, MUX), channel interleaver, DPSK, channel, APP demodulator, channel deinterleaver and interleaver, turbo decoder, with extrinsic information fed back from the turbo decoder to the APP demodulator.]

The turbo interleaver provides interleaving gain due to its spectral thinning functionality [5], while the channel interleaver not only offers interleaving gain but also helps to overcome channel correlation. In [6], it is shown that with perfect channel estimation, the fully interleaved channel is at least 3 dB more energy efficient at a fixed BER than correlated Rayleigh fading with BT = 0.01, and even more so than correlated fading with BT = 0.001. A spread random interleaver is used for channel interleaving [7].

As for the outer code, we follow the UMTS turbo code standard [8]. Each recursive constituent code is a rate ½ convolutional code with generator polynomial (1, 13/15). Both trellises are terminated with independent tails. The design of our turbo interleaver also follows the UMTS standard, which provides a degree of structured pseudo-randomness. The log-MAP algorithm [9] is applied to decode both the inner and the outer code. For convenience of explanation, we refer to the iterative decoding within the turbo decoder as the local iteration, and to the iterative DPSK demodulation and (local turbo) decoding as the overall iteration.

The soft outputs of both the inner and the outer code are derived using

$$ \Lambda = \log \frac{\Pr(X = 1 \mid Y)}{\Pr(X = 0 \mid Y)} \tag{1} $$

where Λ is the log-likelihood ratio (LLR) of X, X represents the bit for which an LLR is desired, and Y represents the received channel sequence. X can be either an input bit or an output bit of the constituent encoder. For the APP DPSK demodulator, X represents the input information of the DPSK encoder, while for the turbo decoder, X is the output of the turbo encoder, representing either a systematic or a parity bit. The LLR can be further broken down into three terms:

$$ \Lambda = \log \frac{\Pr(Y \mid X = 1)}{\Pr(Y \mid X = 0)} + \log \frac{\Pr(Y_i \mid X = 1)}{\Pr(Y_i \mid X = 0)} + \log \frac{\Pr(X = 1)}{\Pr(X = 0)} \tag{2} $$

The third term in (2) is the extrinsic information from the other block in the concatenated scheme. For the DPSK demodulator, it is the extrinsic information from the turbo decoder.
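As a small illustration of the quantities above, the following sketch shows the LLR convention of (1), the three-term sum of (2), and the max* (Jacobian logarithm) operation on which log-MAP decoding [9] is built. It is only a sketch: the function names are hypothetical, and the three-term helper simply adds whatever term values it is given.

```python
import math

def max_star(a, b):
    """Jacobian logarithm log(exp(a) + exp(b)), evaluated in a numerically stable way."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def llr_to_prob_one(llr):
    """Pr(X = 1) implied by an LLR defined as log(Pr(X = 1) / Pr(X = 0)), as in (1)."""
    return 1.0 / (1.0 + math.exp(-llr))

def total_llr(first_term, second_term, third_term):
    """Sum of the three log-ratio terms in (2)."""
    return first_term + second_term + third_term

def remove_third_term(llr_out, third_term):
    """Strip the term supplied by the other block before passing the LLR on."""
    return llr_out - third_term

llr = total_llr(1.2, 0.8, 1.5)
passed_on = remove_third_term(llr, 1.5)
```

The numeric values are arbitrary; the point is only that the term received from the other block can be removed by a simple subtraction in the LLR domain.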
To avoid information saturation during iterative processing, the extrinsic information supplied by the turbo decoder (the third term in (2)) should be subtracted from the demodulator output (2) before the result is fed into the turbo decoder as a soft input. The soft output from the turbo decoder has the same form as (1) and (2); however, for the turbo code, the LLR value of the extrinsic information is set to zero, so the third term in (2) disappears. Within the turbo decoder, only one local iteration is used per overall iteration. The first term in (2) is the soft value passed in from the output of the APP DPSK demodulator. The second term is the extrinsic information generated by the action of the turbo decoder. Therefore, after interleaving, it can be treated as extrinsic information and fed into the DPSK APP demodulator during the next round of iteration.

A generalized expression for the branch metric used by the APP DPSK demodulator under unknown fading channels is [3]:

$$ \gamma(\tilde{b}) = \begin{cases} -\dfrac{|y - \tilde{b}\,F(Y,B)|^2}{2\sigma_n^2} + z - \log(1 + e^{z}), & \text{for } \tilde{a} = 1 \\[2ex] -\dfrac{|y - \tilde{b}\,F(Y,B)|^2}{2\sigma_n^2} - \log(1 + e^{z}), & \text{for } \tilde{a} = 0 \end{cases} \tag{3} $$

where z is the LLR of the input extrinsic information, y is the channel output, \tilde{b} is the channel input, \tilde{a} is the DPSK input, F(Y,B) is the estimated channel gain, which is a function of the survivor path B and the channel output sequence Y, and σ_n^2 is the noise variance. Under perfect channel estimation, F(Y,B) is the fading coefficient h, while σ_n^2 is the variance of the Gaussian noise component. For an AWGN channel, F(Y,B) = 1. The branch metric of the turbo outer code can also be expressed as (3), except that z is set to 0 and F(Y,B) = 1.

3. Analysis of Extended Turbo DPSK

In order to analyze the iterative decoding mechanism of concatenated codes, [10] models the probability density of the extrinsic information in iterative decoding using a Gaussian approximation and then computes the mean and variance using the concept of Gaussian density evolution. The input and output Gaussian means and variances of individual SISO modules can be determined by simulation. The Gaussian density evolution can be visualized by plotting, for each component decoder, a curve of output SNR versus input SNR. When the input-output SNR relationships of the two component decoders are plotted on the same figure, they form an iterative decoding tunnel. Only when the SNR of the channel is above a certain threshold is the decoding tunnel open, so that the turbo processing converges. The turbo outer code used in our proposed structure makes tunnel analysis much more difficult.
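The input and output SNRs used in such a density-evolution plot can be estimated from simulated LLRs. The sketch below assumes the SNR of the extrinsic information is defined as the squared mean divided by the variance of the LLRs, after flipping the sign of each LLR according to the transmitted bit; that definition and the function name are assumptions made for illustration, not details taken from [10].

```python
import numpy as np

def llr_snr_db(llrs, bits):
    """Estimate the SNR (mean^2 / variance, in dB) of LLR samples.

    LLRs follow the convention of (1): positive values favour bit 1.
    Each LLR is sign-aligned with its transmitted bit so that correct,
    confident decisions contribute positive values.
    """
    llrs = np.asarray(llrs, dtype=float)
    signs = 2.0 * np.asarray(bits) - 1.0   # bit 1 -> +1, bit 0 -> -1
    aligned = llrs * signs
    return 10.0 * np.log10(aligned.mean() ** 2 / aligned.var())

# Tiny hand-checkable example: aligned values are [1, 3, 1, 3],
# so the mean is 2 and the (population) variance is 1.
example_snr_db = llr_snr_db([1.0, 3.0, -1.0, -3.0], [1, 1, 0, 0])
```

Evaluating such a function on the input and output LLRs of one SISO module, over a range of operating points, would give one curve of the decoding tunnel.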
Therefore, we present here a simpler alternative analytical tool to predict the decoding performance of extended turbo DPSK. We model extended turbo DPSK as a serially concatenated code with a turbo outer code and a rate 1 convolutional inner code. The bit stream after concatenated encoding can therefore still be treated as a BPSK modulated signal. The analysis of extended turbo DPSK requires the definition of three terms:

Ideal APP demodulator: A transformer that maps the input sequence at a particular Es/No to an output sequence with a different Es/No according to the decoding rule of the APP DPSK demodulator.

Ideal turbo decoder: A transformer that maps the input sequence at a particular Es/No to an output sequence with a different Es/No according to the decoding rule of the turbo code.

Processing gain: Es/No of the output sequence divided by Es/No of the input sequence (or the difference of the two if Es/No is expressed in dB).

When the input SNR is sufficiently high, iterative APP demodulation and turbo decoding can be considered a positive feedback system between two cascaded SNR transformers. As a result, the total processing gain comes mainly from turbo processing. Performance converges as the extrinsic information saturates through the process of turbo iteration.

For convenient analysis, we plot the BER curves of the turbo code with one local iteration, the turbo code with 10 local iterations, APP DPSK demodulation, and coherent BPSK demodulation together in Figure 2 (appended to the end of the paper). To understand the utility of this figure, first pick an input Es/No, point 0 on the BPSK curve; it has a corresponding Es/No, point 1 on the DPSK curve. Through ideal DPSK APP demodulation, it is transformed into the corresponding output Es/No, point 2 on the BPSK curve, with the same BER. The Es/No of point 2 on the BPSK curve is now the input Es/No of the turbo code. The output of the turbo code with one local iteration is denoted as point 3 on the turbo code curve. The corresponding output Es/No of the turbo code with the same BER is mapped to point 4 on the BPSK curve, which finishes the first round of iteration. For the next iteration, simply repeat the above process with a starting Es/No at point 4. From this example, we see that the APP DPSK demodulator transforms the input Es/No at point 0 to the output Es/No at point 2, while the turbo decoder transforms the Es/No at point 2 to the output Es/No at point 4.
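The walk from point 0 to point 4 can be mimicked with two cascaded SNR transformers. In the sketch below the two transformer functions are toy stand-ins with made-up numbers (a fixed 2 dB loss for the demodulator and a step-shaped gain for the decoder); they are not read from Figure 2 and serve only to show how the Es/No evolves from one overall iteration to the next.

```python
def processing_gain_db(es_no_in_db, es_no_out_db):
    """Processing gain in dB: output Es/No minus input Es/No."""
    return es_no_out_db - es_no_in_db

def toy_app_demod(es_no_db):
    """Hypothetical demodulator transformer: a constant 2 dB loss."""
    return es_no_db - 2.0

def toy_turbo_decoder(es_no_db):
    """Hypothetical decoder transformer: 3 dB gain above -2 dB input, 1 dB below."""
    gain_db = 3.0 if es_no_db >= -2.0 else 1.0
    return es_no_db + gain_db

def run_overall_iterations(start_db, demod, decoder, n_iter):
    """Track Es/No (dB) at the start of each overall iteration."""
    trajectory = [start_db]
    current = start_db
    for _ in range(n_iter):
        current = decoder(demod(current))   # point 0 -> point 2 -> point 4
        trajectory.append(current)
    return trajectory

improving = run_overall_iterations(0.0, toy_app_demod, toy_turbo_decoder, 3)
degrading = run_overall_iterations(-0.5, toy_app_demod, toy_turbo_decoder, 3)
```

With these toy curves the Es/No rises by 1 dB per overall iteration from a start of 0 dB and falls by 1 dB per iteration from a start of -0.5 dB, so the net processing gain per iteration decides which way the trajectory moves.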
Therefore, the APP DPSK demodulator has a negative processing gain, which degrades the performance, while the turbo decoder has a positive processing gain. If, during each local iteration, the improvement gained through the turbo decoder is greater than the degradation caused by the APP DPSK demodulator, the iterative processing converges; otherwise it does not. In Figure 2, we show a rectangular box abcd of convergence, where the Es/No improvement by the turbo decoder equals the degradation caused by the APP DPSK demodulator. Whenever the starting Es/No is higher than point a, the iterations converge; if the starting Es/No is lower than point a, they do not. In this way we can visualize the evolution of Es/No as the turbo processing goes on. In Figure 2, we sketch the routes of both convergence and non-convergence. In the tunnel theory, point a is equivalent to the threshold input SNR required to open the decoding tunnel.

From Figure 2, we can also see that the performance of extended turbo DPSK can be improved by increasing the number of local iterations. If we use 10 local iterations within the turbo decoder, the Es/No transformer corresponding to the turbo code becomes the steepest curve in the plot. As a result, the threshold box shifts left by about 0.5 dB. Another advantage of using more local iterations within the turbo decoder is that, with a steeper waterfall curve, extended turbo DPSK converges much faster. However, this requires a significantly higher computational load than extended turbo DPSK with just one local iteration.

In Figure 2, the threshold point a for the rate ½ extended turbo DPSK is at about Es/No = 0.5 dB, which corresponds to Eb/No = 3.5 dB. Figure 3 shows results for the rate 1/3 turbo code, where the threshold point a now occurs at about Es/No = -1.3 dB, which corresponds to Eb/No = 3.47 dB. Thus, through this analysis, we can predict that although the rate 1/3 turbo code outperforms the rate ½ turbo code with coherently detected BPSK, the two will have approximately the same performance in the extended turbo DPSK scheme. Moreover, the 3-3.5 dB loss in energy efficiency for turbo codes with differential detection reported in [1] can be predicted by a simple comparison of the DPSK and BPSK curves at very low SNR.

4. Simulation Results

Altogether 8 groups of simulation curves have been generated, based on two code rates (½ and 1/3), two kinds of channel models (AWGN and fully interleaved Rayleigh fading), and two frame sizes (190 and 640). For the AWGN channel, every group contains four curves. The first one (i.e., the most energy efficient) corresponds to a turbo code with BPSK modulation and coherent detection. The second one is for the extended turbo DPSK technique with 10 overall iterations and 1 local iteration per overall iteration. The third one is for the APP demodulated turbo code with 10 local iterations and only 1 overall iteration (i.e., no feedback from the turbo decoder to the APP DPSK demodulator). The last one is conventional DPSK detection followed by 10 local iterations of turbo decoding. Like the third curve, no information is fed back from the turbo decoder to the DPSK demodulator. For the fully interleaved Rayleigh fading channel, only the first three curves are generated for each group, since differential detection is impossible in a fully interleaved channel.
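The threshold values quoted in Section 3 in both Es/No and Eb/No are tied together by the code rate. The following sketch shows the conversion, assuming binary modulation with one coded bit per channel symbol so that Es = R · Eb; the function name is hypothetical.

```python
import math

def es_no_to_eb_no_db(es_no_db, code_rate):
    """Convert Es/No (dB) to Eb/No (dB) for one coded bit per symbol.

    With Es = code_rate * Eb, Eb/No = Es/No - 10*log10(code_rate) in dB.
    """
    return es_no_db - 10.0 * math.log10(code_rate)

eb_no_rate_half = es_no_to_eb_no_db(0.5, 1.0 / 2.0)    # rate 1/2 threshold
eb_no_rate_third = es_no_to_eb_no_db(-1.3, 1.0 / 3.0)  # rate 1/3 threshold
```

Under this assumption the two thresholds come out at roughly 3.5 dB and 3.47 dB in Eb/No, in line with the prediction that the two code rates perform about the same in the extended turbo DPSK scheme.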

[Figure 4: Performance of rate 1/3 turbo code with various detection techniques in AWGN channel, frame size: 640. Curves: conventional DPSK with turbo code; DPSK APP decoder with turbo code; turbo DPSK, 10 iterations; turbo BPSK.]

[Figure 5: Performance of rate ½ turbo code with various detection techniques in AWGN channel, frame size: 640. Same four curves.]

[Figure 6: Performance of rate 1/3 turbo code with various detection techniques in AWGN channel, frame size: 190. Same four curves.]

[Figure 7: Performance of rate ½ turbo code with various detection techniques in AWGN channel, frame size: 190. Same four curves.]

With a frame size of 640 bits, a code rate of 1/3, and an AWGN channel (Figure 4), there is an energy gap of about 3 dB between the turbo code with coherently detected BPSK modulation (called the ideal turbo code from now on for brevity) and the turbo code with differentially detected DPSK. For extended turbo DPSK, the waterfall region appears around 3.5 dB, which closely matches our prediction. There is still an energy gap of 2.5 dB between the ideal turbo code and extended turbo DPSK. This is primarily because, at low SNR, there is a large energy gap between BPSK and DPSK that cannot be recovered by iterative processing. However, there is a processing gain of about 0.5 dB between the extended turbo DPSK scheme and the differentially detected turbo code.

For the rate ½ code (Figure 5), the waterfall region of the coherently detected turbo code is pushed out by about 0.8 dB, while the rate ½ extended turbo DPSK has the same performance as its rate 1/3 counterpart. Therefore, the energy gap between the rate ½ coherent turbo code and the corresponding extended turbo DPSK shrinks to 1.75 dB. For the shorter frame size of 190 bits (Figures 6 and 7), the energy gaps are about the same as for the 640 bit case, except that the waterfall regions of the curves are less steep and are pushed further out.

[Figure 8: Performance of rate 1/3 turbo code with various detection techniques in fading channel, frame size: 640. Curves: DPSK APP decoder with turbo code; turbo DPSK, 10 iterations; turbo BPSK.]

[Figure 9: Performance of rate ½ turbo code with various detection techniques in fading channel, frame size: 640. Same three curves.]

[Figure 10: Performance of rate 1/3 turbo code with various detection techniques in fading channel, frame size: 190. Same three curves.]

[Figure 11: Performance of rate ½ turbo code with various detection techniques in fading channel, frame size: 190. Same three curves.]

In the fully interleaved Rayleigh fading channel, the energy gap among the curves tends to increase. Specifically, with a frame size of 640 (Figure 8), the rate 1/3 coherent turbo code is 1 dB worse than its counterpart in the AWGN channel, while the rate 1/3 extended turbo DPSK performs 2.5 dB worse than its AWGN counterpart. Therefore, the energy gap between the rate 1/3 ideal turbo code and the rate 1/3 extended turbo DPSK increases to 4 dB due to channel fading. We note the interesting fact that although the rate 1/3 extended turbo DPSK has the same performance as the rate ½ extended turbo DPSK in AWGN, it performs better than the rate ½ code in fading. This is because the curves for BPSK and APP demodulated DPSK have a larger gap between them in fading channels than in AWGN. Thus, for the rate ½ code (Figures 9 and 11), the energy gap between the ideal turbo code and extended turbo DPSK shrinks to 3 dB.

In short, the energy gap between the ideal coherent turbo code and extended turbo DPSK tends to decrease as the code rate increases in both AWGN and fading channels. The extent of the shrinkage depends on code rate and channel type. The analytical tool developed in Section 3 can be applied to predict the performance of extended turbo DPSK. Using this tool, we would predict that the performance curves of the ideal turbo code and extended turbo DPSK eventually merge at relatively high code rates.

5. Conclusions

In this paper we extended the concept of turbo DPSK to include a turbo outer code.
We also developed an analytical tool to predict the iterative decoding performance of extended turbo DPSK. Through simulation, we found that the extended turbo DPSK technique performs worse than ideal coherently detected turbo codes with BPSK modulation. There is an energy gap of about 2.5 dB in an AWGN channel for the rate 1/3 code, but the gap decreases to 1.75 dB for the rate ½ code. In fully interleaved Rayleigh fading, the gap widens to 4 dB for the rate 1/3 code and 3 dB for the rate ½ code. The energy inefficiency of extended turbo DPSK results from the poor performance of APP DPSK demodulation at low Es/No, which is where turbo codes typically operate. Our prediction of the threshold Es/No based on the analytical tool matches well with the waterfall region of the simulation curves. We also found that increasing the code rate actually diminishes the gap between the ideal turbo code and extended turbo DPSK, because higher rate turbo codes work in a relatively higher Es/No region where the energy gap between APP DPSK and coherent BPSK demodulation shrinks.

In our future research, higher rate turbo DPSK (higher than ½) will be studied in order to find a suitable code rate for which the energy gap between the ideal turbo code and extended turbo DPSK disappears. The performance of turbo DPSK under correlated fading channels needs further investigation. Finally, per-survivor processing techniques will be applied to turbo DPSK under unknown channel states.

References

[1] E. K. Hall and S. G. Wilson, "Turbo codes for noncoherent channels," in Proc. IEEE GLOBECOM, pp. 915-921, Dec. 1992.
[2] M. C. Valenti and B. D. Woerner, "Iterative channel estimation and decoding of pilot symbol assisted turbo codes over flat-fading channels," IEEE Journal on Selected Areas in Commun., to appear, 2001.
[3] P. Hoeher and J. Lodge, "Turbo DPSK: Iterative differential PSK demodulation and channel decoding," IEEE Trans. Commun., vol. 47, pp. 837-842, June 1999.
[4] R. Raheli, A. Polydoros, and C. K. Tzou, "Per-survivor processing: A general approach to MLSE in uncertain environments," IEEE Trans. Commun., vol. 43, pp. 354-364, Feb. 1995.
[5] S. Benedetto, D. Divsalar, et al., "Serial concatenation of interleaved codes: Performance analysis, design, and iterative decoding," TDA Progress Report 42-126, Aug. 15, 1996.
[6] E. K. Hall and S. G. Wilson, "Design and analysis of turbo codes on Rayleigh fading channels," IEEE Journal on Selected Areas in Commun., vol. 16, pp. 160-174, Feb. 1998.
[7] S. Dolinar and D. Divsalar, "Weight distributions for turbo codes using random and nonrandom permutations," JPL TDA Progress Report, vol. 42, pp. 56-65, Aug. 15, 1995.
[8] European Telecommunications Standards Institute, "Universal Mobile Telecommunications System (UMTS); Multiplexing and Channel Coding (FDD)," 3GPP TS 125.212 version 3.4.0, pp. 14-20, Sept. 23, 2000.
[9] P. Robertson, P. Hoeher, and E. Villebrun, "Optimal and sub-optimal maximum a posteriori algorithms suitable for turbo decoding," European Trans. on Telecommun., vol. 8, no. 2, pp. 119-125, Mar./Apr. 1997.
[10] D. Divsalar, "Low complexity turbo-like codes," Proc. Second International Symp. on Turbo Codes and Related Topics, pp. 73-80, Sept. 2000.

[Figure 2: Threshold box and convergence routes of rate 1/2 turbo DPSK. Horizontal axis: Es/No, from -8 to 8. Curves: DPSK APP demodulator; BPSK; 1 iteration turbo decoding; 10 iterations turbo decoding. Annotations: threshold box abcd, points 0 to 4, convergence and non-convergence routes.]

[Figure 3: Threshold box of iterative decoding in rate 1/3 turbo DPSK. Horizontal axis: Es/No, from -8 to 8. Curves: DPSK APP demodulator; BPSK; 1 iteration turbo decoding; 10 iterations turbo decoding. Annotation: threshold box abcd.]