Information Processing and Combining in Channel Coding

Johannes Huber and Simon Huettinger
Chair of Information Transmission, University Erlangen-Nürnberg, Cauerstr. 7, D-91058 Erlangen, Germany
Email: [huber, huettinger]@lnt.de

Abstract: It is proposed to characterize the performance of coding schemes by the mutual information between the encoder input and the decoder output sequence versus the capacity of the channel in between, instead of the conventional diagrams of bit error probability versus signal-to-noise ratio or raw bit error ratio. In this way a description is obtained which proves to be nearly independent of the channel model used. Furthermore, it universally accounts for the quality of the reliability estimation provided by the decoder. Hence, the information processing of coding schemes is characterized in a unified framework: different codes as well as different decoding techniques can be compared and evaluated. By deriving tight bounds on the bit error probability, both a direct connection to conventional performance evaluation techniques is established and a very general method for the analysis of concatenated coding schemes is developed. For this purpose, information combining is introduced, which links the proposed characterization to the transfer characteristics used within the EXIT charts of S. ten Brink. The generalized description of the information processing of component codes and decoders, together with information combining, makes the analysis of iterative decoding of arbitrarily multiply concatenated coding schemes, incorporating serial and parallel concatenated structures, feasible. For this analysis the transfer characteristics of the constituent coding schemes are sufficient, as long as they are linked by large interleavers. Based on this analysis, which is of extremely low computational complexity, the design of novel multiply concatenated structures is demonstrated.

Keywords: MAP decoding, soft-in soft-out decoding, asymptotic analysis, multiply concatenated codes.
1. INTRODUCTION: A common scale for all channels

With respect to hard output decoding, a coding scheme as shown in Fig. 1 is usually described by its average bit error ratio versus the crossover probability ε or erasure probability p of a binary symmetric channel (BSC) or binary erasure channel (BEC), or versus the signal-to-noise ratio E_b/N_0 between the binary phase shift keying modulated transmit signal and the additive white Gaussian noise (BPSK AWGN channel).

Figure 1: System model

The average bit error ratio is given by

BER = (1/K) Σ_{j=1}^{K} BER[j] = E_j{BER[j]}    (1)

with BER[j] = Pr(Û[j] ≠ U[j]). Traditional performance plots for convolutional codes are shown in Fig. 2 for the BPSK AWGN channel as well as the BSC. Due to the different scaling, a comparison of both results is impossible.

Figure 2: Traditional (top) and unified (bottom) hard output performance plots for convolutionally encoded transmission over the BPSK AWGN channel (left) and the BSC (right) with BCJR decoding [1], for memories ν = 1 to ν = 6.

But, as there is a one-to-one correspondence between the parameters E_b/N_0 and ε of memoryless channels and the capacities of the channel models, a unified
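This one-to-one correspondence is easy to evaluate numerically. The sketch below maps each channel parameter to the common capacity scale (helper names are our own; the BPSK AWGN capacity is estimated by Monte Carlo integration rather than by the exact integral):

```python
import math
import random

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity_bsc(eps):
    """Capacity of a BSC with crossover probability eps."""
    return 1.0 - h2(eps)

def capacity_bec(delta):
    """Capacity of a BEC with erasure probability delta."""
    return 1.0 - delta

def capacity_bpsk_awgn(es_n0_db, n=100_000, seed=1):
    """Monte Carlo estimate of the BPSK AWGN capacity in bits per channel
    use: C = 1 - E[log2(1 + exp(-L))], with L the channel L-value of the
    transmitted symbol."""
    rng = random.Random(seed)
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * es_n0))   # noise std deviation for E_s = 1
    acc = 0.0
    for _ in range(n):
        y = 1.0 + rng.gauss(0.0, sigma)      # transmit x = +1 w.l.o.g.
        l_value = 2.0 * y / sigma ** 2       # channel L-value
        acc += math.log2(1.0 + math.exp(-l_value))
    return 1.0 - acc / n
```

With this mapping, BER curves measured over different channels can be plotted over a single abscissa, the channel capacity, as in the lower plots of Fig. 2.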
representation, as also shown in Fig. 2, is possible by specifying the channel by its capacity. Obviously, there is no substantial difference in the behavior of convolutional codes transmitted over different memoryless symmetric channels. Unfortunately, even the unified plots are not suited for the comparison of different decoding techniques, as they do not account for the quality of the reliability estimation provided by the decoder.

2. INFORMATION PROCESSING CHARACTERISTICS

The aim of the Information Processing Characteristic (IPC) is a characterization of coding schemes w.r.t. soft output which is (almost) independent of the channel model, does not depend on the kind of post-processing, and hence is suited for the comparison of different decoding techniques. Furthermore, the IPC shall be suited for a comparison of coding schemes even for R > C, which is of interest in concatenated coding, as the constituent decoders work in this region, although there the bit error probability of all coding schemes is quite high.

In [8] several kinds of IPCs have been introduced. As for a single code many encodings and a number of decoding techniques exist, one has to distinguish between the characterization of code properties, properties of encoding, and properties of decoding. Due to coded transmission, only a subset of all possible channel input vectors x ∈ C can be transmitted. Hence, the end-to-end capacity may already be decreased once the code has been chosen. To describe this effect,

IPC(C) := (1/K) I(X; Y)    (2)

defines an upper bound for a given code C, which is achieved only for optimum decoding, i.e.

I(U; V) = I(X; Y).    (3)

Obviously, IPC(C) is independent of the encoding. From ideal coding, i.e. a coding scheme that achieves the performance given by the rate distortion bound [14] for any C, a further upper bound on this IPC can be obtained [8]:

IPC(C) ≤ min(C/R, 1).    (4)

Optimum symbol-by-symbol decoding, as e.g.
performed by BCJR decoding [1] for convolutional codes, yields the best performance that can be obtained with realistic decoding complexity. As symbol-based decoding does not take into account the dependencies between different symbols, its outputs for different symbols can be highly correlated. But usually this dependency is not exploited by further processing stages; interleaving is used to rearrange the output data stream in a way that it appears to be memoryless. Hence, we consider symbol-by-symbol decoding together with interleaving and express the performance as IPC_I(C):

IPC_I(C) := Ī(U; Y) = (1/K) Σ_{i=1}^{K} I(U_i; Y)    (5)

IPC_I(C) strongly depends on the choice of the encoder.

For the considered symbolwise mutual information, Viterbi decoding [15] is suboptimal, as it does not minimize the bit error probability and furthermore does not provide any reliability estimation. Thus,

IPC_Vd(C) := Ī(U; Û_Vd) = (1/K) Σ_{i=1}^{K} I(U_i; Û_{i,Vd})    (6)

will be lower than the other IPCs.

3. COMPARISON OF DECODING ALGORITHMS FOR CONVOLUTIONAL CODES

By comparing the IPC of a coding scheme to that of ideal coding, the suboptimality of the code structure can be determined. But the calculation of IPC(C) for arbitrary codes is in general difficult, as the mutual information between vectors of length N has to be determined. Fortunately, a practical way to calculate IPC(C) for convolutional codes, which are losslessly encoded, i.e. I(U; Y) = I(X; Y), is found via the chain rule of mutual information:

I(U; Y) = I(U_1; Y) + I(U_2; Y | U_1) + I(U_3; Y | (U_1, U_2)) + ...    (7)

For a linear, binary, time-invariant convolutional code with linear encoding, transmitted over a memoryless symmetric channel, I(U_i; Y | (U_1 ... U_{i-1})) does not depend on the particular choice of U_1 ... U_{i-1}.
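When a soft-output decoder emits true a-posteriori L-values, the average symbolwise mutual information in (5) can be estimated from a simulation by the standard time average used in the EXIT-chart literature. A minimal sketch (the function name is ours; the estimate is exact only for consistent APP L-values):

```python
import math

def symbolwise_mi(bits, lvalues):
    """Estimate the average symbolwise mutual information in bits/symbol
    from true a-posteriori L-values L_i = ln(P(u_i=0|y) / P(u_i=1|y)),
    using the time average  1 - E[log2(1 + exp(-(+/-1) * L))]."""
    assert len(bits) == len(lvalues)
    acc = 0.0
    for u, l in zip(bits, lvalues):
        s = 1.0 if u == 0 else -1.0   # sign of the correct decision
        acc += math.log2(1.0 + math.exp(-s * l))
    return 1.0 - acc / len(bits)
```

For all-zero L-values (no knowledge) the estimate is 0 bit/symbol; for L-values of large correct magnitude it approaches 1 bit/symbol.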
Hence, without loss of generality we can assume that the all-zero information word, and due to linear encoding also the all-zero codeword, has been transmitted:

I(U_i; Y | (U_1 ... U_{i-1})) = I(U_i; Y | (0, ..., 0))    (8)

with i - 1 conditioning zeros. Let us consider a trellis representation of the encoder, such that U_i is the i-th input bit to the encoder. Then, according to (8), all previous information symbols, and hence also the current state of the encoder,
are known. The optimum decoder thus always starts from a given state. Due to linearity, this decoding situation is always the same as for the very first bit when starting from the all-zero state:

I(U_i; Y | (U_1 ... U_{i-1})) = I(U_1; Y),    (9)

which simplifies (7) to

I(U; Y) = K · I(U_1; Y).    (10)

BCJR decoding provides optimum estimates V of the source symbols U given the channel output sequence Y. Hence, I(U_1; Y) = I(U_1; V_1) is accessible via Monte Carlo simulations: by measuring the mutual information only between the first source symbol U_1 and the first soft output value V_1 of each block, IPC(C) is determined.

Figure 3: Example Information Processing Characteristic IPC(C) for optimum soft output decoding of the repetition code and rate 1/2 convolutional codes transmitted over the BPSK AWGN channel.

Fig. 3 shows that even convolutional codes of short constraint length perform astonishingly close to the ideal-coding limit for C < R. The remaining difference vanishes with increasing memory ν. But for C ≈ R and higher capacities it becomes more and more difficult to approach the performance of ideal coding by increasing the memory of the convolutional code. Hence, it is obvious that convolutional codes can be applied more successfully in the region C < R, i.e. as component codes in concatenated coding schemes, than for establishing highly reliable communication.

The IPC_I(C) for optimum symbol-by-symbol decoding of convolutional codes can be obtained directly from Monte Carlo simulations of BCJR decoding. Comparing Fig. 4 with Fig. 3, a huge loss of optimum symbol-by-symbol soft output decoding can be observed.

Figure 4: Example Information Processing Characteristic IPC_I(C) for optimum symbol-by-symbol soft output decoding of the repetition code and systematically encoded rate 1/2 convolutional codes transmitted over the BPSK AWGN channel.
For C < R this loss dominates, such that the IPC_I(C) of a convolutional code with more memory elements lies below that of a code of smaller constraint length, which is the reverse of the behavior of IPC(C). An important exception is the repetition code: as its information block length is K = 1, symbol-by-symbol decoding is optimum.

As stated before, any decoding technique other than BCJR decoding will result in an IPC curve below the one for optimum decoding. In Fig. 5 this can be verified for Viterbi decoding.

Figure 5: Example Information Processing Characteristic IPC_Vd(C) for Viterbi decoding of systematically encoded rate 1/2 convolutional codes transmitted over the BPSK AWGN channel.
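For the repetition code just mentioned (information block length K = 1, so symbol-by-symbol decoding is optimum), the IPC can even be computed exactly. The sketch below does so for a BSC rather than the BPSK AWGN channel of the figures, purely because the BSC permits exact enumeration (function names are ours):

```python
import itertools
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def repetition_ipc(n, eps):
    """Exact IPC(C) = I(U; Y) of a length-n repetition code (K = 1 info
    bit) over a BSC(eps): I(U; Y) = H(Y) - H(Y|U), H(Y|U) = n * h2(eps)."""
    h_y = 0.0
    for y in itertools.product((0, 1), repeat=n):
        # P(y): average the likelihood over the two codewords (all-0, all-1)
        p = 0.0
        for u in (0, 1):
            like = 1.0
            for yi in y:
                like *= eps if yi != u else 1.0 - eps
            p += 0.5 * like
        if p > 0.0:
            h_y -= p * math.log2(p)
    return h_y - n * h2(eps)
```

For n = 1 this reduces to the BSC capacity, and for every n the result respects the ideal-coding bound (4), IPC ≤ min(C/R, 1) with R = 1/n.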
In the beginning (C ≈ 0) the slope of IPC_Vd(C) is less than one, i.e. the performance of convolutionally coded transmission with Viterbi decoding is worse than that of uncoded transmission. For any convolutional code

IPC_Vd(C) ≤ IPC_I(C) ≤ IPC(C) ≤ C/R    (11)

holds, but the differences are more pronounced for low capacity values. Hence, in concatenated coding, optimum symbol-by-symbol decoding is far superior to Viterbi decoding and well worth the additional decoding complexity, even though it achieves no significant improvement when convolutional codes are used on their own to establish communication at low error rates.

4. IPC OF CONCATENATED CODES

Determining the IPC_I(C) of convolutional codes of short constraint length is already computationally expensive; this method therefore becomes impractical for iteratively decoded concatenated codes. Fortunately, asymptotic analysis, e.g. via EXIT charts [4], can be used to determine bounds on the IPC_I(C) under the assumption of infinite interleaving and infinitely many iterations. The result of the asymptotic analysis is either convergence of iterative decoding, i.e. arbitrarily reliable communication is possible and hence the end-to-end mutual information has to be one bit per symbol, or a single intersection point of the transfer characteristics. From this point, which gives the constituent extrinsic informations achievable via iterative decoding, the mutual information between the source symbols and the post-decoding soft output of the decoder has to be determined. For this, the concept of information combining introduced in [9], reminiscent of maximum ratio combining [3], can be used and bounded. As proven in [13], the post-decoding information is at most as large as when the constituent extrinsic informations are assumed to be distributed as if transmitted over a binary erasure channel. Under this assumption an upper bound on the IPC_I(C) of concatenated codes can be obtained.
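The fixed-point behavior invoked above, convergence to (1, 1) or stalling at the first intersection of the transfer characteristics, can be mimicked numerically by alternating decoder activations from zero a-priori information. The characteristics below are invented toy curves for illustration only, not measured from any real decoder:

```python
def exit_fixed_point(f1, f2, tol=1e-9, max_iter=10_000):
    """Alternate two extrinsic transfer characteristics, starting from
    zero a-priori information, until the extrinsic information stops
    growing; returns the limiting pair (I_E1, I_E2)."""
    i_e1 = i_e2 = 0.0
    for _ in range(max_iter):
        new_e1 = f1(i_e2)   # decoder 1 uses decoder 2's extrinsic output
        new_e2 = f2(new_e1)  # ... and vice versa
        if abs(new_e1 - i_e1) < tol and abs(new_e2 - i_e2) < tol:
            break
        i_e1, i_e2 = new_e1, new_e2
    return i_e1, i_e2

# Toy linear characteristics (purely illustrative):
f_strong = lambda i_a: min(1.0, 0.4 + 0.7 * i_a)  # open tunnel: converges
f_weak   = lambda i_a: min(1.0, 0.2 + 0.5 * i_a)  # gets stuck early
```

A pair of "strong" characteristics iterates to (1, 1); a pair of "weak" ones stalls at their intersection, here (0.4, 0.4), from which the post-decoding information is then obtained via information combining.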
Conversely, modeling the constituent extrinsic informations as noisy transmissions over a binary symmetric channel gives a lower bound on the post-decoding information, and hence a further estimate of the IPC_I(C) of concatenated codes. This IPC_I(C) can be achieved with sufficient interleaving and sufficiently many iterations.

As examples, the IPC_I(C) of the rate 1/2 repeat accumulate (RA) code [12] in parallel representation [7] and of the original rate 1/2 turbo code [2] are determined in the following. Fig. 6 shows EXIT charts for several values of E_b/N_0 used to determine the IPC_I(C) of the rate 1/2 repeat accumulate code. In the parallel representation, just two recursive rate-1 scramblers with memory ν = 1 are concatenated. Circles mark the intersection points of the transfer characteristics of the scramblers. Assuming infinitely many iterations, the decoding process gets stuck exactly at these points. Hence, from the abscissa and ordinate values of these points, upper bounds on the constituent extrinsic informations can be obtained.

Figure 6: EXIT chart for the rate 1/2 repeat accumulate code in parallel representation, for several values of E_s/N_0; axes I(U; E_1) = I(U; Z_2) vs. I(U; Z_1) = I(U; E_2) in bit per symbol.

The upper bound on the IPC_I(C) is shown in Fig. 7. Additionally, a dashed line marks the IPC_I(C) under the most pessimistic assumption of information combining.

Figure 7: IPC_I(C) for the rate 1/2 repeat accumulate code.
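Both combining models are simple to evaluate for the elementary case of two independent observations of the same binary symbol. The sketch below follows the spirit of the bounds in [13] (helper names are ours; the BSC model needs the inverse binary entropy, computed by bisection):

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def h2_inv(y):
    """Inverse of h2 on [0, 1/2], via bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def combine_bec(i1, i2):
    """Optimistic (BEC) model: combined information for two independent
    erasure-channel observations with mutual informations i1 and i2."""
    return 1.0 - (1.0 - i1) * (1.0 - i2)

def combine_bsc(i1, i2):
    """Pessimistic (BSC) model: exact I(X; Y1, Y2) when both observations
    come from BSCs whose capacities equal i1 and i2."""
    e1, e2 = h2_inv(1.0 - i1), h2_inv(1.0 - i2)
    p_same = 0.5 * ((1 - e1) * (1 - e2) + e1 * e2)  # P(y1 = y2 = value)
    p_diff = 0.5 * ((1 - e1) * e2 + e1 * (1 - e2))  # P(y1 != y2), per pair
    h_y = -sum(p * math.log2(p) for p in (p_same, p_same, p_diff, p_diff) if p > 0)
    return h_y - h2(e1) - h2(e2)
```

For any pair of inputs the BSC model never exceeds the BEC model, which is exactly the ordering that yields the lower and upper IPC_I(C) estimates above.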
Fig. 8 shows the EXIT chart for the original turbo code (OTC), which has been used to determine its IPC_I(C) in Fig. 9.

Figure 8: EXIT chart for the rate 1/2 OTC at 10 log10(E_s/N_0) = -3 dB, -2.5 dB, and -2.2 dB.

Figure 9: IPC_I(C) for the rate 1/2 OTC.

The difference between the optimistic and pessimistic assumptions for information combining is much smaller for the original turbo code than for the RA code. For low capacities the extrinsic information from the constituent decoders is approximately zero and only the systematic branch contributes to the post-decoding information; hence information combining is not necessary in this region. Only within the small region of the so-called turbo cliff, where the amount of extrinsic information provided by the constituent decoders increases rapidly from zero to one, does the model of information combining have an effect at all.

Fig. 7 and Fig. 9 show that asymptotic analysis yields much more than a determination of whether convergence of iterative decoding is possible at a given E_b/N_0: an upper bound on the end-to-end performance of concatenated codes is obtained without any simulation of iterative decoding.

5. BER ESTIMATION FROM IPC

After having determined the IPC_I(C) of a concatenation, it is possible to bound the achievable bit error ratio. The IPC_I(C) describes a memoryless symmetric end-to-end channel with binary input U ∈ {0, 1}. Fano's inequality [5], which here reads

e_2(BER) ≥ H(U|V) = 1 - Ī(U; V) = 1 - IPC_I(C),    (12)

where e_2(·) denotes the binary entropy function, yields a lower bound on the probability of error. Applying this lower bound together with the upper bound on the IPC_I(C) of a concatenation leads to a strict lower bound on the bit error probability.
Furthermore, an upper bound [6] on the bit error probability of a memoryless symmetric channel with binary input is given by

BER ≤ (1/2)(1 - Ī(U; V)) = (1/2)(1 - IPC_I(C)).    (13)

Together with pessimistic information combining, this bound gives a worst-case estimate of the performance that can be achieved in the limit of infinite interleaving and infinitely many iterations. Fig. 10 depicts (12) and (13). Furthermore, a performance result obtained by simulating the rate 1/2 RA code with a block length of K = 10^5 and 16 iterations on the BPSK AWGN channel is also given.

Figure 10: Bounds on the bit error ratio given an IPC_I(C), together with a simulation result.
As seen in Fig. 10, the performance estimate from the asymptotic analysis is quite close to the simulation result. This can also be verified in a traditional performance plot, see Fig. 11.

Figure 11: Traditional hard output performance plot for the rate 1/2 repeat accumulate code, together with the bounds derived from the asymptotic analysis.

The bounds derived from the asymptotic analysis are less than an order of magnitude apart from each other. Hence, without simulation of iterative decoding, the hard output performance of a concatenated coding scheme can be determined with an accuracy sufficient for many applications.

6. MULTIPLY CONCATENATED CODES

EXIT charts as introduced in [4] and used within this work permit the asymptotic analysis of serial or parallel concatenations of two constituent codes. In the following, novel concepts are introduced to extend this technique to arbitrarily multiple concatenations.

Multiple parallel concatenations with different constituent codes, also called multiple turbo codes, can be analyzed using an algorithm based on the principle of information combining [9]. This algorithm, introduced in [10] under the name AMCA (Analysis of Multiple Concatenations Algorithm), is suited for a fast search for suitable constituent codes to obtain extremely power-efficient multiple turbo codes. Especially for low-rate codes, even under strong complexity constraints, codes have been found within two tenths of a decibel of the capacity limit.

Using the AMCA to find the most power-efficient rate 1/4 multiple turbo code whose decoding complexity is less than that of the rate 1/4 DRS code [7], i.e. with a restriction to constituent codes of memory ν ≤ 3, leads to the encoder shown in Fig. 12.

Figure 12: Encoder of the most power-efficient rate 1/4 multiple turbo code (constituent codes of memory ν ≤ 3).

Iterative decoding convergence is possible for this code at 10 log10(E_b/N_0) ≤ -0.6 dB. Fig.
13 shows a comparison of the hard output performance of this asymmetric multiple turbo code and the rate 1/4 DRS code.

Figure 13: Performance of the rate 1/4 asymmetric multiple turbo code (AMTC) and the DRS code. Both codes are systematically doped with a doping ratio of 1:50; 32 iterations are performed.

Unfortunately, the AMCA is not applicable to multiple serial concatenations or hybrid concatenations. Hence, in this paper we introduce the nested analysis. Beginning from the outermost constituent
codes, we analyze part of a multiple concatenation using EXIT charts or the AMCA, calculate the IPC_I(C) of this part, and convert it to a transfer characteristic. This novel characteristic then describes the selected part of the concatenation in the same way as the transfer characteristic of a single constituent code, which usually is determined by Monte Carlo integration. It can be used in the next step to extend the analysis to a further part of the multiple concatenation.

The only novel technique, not already used within this paper, is the conversion from an IPC_I(C) to a transfer characteristic of an outer code as used within EXIT charts. The difference between these two characterizations of a coding scheme is that the IPC_I(C) gives the post-decoding mutual information vs. the channel capacity, whereas the transfer characteristic measures the extrinsic mutual information. Fortunately, using the formulas of information combining solved for one of the constituent channels, when the capacity of the other one and the overall capacity are known, it is possible to separate the intrinsic part from the extrinsic part of the mutual information of an IPC_I(C) [11]. Fig. 14 shows the IPC_EI(C) of convolutional codes derived from the respective IPC_I(C). These curves are equivalent to transfer characteristics.

Figure 14: Example Information Processing Characteristic IPC_EI(C) for optimum symbol-by-symbol soft output decoding w.r.t. extrinsic information of systematically encoded rate 1/2 convolutional codes (ν = 2 to ν = 5) transmitted over the BPSK AWGN channel (derived from the respective IPC_I(C)).

As an example, the serial concatenation of an outer rate 1/2 RA code with an inner feedforward-only scrambler of memory ν = 1 (generator 3), see Fig. 15, is analyzed.

Figure 15: Encoder of a nested concatenation of an outer rate 1/2 RA code with an inner scrambler.

The IPC_I(C) of the outer rate 1/2 RA code has already been calculated in Section 4; it is shown in Fig. 7. It can be converted to an IPC_EI(C) in the same way as shown for the convolutional codes. For use in an EXIT chart it then has to be plotted with flipped axes to meet the conventions, as the output mutual information of an outer code, which is equal to IPC_EI(C), is given on the abscissa of an EXIT chart. Transfer characteristics of the inner scrambler can be obtained by Monte Carlo simulations.

Fig. 16 shows an EXIT chart of the nested concatenation at 10 log10(E_b/N_0) = 0.4 dB. The curves do not touch, i.e. convergence of iterative decoding is possible.

Figure 16: EXIT chart of the nested concatenation of Fig. 15 at 10 log10(E_b/N_0) = 0.4 dB.

A simulation for a block length K = 10^5, performing 8 iterations within the outer RA code and 1, 4, 8, 16, and 32 iterations between the inner scrambler and the outer parallel concatenated code, shown in Fig. 17, proves that low bit error ratios can be obtained for signal-to-noise ratios larger than 10 log10(E_b/N_0) = 0.4 dB.
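Under the erasure-channel model of information combining, the conversion from a post-decoding information to its extrinsic part reduces to solving 1 - (1 - C)(1 - I_E) = I for I_E. This is only the optimistic (BEC) member of the pair of combining formulas used for the separation in [11], shown here as an illustrative sketch:

```python
def extrinsic_from_post(i_post, c_intrinsic):
    """Solve the BEC-model combining formula
        i_post = 1 - (1 - c_intrinsic) * (1 - i_ext)
    for the extrinsic information i_ext, clipped to [0, 1]."""
    if c_intrinsic >= 1.0:
        return 1.0  # degenerate case: the intrinsic channel alone is perfect
    i_ext = 1.0 - (1.0 - i_post) / (1.0 - c_intrinsic)
    return max(0.0, min(1.0, i_ext))
```

Applying this pointwise to an IPC_I(C) curve yields an IPC_EI(C) curve like those of Fig. 14, which can then be used as an outer transfer characteristic in the next nesting step.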
Figure 17: Hard output performance of the nested concatenation of Fig. 15 for 1, 4, 8, 16, and 32 iterations.

7. CONCLUSIONS

The IPC is suited as a practical tool to judge the performance of coding schemes, as well as a graphical representation useful for theoretical considerations. We showed that the IPC of a coding scheme, which can be obtained by simulations for simple coding schemes such as convolutional codes, or via asymptotic analysis for concatenated schemes, is sufficient to decide whether a coding scheme is appropriate for an intended application. It enables us to predict the bit error ratio for every signal-to-noise ratio, but gives much more information than a bit error ratio curve, as it characterizes a coding scheme w.r.t. soft output and has a scaling that magnifies the differences between coding schemes operated below capacity, i.e. at bit error ratios close to 50%. Due to this particular scaling it is obvious at first sight whether a coding scheme has a pronounced turbo cliff or a bit error ratio that decreases only slowly as the signal-to-noise ratio of the channel is increased. Furthermore, it can be read off an IPC whether a coding scheme is suited as a constituent code of a concatenation.

REFERENCES

[1] L. Bahl, J. Cocke, F. Jelinek, J. Raviv, Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate. IEEE Trans. Inform. Theory, vol. IT-20, no. 2, pp. 284-287, 1974.
[2] C. Berrou, A. Glavieux, P. Thitimajshima, Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes. In Proceedings of ICC '93, pp. 1064-1070, 1993.
[3] D. G. Brennan, Linear Diversity Combining Techniques. Proceedings of the IRE, vol. 47, pp. 1075-1102, Jun. 1959.
[4] S. ten Brink, Convergence of Iterative Decoding. IEE Electronics Letters, vol. 35, no. 10, pp. 806-808, May 1999.
[5] R. M. Fano, Transmission of Information: A Statistical Theory of Communication. John Wiley & Sons, Inc., New York, 1961.
[6] M. E. Hellman and J.
Raviv, Probability of Error, Equivocation, and the Chernoff Bound. IEEE Transactions on Information Theory, vol. 16, no. 4, pp. 368-372, Jul. 1970.
[7] S. Huettinger, S. ten Brink, J. B. Huber, Turbo Code Representation of RA Codes and DRS Codes for Reduced Decoding Complexity. In Proceedings of the Conference on Information Sciences and Systems (CISS 2001), The Johns Hopkins University, Baltimore, MD, March 21-23, 2001.
[8] S. Huettinger, J. B. Huber, R. Johannesson, R. Fischer, Information Processing in Soft Output Decoding. In Proceedings of the 39th Allerton Conference on Communications, Control and Computing, Oct. 2001.
[9] S. Huettinger, J. B. Huber, Performance Estimation for Concatenated Coding Schemes. In Proceedings of the IEEE Information Theory Workshop 2003, pp. 123-126, Paris, France, Mar./Apr. 2003.
[10] S. Huettinger, J. B. Huber, Analysis and Design of Power-Efficient Coding Schemes with Parallel Concatenated Convolutional Codes. Accepted for IEEE Transactions on Communications, 2003.
[11] S. Huettinger, J. B. Huber, Extrinsic and Intrinsic Information in Systematic Coding. In Proceedings of the International Symposium on Information Theory 2002, Lausanne, Switzerland, Jul. 2002.
[12] H. Jin and R. McEliece, RA Codes Achieve AWGN Channel Capacity. In Proceedings of the 13th AAECC Symposium, Springer LNCS 1719, pp. 10-18, 1999.
[13] I. Land, S. Huettinger, P. Hoeher, J. Huber, Bounds on Information Combining. Submitted to the International Symposium on Turbo Codes, 2003.
[14] C. E. Shannon, Coding Theorems for a Discrete Source with a Fidelity Criterion. IRE National Convention Record, Part 4, pp. 142-163, 1959.
[15] A. Viterbi, Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm. IEEE Transactions on Information Theory, vol. IT-13, no. 2, pp. 260-269, Apr. 1967.