An Introduction to Distributed Channel Coding

Alexandre Graell i Amat and Ragnar Thobaben

Department of Signals and Systems, Chalmers University of Technology, Gothenburg, Sweden
School of Electrical Engineering, Royal Institute of Technology (KTH), Stockholm, Sweden

October 1, 2013

Abstract

This chapter provides an introductory survey on distributed channel coding techniques for relay networks. The main focus is on decode-and-forward relaying for the basic three-node relay channel. We show how linear block code structures can be deduced from fundamental information theoretic communication strategies. Code design and optimization are discussed taking low-density parity-check (LDPC) block codes and spatially-coupled LDPC codes as particular examples. We also provide an overview on distributed codes that are based on convolutional codes and turbo-like codes, and discuss extensions to multi-source cooperative relay networks.

Contents

1 Introduction
2 The Three-node Relay Channel
  2.1 Basic Model
  2.2 Relaying Strategies
  2.3 Fundamental Coding Strategies for Decode-and-Forward Relaying
    2.3.1 Full-duplex Relaying
    2.3.2 Half-Duplex Relaying
    2.3.3 Design Objectives: Achieving the Optimal Decode-and-Forward Rates
3 Distributed Coding for the Three-node Relay Channel
  3.1 LDPC Code Designs for the Relay Channel
    3.1.1 Code Structures for Decode-and-Forward Relaying
    3.1.2 Irregular LDPC Codes
    3.1.3 Spatially-coupled LDPC Codes
  3.2 Distributed Turbo-Codes and Related Code Structures
    3.2.1 Code optimization
    3.2.2 Noisy Relay
4 Relaying with Uncertainty at the Relay
  4.1 Compress-and-Forward Relaying
  4.2 Soft-Information Forwarding and Estimate-and-Forward
5 Cooperation with Multiple Sources
  5.1 Two-user Cooperative Network: Coded Cooperation
  5.2 Multi-source Cooperative Relay Network
6 Summary and Conclusions

Figure 1: The three-node relay channel. A source communicates with a destination with the help of a relay.

1 Introduction

Since Marconi's first radio link between a land-based station and a tugboat, wireless communications have witnessed a tremendous flourishing and have become central in our everyday life. In the past decades, wireless communications have expanded at an unprecedented pace. The number of worldwide mobile subscribers has increased from a few million in 1990 to more than 4 billion in 2010. To enable an ever-increasing number of wireless devices and applications, the challenge of researchers and engineers has always been to design communication systems that achieve high reliability as well as high spectral and power efficiency, and that are able to mitigate fading. A way to tackle this challenge is by exploiting diversity in time, frequency, or space. A well-known technique to exploit spatial diversity consists of employing more than one antenna at the transmitter. However, many wireless devices have limited size or hardware capabilities, so it is not always possible to employ multiple antennas. Cooperative communications is a new concept that offers an alternative way to achieve spatial diversity.

In traditional wireless communication networks, communication is performed over point-to-point links and nodes operate as store-and-forward packet routers. In this scenario, if a source communicates with a destination and the direct link cannot provide error-free transmission, the communication is performed in a multi-hop fashion through one or multiple relay nodes. However, this model is unnecessarily wasteful because, due to the broadcast nature of the wireless channel, nodes within a certain range may overhear the transmission of other nodes. Therefore, it seems reasonable that these nodes help each other, i.e., cooperate, to transmit information to the destination. This paradigm is known as cooperative communication.
Similarly to multiple-antenna systems, cooperative communications achieve transmit diversity by generating a virtual multiple-antenna transmitter, where the antennas are distributed over the wireless nodes. Cooperative communications have been shown to yield significant improvements in terms of reliability, throughput, power efficiency, and bandwidth efficiency.

The basic principles of cooperative communications can be traced back to the 1970s, when van der Meulen introduced the relay channel [1], depicted in Fig. 1. It is a simple three-node cooperative network where one source communicates with a destination with the help of a relay, yet it captures the main features and characteristics of cooperation. In a conventional single-hop system, the destination in Fig. 1 would decode the message transmitted by the source solely based on the direct transmission. However, due to the broadcast nature of the channel, the relay (the device at the top of Fig. 1) overhears the transmission from the source. Therefore, it can help in improving the communication between the source and the destination by forwarding additional information about the source message. The destination can then decode the source message based on the combination of the two signals from the source and the relay. As each transmission undergoes a different path, spatial diversity is achieved.

For the classical three-node relay channel, Cover and El Gamal [2] described two fundamental relaying strategies where the relay either decodes (decode-and-forward) or compresses (compress-and-forward) the received source transmission before forwarding it to the destination. As an alternative, the relay may simply amplify and retransmit the signal received from the source, a strategy known as amplify-and-forward. Cover and El Gamal also derived inner and outer bounds on the capacity of the relay channel [2]. The key result of this pioneering work is that, in many instances, the overall capacity is better than the capacity of the source-to-destination channel. In their work, it was assumed that all nodes operate in the same frequency band. Hence, the system can be decomposed into a broadcast channel from the viewpoint of the source and a multiple-access channel from the viewpoint of the destination, leading to interference at the destination. They also assumed full-duplex operation at the relay, i.e., the relay is able to transmit and receive simultaneously in the same frequency band.

Despite the early works by van der Meulen and by Cover and El Gamal in the 1970s, relaying and cooperative communications in wireless networks remained mostly unexplored for three decades. In the last ten years, however, they have probably been among the most intensively researched topics in the information theory and communication theory communities. The boom in research on cooperative communications occurred in the early 2000s and was triggered by the seminal paper by Laneman, Tse and Wornell [3] on cooperative diversity, and the work by Hunter and Nosratinia [4]. The goal of these works was to provide transmit diversity to single-antenna nodes in wireless networks through cooperation.
To achieve cooperation, nodes typically exchange their messages in a first step and perform a cooperative transmission of all messages in a second step. These works also triggered a large amount of work in the information theory community, identifying the fundamental limits of cooperative strategies [5]. Nevertheless, although a great deal of work has been done in this field, it is remarkable that even the capacity of the basic three-node relay channel is only known for special cases. For example, for the degraded relay channel, decode-and-forward relaying achieves capacity.

From an information-theoretic point of view, the highest gains can be achieved when the source and the relay transmit over the same channel and full-duplex operation at the relay is considered [2]. Nevertheless, due to practical constraints, providing full-duplex operation at the relay is considered a challenge [6]. Likewise, without enforcing further multiple-access constraints, interference becomes another significant practical challenge [7]. It is therefore relevant to consider a scenario where transmissions take place over orthogonal channels (using, e.g., time division multiple access (TDMA)) and the relay operates in half-duplex mode. Most of the relevant literature on cooperative communications makes this assumption. Fundamental limits of the scenario with orthogonal channels have been derived in [3]-[9].

Distributed Channel Coding

Since the early works on cooperative communications, the concept of cooperation has been extended to a myriad of communication networks. Many different cooperative strategies and network topologies, consisting of one or multiple transmitters, relays, and receivers, have been considered and studied in recent years. These cooperative strategies are often based on multiple-antenna techniques such as distributed space-time coding or beamforming. Other approaches have their roots in channel coding techniques.
To harvest the potential gains of cooperative communications, point-to-point channel coding can indeed be extended to the network scenario, a concept that is known as distributed channel coding. Consider for example that the source in Fig. 1 transmits an uncoded message and that the relay, after decoding it, forwards another copy of the source message. The destination receives two (noisy) versions of the same message, so repetition coding distributed between the source and the relay has been realized. This trivial concept can be generalized to more sophisticated coding structures. Assume that each transmit node in the network uses an error-correcting code, which may be very simple, e.g., a short block code, or very advanced, e.g., a low-density parity-check (LDPC) code. The main idea of distributed channel coding is that a more powerful code, distributed over the network nodes, can be constructed by properly joining together the codes used by each node.

The way channel coding for cooperation is implemented in communication networks depends heavily on the network topology, the considered cooperative strategy, the channel model, and the purpose of the cooperation. Some code designs follow intuitively from the topology of the network, while other approaches are directly inspired by communication strategies proposed in the information theory literature. Yet another set of solutions uses channel coding as a tool to implement distributed source coding schemes that are an integral part of compress-and-forward schemes. Depending on the network topology, ideas from network coding may be integrated, and depending on the purpose of the cooperation, the schemes may be optimized to perform close to the highest achievable rates or they may be optimized for diversity and outage.

This chapter provides an introductory survey on distributed channel coding techniques for cooperative communications. From a code design perspective, decode-and-forward relaying is the most attractive relaying strategy, as it guarantees that the transmitted messages are known at the relay nodes such that a distributed coding scheme can be set up.
Hence, our focus in this chapter is on decode-and-forward relaying. For pedagogical purposes, the main principles underlying distributed channel coding are developed for the basic three-node relay channel. We introduce the main information-theoretic concepts and show how these concepts translate in terms of channel code designs. We discuss code design and optimization taking LDPC block codes and the recently introduced spatially-coupled LDPC codes as examples. We also provide an overview of distributed channel coding based on convolutional and turbo-like codes and discuss extensions of the code constructions to other cooperative network topologies.

The chapter is organized as follows. Section 2 introduces the basic model of the three-node relay channel, and gives an overview of the fundamental coding strategies for decode-and-forward relaying. In Section 3, a survey on distributed channel coding for the relay channel is provided, with focus on LDPC block codes, spatially-coupled LDPC codes, and turbo-like codes. In Section 4, we briefly discuss relaying strategies when reliable decoding cannot be guaranteed at the relay. In Section 5 we discuss generalizations of the distributed channel coding schemes of Section 3 to multi-source cooperative relay networks. Finally, Section 6 concludes the chapter and highlights some of the challenges of distributed channel coding.

Notations

To ease the presentation in the remainder of the chapter, we introduce the following notation. Throughout the chapter, we use bold lowercase letters $\mathbf{a}$ to denote vectors, bold uppercase letters $\mathbf{A}$ to denote matrices, and uppercase letters $A$ to denote random variables. We assume all vectors to be row vectors.

Figure 2: Three-node relay channel model.

2 The Three-node Relay Channel

In this section, we focus on the simplest cooperative channel model, i.e., the three-node relay channel [1]. After introducing the general model, we briefly describe the three main relaying strategies: amplify-and-forward, decode-and-forward, and compress-and-forward. We then summarize fundamental bounds on the achievable rates under the decode-and-forward relaying strategy, and discuss the fundamental communication strategies proposed in the information theory literature to achieve these bounds for both half-duplex and full-duplex relaying.

2.1 Basic Model

The three-node relay channel describes the scenario where a source node conveys a message $W \in \{0,\dots,2^k-1\}$ to a destination with the help of a single relay. The message $W$, which may equivalently be represented by a length-$k$ binary vector $\mathbf{b}$, is encoded into a codeword $\mathbf{x}_S$ and transmitted by the source. The corresponding vectors of channel observations at the relay and the destination are denoted by $\mathbf{y}_R$ and $\mathbf{y}$, respectively. The codeword transmitted from the relay is denoted by $\mathbf{x}_R$. The three-node relay channel is illustrated in Figure 2.

In the most general case, the relation between the two channel input symbols $X_S$ and $X_R$ from the source and the relay, respectively, and the channel output symbols $Y_R$ and $Y_D$ at the relay and the destination (the latter is written simply as $Y$ in the remainder of this section) is described by the conditional distribution $p(Y_R, Y_D \mid X_S, X_R)$. For independent channel observations $Y_R$ and $Y_D$, we obtain

$$p(Y_R, Y_D \mid X_S, X_R) = p(Y_R \mid X_S, X_R)\, p(Y_D \mid X_S, X_R).$$

Note that in some cases (e.g., for the full-duplex relay channel with decode-and-forward relaying) the channel observations at the relay $Y_R$ depend on symbols $X_R$ previously transmitted by the relay due to the chosen transmit strategy.
AWGN Relay Channel

For the additive white Gaussian noise (AWGN) relay channel we can refine the model and characterize the input-output relation of the channel as

$$Y_R = H_{SR} X_S + Z_R, \qquad Y = H_{SD} X_S + H_{RD} X_R + Z_D,$$

where $Z_R$ and $Z_D$ are independent real-valued white Gaussian noise samples with zero mean and unit variance, $H_{SR}$, $H_{SD}$, and $H_{RD}$ are channel coefficients on the source-relay, source-destination, and relay-destination links, and $X_S$ and $X_R$ are the code symbols with power constraints $E\{X_S^2\} = P_S$ and $E\{X_R^2\} = P_R$ which are transmitted from the source and the relay, respectively. In this model, the two competing transmissions from the source and the relay form a multiple-access channel (MAC). The model is depicted in Figure 3(a).

Figure 3: (a) AWGN relay channel with competing transmissions from the source and the relay, (b) AWGN relay channel with orthogonal transmissions from the source and the relay, and (c) binary erasure relay channel.

As handling interference is in general a practical challenge [7], in some cases it is convenient to assume that transmissions from the source and the relay are carried out on orthogonal channels (see Figure 3(b)). In this case, the channel observation $Y$ may be replaced by a vector of channel observations $\mathbf{Y} = [Y_1, Y_2]$, with

$$Y_1 = H_{SD} X_S + Z_1, \qquad Y_2 = H_{RD} X_R + Z_2,$$

where $Z_1$ and $Z_2$ denote independent real-valued white Gaussian noise samples with zero mean and unit variance.

Binary Erasure Relay Channel

From a code design point of view, it is also convenient to consider the binary erasure relay channel as shown in Figure 3(c). This model again considers orthogonal channels for the links to the destination, and the source-relay, source-destination, and relay-destination links are given by binary erasure channels (BECs) with erasure probabilities $\epsilon_{SR}$, $\epsilon_{SD}$, and $\epsilon_{RD}$, respectively.
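The three channel models above can be mimicked with a few lines of simulation code. The following is a minimal sketch, assuming one channel use per call and fixed real-valued channel coefficients; the function names (`awgn_relay_mac`, `awgn_relay_orthogonal`, `binary_erasure_relay`) and the use of `None` as the erasure symbol are illustrative choices, not notation from the chapter.

```python
import random

ERASURE = None  # illustrative marker for an erased bit (assumption)


def awgn_relay_mac(x_s, x_r, h_sr, h_sd, h_rd, rng):
    """One use of the AWGN relay channel of Figure 3(a).

    Source and relay transmissions superimpose at the destination (MAC).
    Noise samples are real-valued with zero mean and unit variance.
    """
    y_r = h_sr * x_s + rng.gauss(0.0, 1.0)
    y = h_sd * x_s + h_rd * x_r + rng.gauss(0.0, 1.0)
    return y_r, y


def awgn_relay_orthogonal(x_s, x_r, h_sr, h_sd, h_rd, rng):
    """One use of the orthogonal AWGN relay channel of Figure 3(b).

    The destination observes the source and the relay separately as [Y1, Y2].
    """
    y_r = h_sr * x_s + rng.gauss(0.0, 1.0)
    y_1 = h_sd * x_s + rng.gauss(0.0, 1.0)
    y_2 = h_rd * x_r + rng.gauss(0.0, 1.0)
    return y_r, (y_1, y_2)


def bec(bit, eps, rng):
    """Binary erasure channel: erase the bit with probability eps."""
    return ERASURE if rng.random() < eps else bit


def binary_erasure_relay(x_s, x_r, eps_sr, eps_sd, eps_rd, rng):
    """One use of the binary erasure relay channel of Figure 3(c)."""
    y_r = bec(x_s, eps_sr, rng)
    y_1 = bec(x_s, eps_sd, rng)
    y_2 = bec(x_r, eps_rd, rng)
    return y_r, (y_1, y_2)
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes runs reproducible. In this sketch the power constraints are not enforced by the channel functions; the caller is assumed to supply symbols that already satisfy them.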

2.2 Relaying Strategies

According to the way the information is processed at the relay, it is possible to define several cooperative strategies. The three major relaying strategies, amplify-and-forward, decode-and-forward, and compress-and-forward, are briefly described in the following.

Amplify-and-forward: Amplify-and-forward is perhaps conceptually the simplest cooperative strategy. The relay simply retransmits a scaled version of the signal it receives from the source, subject to a power constraint. The destination receives two independently faded versions of the information and is thus able to make better decisions. The main drawback of this strategy is that it leads to noise amplification.

Decode-and-forward: The relay attempts to decode the received signal, then generates an estimate of the source message and re-encodes it prior to forwarding it to the destination. The decode-and-forward strategy performs very well in the case of successful decoding at the relay. However, when the relay fails to correctly decode the received signal, an error propagation phenomenon is observed, and the decode-and-forward strategy may not be beneficial. For this reason, adaptive decode-and-forward methods have been proposed, where the relay detects and forwards the source information only in the case of a high instantaneous signal-to-noise ratio on the source-to-relay link.

Compress-and-forward: The relay is no longer required to decode the information transmitted by the source but simply to describe its observation to the destination. The compress-and-forward strategy is used when the relay cannot decode the information sent by the source. The relay compresses the received signal using the side information from the direct link and forwards the compressed information to the destination. Unlike decode-and-forward, compress-and-forward remains beneficial even when the source-to-relay link is not error-free.
Furthermore, as opposed to decode-and-forward, in compress-and-forward the relay does not use any knowledge of the codebook used by the source.

In [10], a comparison of decode-and-forward and compress-and-forward was performed according to the relay location. It was shown that the achievable rate of decode-and-forward is higher when the relay is close to the source, while compress-and-forward outperforms decode-and-forward when the relay gets closer to the destination. In this chapter, as our main focus is on distributed coding, we consider only the decode-and-forward strategy.

2.3 Fundamental Coding Strategies for Decode-and-Forward Relaying

Among the different relaying strategies described in the previous section, decode-and-forward is the most relevant one when distributed coding is considered. In this section, we summarize fundamental coding strategies for decode-and-forward relaying for the AWGN relay channel of Figure 3(a). We consider both full-duplex and half-duplex relaying. We also discuss the corresponding achievable rates. Here, the proofs of achievability are typically based on random-coding arguments, and they do not directly provide practical coding schemes. However, as we will see later in this section, the achievability proofs provide guidance on how practical coding schemes can be designed.

2.3.1 Full-duplex Relaying

A relay operating in full-duplex mode is capable of simultaneously transmitting and receiving on the same frequency band. Full-duplex relaying is beneficial since it leads to the most efficient utilization of the resources (compared to half-duplex relaying) and it enables the highest achievable rates. Unfortunately, hardware implementations of full-duplex relaying are still considered to be a challenge since the received power level of the self-interference exceeds by far the received power level of the desired signal. This issue has recently been addressed, e.g., in [11], where spatial filtering is proposed to mitigate the effect of self-interference. For further details, we refer the reader to [11] and the references therein. In the following, we follow the commonly used approach in the information and coding theory literature and do not explicitly address the issue of self-interference.

For decode-and-forward relaying, all rates $R$ up to [2]

$$R_{\mathrm{DF}}^{\mathrm{FD}} = \sup_{p(X_S, X_R)} \min\bigl\{ I(X_S; Y_R \mid X_R),\; I(X_S, X_R; Y) \bigr\} \tag{1}$$

are achievable. Here, we say that a rate is achievable if there exists a sequence of $(2^{nR}, n)$ codes for which the error probability $P_e^{(n)}$ can be made arbitrarily small for sufficiently large $n$. In the special cases of the physically degraded relay channel, the reversely degraded relay channel, the relay channel with feedback, and the deterministic relay channel, the rate in (1) coincides with the channel capacity [2]. For the general relay channel, however, (1) establishes an achievable rate, and it is not known whether it can be improved or not.

In the following, we summarize two important strategies that show how the boundary $R_{\mathrm{DF}}^{\mathrm{FD}}$ of the set of achievable rates $R$ can be approached. Both strategies follow the same general approach. However, they differ in the rate allocation at the source and the relay. The main principle underlying both strategies is block Markov superposition coding, which requires that:

1. The message $W$, of length $nRB$ bits, is split into $B$ blocks $W_1,\dots,W_B$ of length $nR$ bits each, i.e., $W_k \in \{0,\dots,2^{nR}-1\}$, which are transmitted in successive time slots.
2. The codes $\mathcal{C}_S$ and $\mathcal{C}_R$ that are used by the source and the relay, respectively, are designed following the factorization $p(X_S \mid X_R)\,p(X_R)$ of the joint distribution $p(X_S, X_R)$. This is achieved by explicitly designing a code $\mathcal{C}_S(\mathbf{x}_R)$ for every codeword realization $\mathbf{x}_R \in \mathcal{C}_R$.

The general steps for transmitting the $B$ messages are as follows. Consider the transmission of the message $w_k$ during the $k$-th block and assume that the relay has successfully decoded the previously transmitted messages $w_1,\dots,w_{k-1}$ and the destination has successfully decoded the messages $w_1,\dots,w_{k-2}$. For both strategies, the source transmits the current message $w_k$ by using a codeword $\mathbf{x}_S[k]$ that is chosen from the code $\mathcal{C}_S(\mathbf{x}_R[k])$. Here, the code $\mathcal{C}_S(\mathbf{x}_R[k])$ is selected by the codeword $\mathbf{x}_R[k]$ that is simultaneously sent from the relay during the $k$-th block. The source has knowledge of this codeword since the relay processes the received messages with a delay of one block. Accordingly, the codeword $\mathbf{x}_R[k]$ carries information on the previous message $w_{k-1}$. At the end of the $k$-th block, the destination decodes the message $w_{k-1}$ based on the channel observations $\mathbf{y}[k-1]$ and $\mathbf{y}[k]$ and by using its knowledge of the previously decoded messages $w_1,\dots,w_{k-2}$.

The transmission of $B$ subsequent blocks using the two different strategies is illustrated in Figure 4 and Figure 5. Here, we assume that the transmission is initialized by a predefined message $w_0 = 0$. It is important to note that the transmission is carried out over $B+1$ blocks, leading to a reduction in rate by a factor $B/(B+1)$. This rate loss can however be made small for sufficiently large $B$ and is therefore neglected in the following.
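The block-by-block bookkeeping described above can be made explicit with a small scheduling sketch. It only tracks which message each node handles in each block (no actual encoding or decoding), and it assumes that the source has no new message in block $B+1$; the function names and dictionary keys are illustrative and do not appear in the chapter.

```python
def block_markov_schedule(messages, w0=0):
    """List, for blocks k = 1, ..., B+1, which message each node handles.

    messages -- the B messages [w_1, ..., w_B]
    w0       -- the predefined message that initializes the transmission
    """
    w = [w0] + list(messages)        # w[k] is the message of block k
    num_blocks = len(messages) + 1   # transmission spans B + 1 blocks
    schedule = []
    for k in range(1, num_blocks + 1):
        schedule.append({
            "block": k,
            # Source sends x_S(w_k | w_{k-1}); in block B+1 there is no
            # new message (assumption of this sketch).
            "source_new": w[k] if k < num_blocks else None,
            "source_given": w[k - 1],
            # Relay sends a codeword carrying w_{k-1}, one block late.
            "relay_sends": w[k - 1],
            # Destination decodes w_{k-1} from y[k-1] and y[k];
            # nothing is decoded at the end of block 1.
            "dest_decodes": w[k - 1] if k >= 2 else None,
        })
    return schedule


def rate_factor(num_messages):
    """Rate reduction B / (B + 1) caused by the extra block."""
    return num_messages / (num_messages + 1)
```

For example, `rate_factor(9)` gives `0.9`, and the factor approaches 1 as the number of blocks grows, which is why the loss is neglected in the text.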

Figure 4: Full-duplex decode-and-forward relaying with regular encoding and sliding window decoding.

Strategy 1 (Regular encoding and sliding window decoding)

The first strategy (see also [12] for a more detailed description) is based on regular encoding and sliding window decoding. The codes $\mathcal{C}_S$ and $\mathcal{C}_R$ employed by the source and the relay, respectively, have the same rate $R$. They are designed in two steps and used in the following way.

1. The relay generates a rate-$R$ code $\mathcal{C}_R$ of length $n$ (with i.i.d. symbols following the distribution $p(X_R)$). The code is used for encoding the message $w_{k-1}$ in time slot $k$ to the codeword $\mathbf{x}_R[k] = \mathbf{x}_R(w_{k-1}) \in \mathcal{C}_R$, where the notation $\mathbf{x}(w)$ is used to denote that message $w$ is encoded to codeword $\mathbf{x}$.
2. Since the source knows the previously transmitted message $w_{k-1}$, which is transmitted in the $k$-th time slot from the relay, it generates the codebook of a length-$n$ rate-$R$ code $\mathcal{C}_S(w_{k-1})$ conditioned on $w_{k-1}$, and it uses the code for transmitting the message $w_k$. That is, $\mathbf{x}_S[k] = \mathbf{x}_S(w_k \mid w_{k-1})$.

The destination decodes $w_{k-1}$ based on the channel observations $\mathbf{y}[k]$, which depend on $w_k$ and $w_{k-1}$, and $\mathbf{y}[k-1]$, which depend on $w_{k-1}$ and $w_{k-2}$ (see Figure 4). In order to do so, the destination makes use of the fact that it already has knowledge of the previously decoded message $w_{k-2}$. On the other hand, the presence of the message $w_k$ is treated as interference.

Strategy 2 (Binning)

The second strategy employs a so-called binning scheme at the relay (see, e.g., [2]), which leads to an irregular rate assignment at the source and the relay with rates $R$ and $R_0$, respectively.
To implement the binning, the relay splits the message alphabet $\mathcal{W} = \{0,\dots,2^{nR}-1\}$ into $2^{nR_0}$ disjoint subsets $\{\mathcal{S}_0,\dots,\mathcal{S}_{2^{nR_0}-1}\}$, so-called bins, such that each message $w \in \mathcal{W}$ is assigned randomly, according to a uniform distribution, to one of the bins $\mathcal{S}_s$. Then, instead of directly forwarding the message $w_{k-1}$ during time slot $k$, as for the previous strategy, the relay forwards the index $s_{k-1}$ that identifies the bin $\mathcal{S}_{s_{k-1}}$ which contains the message $w_{k-1}$. The codes $\mathcal{C}_S$ and $\mathcal{C}_R$ that are now used for transmission from the source and the relay, respectively, are constructed as described above. We have, however, to take into account the irregular rate assignment. That is, the relay transmits $\mathbf{x}_R[k] = \mathbf{x}_R(s_{k-1})$, with $s_{k-1}$ satisfying $w_{k-1} \in \mathcal{S}_{s_{k-1}}$, drawn from a rate-$R_0$ code $\mathcal{C}_R$ of length $n$. For every possible bin index $s$, the source constructs a rate-$R$ code $\mathcal{C}_S(s)$. Then, the source lets the bin index $s_{k-1}$ select the code $\mathcal{C}_S(s_{k-1})$ that is used for transmitting $w_k$ using the codeword $\mathbf{x}_S[k] = \mathbf{x}_S(w_k \mid s_{k-1})$.

In a first step, the destination decodes the codeword $\mathbf{x}_R(s_{k-1})$ sent by the relay based on the observation $\mathbf{y}[k]$ to recover the bin index $s_{k-1}$. Again, the message $w_k$ is treated as interference. In a second step, the receiver recovers $w_{k-1}$ by using a list decoder based on $\mathbf{y}[k-1]$ and intersecting the resulting list with the bin $\mathcal{S}_{s_{k-1}}$.

Figure 5: Full-duplex decode-and-forward relaying with irregular encoding.

2.3.2 Half-Duplex Relaying

In the case of half-duplex relaying, it is assumed that the relay cannot simultaneously transmit and receive on the same frequency band. Half-duplex relaying is therefore considered to be more practical since the self-interference issue is avoided. While in the full-duplex case coding over a large number of blocks is required in order to optimally utilize the capabilities of the full-duplex relay and to mitigate the loss due to processing delay at the relay, only two blocks are required for the transmission in the half-duplex case. The achievable rates are accordingly reduced compared to the full-duplex case.

In the following, we assume a total of $n$ channel uses for the transmission, and the fractions of channel uses allocated to the first and second time slots are given by the time-sharing parameters $\alpha \in [0,1]$ and $\bar{\alpha} = 1 - \alpha$. For this setup, it has been shown in [13] that all rates up to

$$R_{\mathrm{DF}}^{\mathrm{HD}} = \sup_{\alpha \in [0,1],\; p(X_S, X_R \mid T)} \min\bigl\{ \alpha I(X_S; Y_R \mid X_R, T=1) + \bar{\alpha} I(X_S; Y \mid X_R, T=2),\; \alpha I(X_S; Y \mid X_R, T=1) + \bar{\alpha} I(X_S, X_R; Y \mid T=2) \bigr\} \tag{2}$$

are achievable. Here, the random variable $T$ indicates whether the first time slot ($T = 1$) or the second time slot ($T = 2$) is considered. Note that $p(T = 1) = \alpha$ and $p(T = 2) = \bar{\alpha}$. This distinction is relevant since the source may allocate power differently to the two time slots. It is furthermore convenient to keep track of the fact that $X_R[1] = 0$ due to the half-duplex constraint.

In order to achieve this rate, the message $W \in \{0,\dots,2^{nR}-1\}$ is first split into two messages $U \in \mathcal{U}$, with $\mathcal{U} = \{0,\dots,2^{nR_U}-1\}$, and $V \in \{0,\dots,2^{nR_V}-1\}$, with $R = R_U + R_V$, such that $W = [U, V]$. The message $U$ is transmitted during the first phase. The transmission is overheard by the relay and the destination. After successful decoding, the relay uses a binning scheme as described in the previous section to split the message set into bins. In the second phase, the relay forwards the bin index to the destination. The source simultaneously transmits the message $V$ using the same channel. Even though this strategy is very similar to the second full-duplex strategy presented in the previous section, we summarize, for completeness, the three different channel codes and the binning scheme that are used during the transmission:

1. The source employs a rate-$R_{S,1}$ code $\mathcal{C}_{S,1}$ of length $\alpha n$ for transmitting the message $U$ to the relay and the destination during the first transmission phase, $\mathbf{x}_S[1] = \mathbf{x}_{S,1}(u)$. Clearly, $R_U = \alpha R_{S,1}$.
2. The relay splits the message set $\mathcal{U}$ into $2^{nR_0}$ bins $\{\mathcal{S}_0,\dots,\mathcal{S}_{2^{nR_0}-1}\}$ of equal size. For every message $u$ that is successfully decoded at the end of the first phase, the relay determines the bin index $s$ of the bin $\mathcal{S}_s$ that contains the message $u$.
3. The relay transmits the bin index $s$ using a length-$\bar{\alpha} n$ rate-$R_R$ code $\mathcal{C}_R$ in the second phase, $\mathbf{x}_R[2] = \mathbf{x}_R(s)$. Accordingly, we get the following relation between the binning rate $R_0$ and the rate $R_R$: $R_0 = \bar{\alpha} R_R$.
4. Since the source knows the bin index $s$, it chooses to cooperate with the relay by generating, for each realization $s$ of the bin index $S$, a code $\mathcal{C}_{S,2}(s)$ with rate $R_{S,2}$ and length $\bar{\alpha} n$ that is used for encoding $V$, $\mathbf{x}_S[2] = \mathbf{x}_{S,2}(v \mid s)$. Clearly, we have $R_V = \bar{\alpha} R_{S,2}$.

The destination starts decoding in the second time slot. It first decodes the bin index $s$ and then the second message $v$. In a second step, the message $u$ is decoded by utilizing knowledge of the bin index. All steps are summarized in Figure 6.

Figure 6: Half-duplex decode-and-forward relaying with irregular encoding.

2.3.3 Design Objectives: Achieving the Optimal Decode-and-Forward Rates

In the following, we discuss the requirements that need to be fulfilled in order to achieve the optimal decode-and-forward rates and formulate design objectives for distributed channel coding.

Full-Duplex Relaying Using Regular Encoding

In this case, two codes, $\mathcal{C}_S$ and $\mathcal{C}_R$, need to be designed, which are used by the source and the relay, respectively, and produce a desired joint distribution $p(X_S, X_R)$. This is achieved by exploiting the factorization $p(X_S \mid X_R)\,p(X_R)$.
As we can see from (1), the joint distribution $p(x_S, x_R)$ is a design parameter that provides a generic description of the set of parameters (e.g., symbol alphabets, power allocation, time sharing, and correlation) that need to be optimized for maximizing the overall rate.

To identify the challenges in the design of the codes $\mathcal{C}_S$ and $\mathcal{C}_R$, we consider the case where the disturbances introduced by the channels are of a non-binary nature. In this case, a certain class of joint distributions can be realized by using superposition coding as follows. Assume that the relay employs a rate-$R$ code $\mathcal{C}_R$ of length $n$ with power constraint $E\{X_R^2\} \le P_R$ for transmitting $w_{k-1}$ in the $k$-th block. Assume also that the source has available a rate-$R$ code $\tilde{\mathcal{C}}_S$ of length $n$, with symbols $\tilde{X}_S$ independent of the code symbols $X_R$ sent from the relay, for encoding $w_k$ in the $k$-th block. For convenience, we assume that this code has unit power. The codewords $x_S(w_k|w_{k-1})$ are then generated as a weighted superposition of the codewords $\tilde{x}_S(w_k)$ and $x_R(w_{k-1})$,

$$x_S(w_k|w_{k-1}) = \sqrt{\rho P_S}\,\tilde{x}_S(w_k) + \sqrt{\frac{(1-\rho)P_S}{P_R}}\,x_R(w_{k-1}), \qquad (3)$$

where $E\{X_S^2\} \le P_S$ defines the power constraint at the source. The factor $\rho$ controls how the source power is divided between the transmission of the message $w_k$ and the cooperative transmission of $x_R(w_{k-1})$.

To get further insights, we specialize to the AWGN relay channel and assume the realizations of the channel coefficients $h_{SR}$, $h_{SD}$, and $h_{RD}$ to be fixed for the duration of $nB$ channel uses. In this setup, the channel outputs at the relay and the destination at the end of the $k$-th block are given by

$$y_R[k] = \sqrt{\rho P_S}\,h_{SR}\,\tilde{x}_S(w_k) + \sqrt{\frac{(1-\rho)P_S}{P_R}}\,h_{SR}\,x_R(w_{k-1}) + z_R[k]$$

$$y[k] = \sqrt{\rho P_S}\,h_{SD}\,\tilde{x}_S(w_k) + \left(\sqrt{\frac{(1-\rho)P_S}{P_R}}\,h_{SD} + h_{RD}\right) x_R(w_{k-1}) + z[k],$$

respectively. Under the assumption that the previous decoding stages have been successful, interference from previously transmitted symbols can be removed. Thus, the relay decodes $w_k$ based on

$$\hat{y}_R[k] = \sqrt{\rho P_S}\,h_{SR}\,\tilde{x}_S(w_k) + z_R[k]. \qquad (4)$$

Similarly, the destination decodes $w_k$ based on

$$y[k+1] = \sqrt{\rho P_S}\,h_{SD}\,\tilde{x}_S(w_{k+1}) + \left(\sqrt{\frac{(1-\rho)P_S}{P_R}}\,h_{SD} + h_{RD}\right) x_R(w_k) + z[k+1] \qquad (5)$$

$$\hat{y}[k] = \sqrt{\rho P_S}\,h_{SD}\,\tilde{x}_S(w_k) + z[k], \qquad (6)$$

treating the interference from the codeword $\tilde{x}_S(w_{k+1})$ as noise.
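The superposition in (3) and the interference removal that leads to (4) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes real-valued Gaussian code symbols, unit-variance noise, and arbitrary illustrative values for $P_S$, $P_R$, $\rho$, and $h_{SR}$; the function names are hypothetical.

```python
import math
import random


def superpose(x_tilde_s, x_r, rho, p_s, p_r):
    """Source codeword as the weighted superposition in (3)."""
    a = math.sqrt(rho * p_s)                # weight of the fresh message w_k
    b = math.sqrt((1.0 - rho) * p_s / p_r)  # weight of the cooperative part
    return [a * xs + b * xr for xs, xr in zip(x_tilde_s, x_r)]


def remove_known_relay_term(y_r, x_r, rho, p_s, p_r, h_sr):
    """Relay side of (4): subtract the term that depends on its own codeword."""
    b = math.sqrt((1.0 - rho) * p_s / p_r)
    return [y - h_sr * b * xr for y, xr in zip(y_r, x_r)]


def average_power(x):
    return sum(v * v for v in x) / len(x)


# Illustrative values only (assumptions, not taken from the chapter).
rng = random.Random(0)
n, p_s, p_r, rho, h_sr = 50_000, 2.0, 4.0, 0.7, 0.9
x_tilde_s = [rng.gauss(0.0, 1.0) for _ in range(n)]        # unit-power symbols
x_r = [rng.gauss(0.0, math.sqrt(p_r)) for _ in range(n)]   # symbols with power P_R
z_r = [rng.gauss(0.0, 1.0) for _ in range(n)]              # unit-variance noise

x_s = superpose(x_tilde_s, x_r, rho, p_s, p_r)
y_r = [h_sr * xs + z for xs, z in zip(x_s, z_r)]
y_r_hat = remove_known_relay_term(y_r, x_r, rho, p_s, p_r, h_sr)

print(f"average source power: {average_power(x_s):.3f} (target {p_s})")
```

Because the two component codewords are independent, the empirical power of the superposed codeword concentrates around $P_S$, and after subtracting its own codeword the relay observes only the scaled fresh codeword plus noise.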
The overall code structure that results from this coding strategy is illustrated in Figure 7. It can be interpreted as the concatenation of the codes $\tilde{\mathcal{C}}_S$ and $\mathcal{C}_R$, which defines a length-$2n$ rate-$R/2$ code $\mathcal{C}$ with codewords $x = [\tilde{x}_S(w_k), x_R(w_k)]$, where the first and second segments of the codewords are transmitted over different channels. Note that the first constraint on the achievable rate in (1) is induced by the channel that is described in (4), and the second constraint results from the channels described in (5) and (6). For Gaussian codebooks, it is now straightforward to show that (1) can be reformulated as

$$R^{\mathrm{FD}}_{\mathrm{DF}} = \sup_{\rho \in [0,1]} \min\left\{ \tfrac{1}{2}\log\left(1 + \rho\,\mathrm{SNR}_{SR}\right),\; \tfrac{1}{2}\log\left(1 + \mathrm{SNR}_{SD} + 2\sqrt{(1-\rho)\,\mathrm{SNR}_{SD}\,\mathrm{SNR}_{RD}} + \mathrm{SNR}_{RD}\right) \right\}, \qquad (7)$$

Figure 7: Overall code structure resulting from regular encoding.

where we defined $\mathrm{SNR}_{ij} = h_{ij}^2 P_i$. We observe that the first bound is monotonically increasing in $\rho$, starting from zero for $\rho = 0$, and the second bound is monotonically decreasing. We can make three interesting observations regarding the optimal power allocation $\rho^*$, which affect the code design:

1. By evaluating the expression in (7) for $\rho = 1$, we see that whenever $\mathrm{SNR}_{SR} < \mathrm{SNR}_{SD} + \mathrm{SNR}_{RD}$, the optimal power allocation is $\rho^* = 1$. In this case, the link between the source and the relay limits the performance while the second constraint is inactive. It follows that the code $\tilde{\mathcal{C}}_S$ has to be capacity achieving for the source-to-relay channel. Since the second bound in (7) is not tight, the overall code $\mathcal{C}$ does not need to be capacity achieving as long as it is decodable at the destination.
2. For $\mathrm{SNR}_{SR} \ge \mathrm{SNR}_{SD} + \mathrm{SNR}_{RD}$, the optimal power allocation can be found by equating the first constraint with the second constraint. In other words, both constraints have to be satisfied simultaneously. This implies that both the code $\tilde{\mathcal{C}}_S$ and the overall code $\mathcal{C}$ need to be capacity achieving.
3. Finally, if $\mathrm{SNR}_{SR} \ge \mathrm{SNR}_{SD} + \mathrm{SNR}_{RD}$ and the power allocation is not optimally chosen, two different cases can occur:
   (a) If $\rho < \rho^*$, the first bound is tight while the second bound is loose. Hence, the code $\tilde{\mathcal{C}}_S$ has to be capacity achieving while the overall code $\mathcal{C}$ is not required to be capacity achieving as long as it is decodable at the destination.
   (b) If $\rho > \rho^*$, the second bound is tight while the first bound is loose. In this case, the overall code $\mathcal{C}$ has to be capacity achieving while the code $\tilde{\mathcal{C}}_S$ only needs to be decodable at the relay (without achieving the capacity of the source-to-relay link).
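The three observations above can be reproduced numerically from (7). The sketch below is an illustration under stated assumptions: logarithms are taken to base 2 (rates in bits per channel use), the SNR values are arbitrary, and the function names `df_bounds`, `optimal_rho`, and `tight_bound` are hypothetical. Since the first bound increases and the second decreases in $\rho$, a simple bisection finds the crossing point.

```python
import math


def df_bounds(rho, snr_sr, snr_sd, snr_rd):
    """Return the two terms inside the min of (7), in bits per channel use."""
    first = 0.5 * math.log2(1.0 + rho * snr_sr)
    cross = 2.0 * math.sqrt((1.0 - rho) * snr_sd * snr_rd)
    second = 0.5 * math.log2(1.0 + snr_sd + cross + snr_rd)
    return first, second


def optimal_rho(snr_sr, snr_sd, snr_rd, tol=1e-12):
    """Power split that maximizes the minimum of the two bounds."""
    first, second = df_bounds(1.0, snr_sr, snr_sd, snr_rd)
    if first <= second:
        return 1.0  # observation 1: the source-to-relay link is the bottleneck
    lo, hi = 0.0, 1.0  # observation 2: the bounds cross somewhere in (0, 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        first, second = df_bounds(mid, snr_sr, snr_sd, snr_rd)
        if first < second:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tight_bound(rho, snr_sr, snr_sd, snr_rd, eps=1e-9):
    """Tell which bound limits the rate for a given, possibly suboptimal, rho."""
    first, second = df_bounds(rho, snr_sr, snr_sd, snr_rd)
    if abs(first - second) <= eps:
        return "both"
    return "first" if first < second else "second"


rho_star = optimal_rho(10.0, 1.0, 1.0)
print(rho_star, df_bounds(rho_star, 10.0, 1.0, 1.0))
```

For a suboptimal $\rho$, `tight_bound` indicates which of the two codes carries the capacity-achieving requirement, mirroring observation 3.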
In the above discussion, the exact SNR constraints are only valid for Gaussian inputs, and they may hold to a good approximation for other input distributions in the very-low-SNR regime. Nevertheless, the three different design objectives for the different regimes identified above, namely

Case 1: Capacity-achieving component code $\tilde{\mathcal{C}}_S$ and decodability of the overall code $\mathcal{C}$;
Case 2: Capacity-achieving component code $\tilde{\mathcal{C}}_S$ and capacity-achieving overall code $\mathcal{C}$;
Case 3: Capacity-achieving overall code $\mathcal{C}$ and decodability of the component code $\tilde{\mathcal{C}}_S$;

will be relevant in other cases as well. For a related discussion on the special case of binary-input AWGN relay channels, we refer the reader to [14].

From the structure of the distributed code (see Figure 7) and the decoding schedule, it is apparent that the problem of designing good codes for the full-duplex relay channel with regular
encoding is closely related to the design of parallel concatenated codes and rate-compatible codes. This shows that distributed turbo-codes (see Section 3.2), which are typically considered to be an engineering approach to distributed channel coding, can indeed be related to the fundamental coding strategies provided by the information theory literature. As we will see in Section 3.1, rate-compatible code structures play an important role for the LDPC code design for the relay channel.

Full-Duplex Relaying Using Irregular Encoding

We start the discussion with the code $\mathcal{C}_R$ that is used by the relay to forward the bin index $s_{k-1}$ to the destination in time slot $k$. We assume again that superposition coding is employed for generating the codewords of the codes $\mathcal{C}_S(s_{k-1})$, as described in (3), and that the code $\tilde{\mathcal{C}}_S$, which is used for encoding $w_k$, is decodable at the relay but not at the destination. The code $\mathcal{C}_R$ is solely used as a point-to-point code in order to forward the bin index to the destination. In contrast to the previous case, it does not become part of an extended code structure. The destination will decode the bin index $s_{k-1}$ based on

$$y[k] = \sqrt{\rho P_S}\,h_{SD}\,\tilde{x}_S(w_k) + \left(\sqrt{\frac{(1-\rho)P_S}{P_R}}\,h_{SD} + h_{RD}\right) x_R(s_{k-1}) + z[k], \qquad (8)$$

treating the interference from the codeword $\tilde{x}_S(w_k)$ as noise. The optimization of the code $\mathcal{C}_R$ can be done by using standard tools like extrinsic information transfer (EXIT) charts [15] or density evolution [16], taking into account the accurate distribution of the noise-plus-interference. After successfully decoding the bin index $s_{k-1}$ and after removing the interference due to $x_R(s_{k-2})$ from the channel output $y[k-1]$, the destination decodes $w_{k-1}$ using

$$\hat{y}[k-1] = \sqrt{\rho P_S}\,h_{SD}\,\tilde{x}_S(w_{k-1}) + z[k-1]. \qquad (9)$$

Since the code $\tilde{\mathcal{C}}_S$ is not directly decodable at the destination, decoding $w_{k-1}$ is performed considering the code $\hat{\mathcal{C}}_S(s_{k-1})$, which contains the codewords corresponding to the set of messages contained in the bin $\mathcal{S}_{s_{k-1}}$,

$$\hat{\mathcal{C}}_S(s_{k-1}) = \{\tilde{x}_S(w) \in \tilde{\mathcal{C}}_S : w \in \mathcal{S}_{s_{k-1}}\}.$$

The code $\hat{\mathcal{C}}_S(s_{k-1})$ has the following properties:

1. Since each bin $\mathcal{S}_s$ contains $2^{n(R-R_0)}$ messages (the $2^{nR}$ messages are grouped into $2^{nR_0}$ bins due to the binning) and codewords of length $n$ are considered, it follows that the code $\hat{\mathcal{C}}_S(s)$ has rate $\hat{R} = R - R_0$.
2. Since for a given $s$ all codewords $\hat{x}_S \in \hat{\mathcal{C}}_S(s)$ are also codewords of the code $\tilde{\mathcal{C}}_S$, the codes $\tilde{\mathcal{C}}_S$ and $\hat{\mathcal{C}}_S(s)$ form a pair of nested codes$^1$, $\tilde{\mathcal{C}}_S$ being the fine code and $\hat{\mathcal{C}}_S(s)$ being the coarse code.

In Section 3.1.1, we will see how nested codes can be implemented with linear codes.

Footnote 1: We say that two codes $\mathcal{C}$ and $\hat{\mathcal{C}}$ are nested if $\hat{\mathcal{C}} \subset \mathcal{C}$, i.e., each codeword of $\hat{\mathcal{C}}$ is also a codeword of $\mathcal{C}$. We call $\mathcal{C}$ the fine code and $\hat{\mathcal{C}}$ the coarse code.

We can now identify the requirements that need to be satisfied to approach the boundary of the set of achievable rates in (1):

1. Whenever the first bound on the achievable rate is tight, the fine code $\tilde{\mathcal{C}}_S$ has to be capacity achieving for the source-to-relay channel. This follows from the same arguments as in the regular-encoding case.

2. Whenever the second bound in (1) is tight, the code $\mathcal{C}_R$ has to be capacity achieving for the relay-destination link specified in (8) and the coarse code $\hat{\mathcal{C}}_S(s)$ has to be capacity achieving for the source-to-destination link described in (9). As a consequence, the binning rate $R_0$ has to be equal to the capacity of the relay-to-destination link.

Clearly, whenever one of the constraints in (1) is loose, the capacity-achieving properties of the respective codes can be relaxed and sub-optimal code designs are sufficient.

Half-Duplex Relaying

For the half-duplex relay channel, the optimization of the achievable rate in (2) involves the optimization of both the time-sharing parameter $\alpha$ and the power allocation at the source (the source has to distribute its power between the first and second time slot; for the second time slot, the source has to allocate power for its own transmission as well as for the cooperative transmission). Under the assumption that $I(X_S;Y_R|X_R,T=1) > I(X_S;Y|X_R,T=2)$, we can see that the first constraint in (2) is an increasing linear function in $\alpha$. This assumption requires that the channel to the relay supports higher rates compared to the channel to the destination. It is a reasonable assumption for decode-and-forward relaying since the relay has to be able to decode at higher rates compared to the destination in order to be of any help. Since furthermore $I(X_S;Y|X_R,T=1) < I(X_S,X_R;Y|T=2)$, it is easy to conclude that the second constraint in (2) is a decreasing function in $\alpha$. For a given power allocation, the optimal time-sharing parameter $\alpha^*$ is therefore found by equating the first constraint in (2) with the second one. As a consequence, both bounds in (2) are always tight under half-duplex relaying if the time-sharing parameter is chosen optimally. This is in contrast to the full-duplex case, where the second constraint may be loose.
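Equating the two constraints gives the optimal time-sharing parameter in closed form. The sketch below assumes that the two constraints in (2) are the linear functions of $\alpha$ implied by the discussion above, namely $\alpha I(X_S;Y_R|X_R,T=1) + \bar{\alpha} I(X_S;Y|X_R,T=2)$ and $\alpha I(X_S;Y|X_R,T=1) + \bar{\alpha} I(X_S,X_R;Y|T=2)$. The mutual information values and the function names are hypothetical and serve only as an illustration.

```python
def hd_constraints(alpha, i_sr_1, i_sd_2, i_sd_1, i_joint_2):
    """Assumed form of the two half-duplex constraints as functions of alpha.

    i_sr_1    : I(X_S; Y_R | X_R, T=1)
    i_sd_2    : I(X_S; Y | X_R, T=2)
    i_sd_1    : I(X_S; Y | X_R, T=1)
    i_joint_2 : I(X_S, X_R; Y | T=2)
    """
    first = alpha * i_sr_1 + (1.0 - alpha) * i_sd_2      # increasing in alpha
    second = alpha * i_sd_1 + (1.0 - alpha) * i_joint_2  # decreasing in alpha
    return first, second


def optimal_alpha(i_sr_1, i_sd_2, i_sd_1, i_joint_2):
    """Time-sharing parameter at which both constraints coincide."""
    if not (i_sr_1 > i_sd_2 and i_sd_1 < i_joint_2):
        raise ValueError("monotonicity assumptions of the text are violated")
    alpha = (i_joint_2 - i_sd_2) / ((i_sr_1 - i_sd_2) + (i_joint_2 - i_sd_1))
    return min(1.0, max(0.0, alpha))  # keep alpha a valid fraction of channel uses


# Hypothetical mutual information values in bits per channel use.
alpha_star = optimal_alpha(2.0, 0.8, 1.0, 2.5)
print(alpha_star, hd_constraints(alpha_star, 2.0, 0.8, 1.0, 2.5))
```

With these example numbers both constraints evaluate to the same rate at the returned $\alpha$, which is the situation in which both bounds are tight.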
If a suboptimal split of the channel uses between the first and second time slots is considered, the situation becomes similar to full-duplex relaying with a suboptimal power allocation, as discussed above: for $\alpha < \alpha^*$, only the first constraint needs to be considered when specifying the code design, and for $\alpha > \alpha^*$, only the second constraint needs to be taken into account.

The optimal code design can now be found using the same arguments as in the previous discussion. The source uses the code $\mathcal{C}_{S,1}$ during the first time slot for transmitting $u$ to the relay. In the second time slot, the relay uses the code $\mathcal{C}_R$ for transmitting the bin index $s$, and the source encodes $v$ using the code $\tilde{\mathcal{C}}_{S,2}$. The code $\mathcal{C}_{S,2}(s)$ is then obtained by superposing codewords of the codes $\tilde{\mathcal{C}}_{S,2}$ and $\mathcal{C}_R$ similarly to Equation (3). The destination decodes $u$ by considering the code $\hat{\mathcal{C}}_S(s)$, which is obtained by restricting the codeword set of $\mathcal{C}_{S,1}$ to codewords that are included in the bin $\mathcal{S}_s$. $\mathcal{C}_{S,1}$ and $\hat{\mathcal{C}}_S(s)$ form a pair of nested codes similar to the previous case. We can now conclude the following design objectives:

1. In order to achieve the first bound in (2), $\mathcal{C}_{S,1}$ and $\tilde{\mathcal{C}}_{S,2}$ have to be designed to be capacity achieving for the interference-free source-to-relay channel in the first time slot and the interference-free source-to-destination channel in the second time slot, respectively.
2. Since $\tilde{\mathcal{C}}_{S,2}$ is designed to achieve the capacity of the interference-free source-to-destination link in the second time slot, $\mathcal{C}_R$ has to achieve the capacity of the relay-destination link in the presence of interference from the codewords transmitted from the source during the second time slot in order to reach the second constraint in (2).
3. Achieving the second constraint in (2) requires furthermore that the code $\hat{\mathcal{C}}_S(s)$ with rate $\hat{R} = R_{S,1} - R_0/\alpha$ is capacity achieving over the interference-free source-to-destination link
during the first time slot. Therefore, both the fine code $\mathcal{C}_{S,1}$ and the coarse code $\hat{\mathcal{C}}_S(s)$ have to be designed to be capacity achieving.

As a final remark, we note that the code structure that is illustrated in Figure 7 and that we discussed in the full-duplex case can also be adopted for the half-duplex scenario. That is, the half-duplex rates are also achievable without explicit binning. Similarly to the full-duplex case, the destination considers the code $\mathcal{C}$, with rate $R = \alpha R_{S,1}$ and codewords $x(u) = [x_{S,1}(u), x_R(u)]$, when decoding the message $u$, and it decodes using the channel observations of both time slots while treating the interference from $x_{S,2}$ as noise. After successfully decoding $u$, the message $v$ is decoded based on the interference-free channel outputs in the second time slot. This coding scheme leads to the highest achievable rates if both the code $\mathcal{C}_{S,1}$ and the extended code $\mathcal{C}$ are capacity achieving for the considered channels and the respective rates.

3 Distributed Coding for the Three-node Relay Channel

In Section 2.3, we summarized the fundamental coding strategies that achieve the decode-and-forward rates in the three-node relay channel. We identified the different component codes that have to be used during the transmission, and we stated fundamental constraints that limit the achievable rates. In this section, we discuss distributed coding for the three-node relay channel. In particular, our focus is on the extension of the two main families of modern codes, LDPC codes and turbo codes, to the relaying scenario.

3.1 LDPC Code Designs for the Relay Channel

Distributed coding for relaying based on LDPC codes is highly inspired by the information-theoretic analysis addressed in Section 2.3. In this section, we discuss different code structures based on LDPC codes that are useful for implementing the coding strategies introduced in Section 2.3. We also discuss optimization of irregular LDPC codes and spatially-coupled LDPC (SC-LDPC) codes.
3.1.1 Code Structures for Decode-and-Forward Relaying

In Section 2.3, we showed that in both the full-duplex and the half-duplex cases, the highest achievable rate under decode-and-forward relaying can be achieved either by using rate-compatible code structures or by using a binning scheme, which leads to a nested code design. In the following, we introduce code structures that can be employed to realize the desired coding schemes. We start with different implementations of the binning scheme.

Nested Codes

Binning was introduced in Section 2.3.1 as a random partitioning of the message set of the source, performed at the relay. In Section 2.3.3, we have then shown that restricting the code used by the source to codewords contained in the bin, which is indicated by the relay, defines a pair of nested codes. Since we are interested in the design of linear codes, the question that arises is how pairs of good nested linear codes can be constructed. The answer to this question is given in [17], and we summarize the main points here. Let us consider a pair of length-$n$ nested codes $(\mathcal{C}, \hat{\mathcal{C}})$ with rates $R$ and $\hat{R}$, respectively, such that $\hat{\mathcal{C}} \subset \mathcal{C}$. It follows that $\hat{R} < R$. In the following, let $H_1$ denote the $(n-k_1) \times n$ parity-check matrix
that defines $\mathcal{C}$ and let $H$ be the $(n-k_2) \times n$ parity-check matrix of the code $\hat{\mathcal{C}}$. Accordingly, $H_1 x_1^T = 0$ for all $x_1 \in \mathcal{C}$, and $H x_2^T = 0$ for all $x_2 \in \hat{\mathcal{C}}$. Then, if

$$H = \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}, \qquad (10)$$

where $H_2$ is a $(k_1-k_2) \times n$ matrix, the codes $\mathcal{C}$ and $\hat{\mathcal{C}}$ indeed form a pair of nested linear codes, satisfying $\hat{\mathcal{C}} \subset \mathcal{C}$. This follows directly from the fact that the parity-check matrix $H_1$ is included in $H$, hence $H_1 x_2^T = 0$ for all $x_2 \in \hat{\mathcal{C}}$. On the other hand, it is clear that only some codewords $x_1 \in \mathcal{C}$ are also included in the coarse code $\hat{\mathcal{C}}$. Since the additional constraints defined by $H_2$ remove codewords from the codeword set of $\mathcal{C}$, the code $\hat{\mathcal{C}}$ is referred to as an expurgated code.

Furthermore, the parity-check matrix $H$, as specified in (10), describes a bilayer linear block code, also referred to as a two-edge type code. These two terms are motivated by the structure of the parity-check matrix and the corresponding Tanner graph, which is illustrated in Figure 8 for a simple example. As can be seen from the figure, the Tanner graph consists of three different types of nodes: the variable nodes, the check nodes associated with the parity-check matrix $H_1$, and the check nodes associated with the parity-check matrix $H_2$, which are connected through two different types (layers) of edges. The two layers are distinguished by solid and dashed lines in Figure 8. In the considered example, $H_1$ is the parity-check matrix of a rate-2/3 code with regular variable node degree $d_{v,1} = 2$ and check node degree $d_{c,1} = 6$. By adding the check nodes specified by the matrix $H_2$, an overall rate-1/2 code with regular variable node degree $d_v = 3$ and check node degree $d_c = 6$ is obtained.

Figure 8: Tanner graph of a bilayer/two-edge type expurgated code.

In order to implement a binning scheme based on this code structure, we can assign to every codeword $x_1 \in \mathcal{C}$ a length-$(k_1-k_2)$ syndrome $s$, defined as $s = H_2 x_1^T$.
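The nested structure in (10) and the use of the syndrome as a bin label can be verified by brute force on a toy code. The sketch below is only an illustration under stated assumptions: the matrices $H_1$ and $H_2$ are small hypothetical examples (length 5, not LDPC matrices from the chapter), all arithmetic is over GF(2), and the helper names are made up for this example.

```python
from itertools import product


def syndrome(h, x):
    """Compute h * x^T over GF(2) for a matrix given as a list of rows."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in h)


def codewords(h, n):
    """Enumerate all length-n binary vectors with zero syndrome w.r.t. h."""
    zero = (0,) * len(h)
    return [x for x in product((0, 1), repeat=n) if syndrome(h, x) == zero]


# Hypothetical toy matrices: n = 5, k1 = 4, k2 = 2.
n = 5
H1 = [(1, 1, 1, 1, 1)]                     # defines the fine code
H2 = [(1, 1, 0, 0, 0), (0, 0, 1, 1, 0)]    # extra checks of the coarse code
H = H1 + H2                                # stacked matrix as in (10)

fine = codewords(H1, n)
coarse = codewords(H, n)

# Group the fine codewords by the syndrome computed from the rows of H2.
bins = {}
for x in fine:
    bins.setdefault(syndrome(H2, x), []).append(x)

print(len(fine), len(coarse), {s: len(b) for s, b in bins.items()})
```

In this toy example the 16 fine codewords fall into four bins of four codewords each, and the bin with the all-zero syndrome is exactly the coarse code.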
Since there exist $2^{k_1-k_2}$ unique syndrome vectors $s$, we can define a set of cosets of the coarse code $\hat{\mathcal{C}}$, with elements $\hat{\mathcal{C}}(s)$ labeled by the syndrome vectors $s$ as follows:

$$\hat{\mathcal{C}}(s) = \left\{ x : H x^T = \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} x^T = \begin{bmatrix} 0 \\ s \end{bmatrix} \right\}.$$

The union of all cosets $\hat{\mathcal{C}}(s)$ reproduces the fine code $\mathcal{C}$, i.e.,

$$\mathcal{C} = \bigcup_{s \in \{0,1\}^{k_1-k_2}} \hat{\mathcal{C}}(s).$$

We conclude that the cosets $\hat{\mathcal{C}}(s)$ provide a structured approach for partitioning the fine code $\mathcal{C}$ into $2^{k_1-k_2}$ disjoint bins of equal size. The bin index for a given codeword $x \in \mathcal{C}$ is then given by the corresponding syndrome $s$, and it can easily be calculated as $s = H_2 x^T$.

This code structure can now be applied to the relay channel in the following way (see, e.g., [14, 18–21]): The source uses the fine code $\mathcal{C}$ for transmitting its message (i.e., $\mathcal{C}$ corresponds to