The Case for Transmitter Training

Christopher Steger, Ahmad Khoshnevis, Ashutosh Sabharwal, and Behnaam Aazhang
Department of Electrical and Computer Engineering
Rice University, Houston, TX 77005, USA
Email: {cbs, farbod, ashu, aaz}@rice.edu

Abstract

Transmitter side information enables techniques such as beamforming, power control, and rate control in fading channels. It is commonly accepted in the literature that the addition of transmitter information (CSIT) to receiver information (CSIR) provides better performance than receiver information alone. In this work, we examine the performance of a symmetric, single-input, multiple-output (SIMO) channel in which CSIT is acquired through the use of training symbols and the receiver is genie-aided. We give a closed-form expression for outage probability at high SNR while accounting for the resources consumed by training. We also analyze the diversity-multiplexing tradeoff and find that, though the diversity falls far below that of systems with perfect CSIT, it is still sufficiently superior to that achieved by CSIR-only systems to justify the cost of training. We show that, at zero multiplexing, transmitter training doubles the diversity order of a CSIR-only system and offers nonzero diversity at all achievable multiplexing gains.

I. INTRODUCTION

The utility of transmitter channel state information (CSIT) is well documented [1], [2], [3], [4], [5], [6], [7], and it is used in many practical systems in the form of beamforming, power control, and rate control. Existing analytical comparisons between systems with receiver channel state information (CSIR) only and systems with perfect CSIT in addition to CSIR have limited applicability to practical systems because they account neither for the resources consumed by providing channel state information to the transmitter, whether through feedback or reverse training, nor for the possible errors in that information. In this work, we ask and answer the following questions.
When we account for the resources expended in obtaining CSIT and the associated estimation errors, does transmitter training still provide significant gains over CSIR-only systems? Under what conditions, if any, can we say that training-based CSIT provides performance comparable to perfect CSIT systems?

In our proposed framework, we demonstrate that it is indeed beneficial to expend resources on transmitter training, subject to a sufficiently long coherence time. Given $M$ receive antennas, we find that transmitter training yields diversity $d(r) = M\left(2 - \frac{r}{1 - ML/T}\right)$ for $r \le 1 - ML/T$, versus only $d(r) = M(1 - r)$ for $r \le 1$ in a CSIR-only system, where $d$ is the diversity order, $r$ is the multiplexing gain, $T$ is the coherence time, and $LM$ is the total number of transmitter training symbols. In contrast, perfect CSIT models predict infinite diversity [], []. Therefore, we show that systems with transmitter training are not well approximated by any existing models.

The remainder of the work is structured as follows. Section II defines our channel model and basic assumptions. Section III contains our principal contributions. We numerically evaluate the outage performance of our system in Section III-E. Section IV contains a summary of our results and ideas for future extensions.

II. CHANNEL AND SYSTEM MODELS

A. Channel Model

We consider a single-input, $M > 1$ output (SIMO) channel with flat fading and additive Gaussian noise. We define the channel realization as $H \in \mathbb{C}^{M}$. Let $s$ be the complex scalar information-bearing signal, $Y$ be the channel output vector, and $N$ be the additive noise. The input-output relationship is given by

$$Y = Hs + N. \tag{1}$$

Every element of the channel and noise vectors is an iid, zero-mean, circularly symmetric, complex Gaussian (ZMCSCG) random variable. The channel is assumed to be block fading with a coherence time of $T$ symbol intervals, and each channel realization is independent of previous realizations.
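The block-fading SIMO model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names `zmcscg` and `simo_block`, the unit noise variance, and the values of `M` and `T` are all hypothetical choices made for the example.

```python
import math
import random


def zmcscg(rng, variance=1.0):
    """Draw one zero-mean circularly symmetric complex Gaussian sample."""
    s = math.sqrt(variance / 2.0)  # variance is split evenly between real and imaginary parts
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simo_block(rng, M, T, symbols):
    """Simulate one coherence block: H is drawn once and held for T symbol intervals."""
    assert len(symbols) == T
    H = [zmcscg(rng) for _ in range(M)]            # channel realization, fixed within the block
    Y = []
    for s in symbols:
        N = [zmcscg(rng) for _ in range(M)]        # fresh noise (assumed unit variance) per symbol
        Y.append([H[i] * s + N[i] for i in range(M)])  # input-output relation Y = H s + N
    return H, Y


rng = random.Random(0)
M, T = 4, 10                                        # illustrative values only
H, Y = simo_block(rng, M, T, [1 + 0j] * T)

# Empirical check: the mean of |h_i|^2 should be close to the assumed unit variance.
gains = [abs(zmcscg(rng)) ** 2 for _ in range(20000)]
mean_gain = sum(gains) / len(gains)
```

Calling `simo_block` again with the same generator draws a new, independent channel realization, which is the block-fading assumption in code form.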
The assumption of block fading is essential to our results because it eliminates error due to changes in the channel between the times of estimation and data reception, and it makes the error variance dependent on SNR alone.

B. System Model

We assume that the transmitter estimates the channel based on training symbols, but the receiver has perfect knowledge of both the channel and the transmitter's estimate of the channel. For the remainder of the work, we refer to such a receiver as being genie-aided. Our system is bidirectional and time-division duplexed (TDD) such that signals can be sent from the transmitter to the receiver and from the receiver to the transmitter on the same channel. We further assume that the receiver sends training in advance of every transmission. Thus, out of every $T$ symbol intervals, $LM$ are dedicated to training the transmitter, with $L$ symbols for each of the $M$ separate channels. We require that our training symbols be transmitted at a constant power equal to the average power constraint, $\bar{P}$, of the transmitter.

We capture the inherent uncertainty in channel estimation by dividing the channel, $H$, into two parts, $\hat{H}$ and $\tilde{H}$, such that

$$H = \hat{H} + \tilde{H}. \tag{2}$$

We consider $\hat{H}$ to be the part of the channel that is known, the estimate, and $\tilde{H}$ to be the part of the channel that is unknown, the estimation error [8]. We introduce the following terms to describe the channel: $\gamma_i = |h_i|^2$, $\hat{\gamma}_i = |\hat{h}_i|^2$, and $\tilde{\gamma}_i = |\tilde{h}_i|^2$. Because we assume MMSE channel estimation, the variance of the channel estimation error is governed by the Cramér-Rao Lower Bound (CRLB) [9]. Thus, we have

$$\sigma_{\tilde{h}}^2 = \frac{\sigma_n^2}{L\bar{P}}, \tag{3}$$

in which $L$ represents the number of training symbols per channel, $\sigma_n^2$ is the variance of the additive Gaussian noise, and $\bar{P}$ is the transmission power. This model contrasts with previous works such as [] and [] in that the channel estimation error decreases monotonically with increasing SNR. For simplicity, we let $\sigma_n^2 = 1$.

When the transmitter has channel state information (CSIT), it attempts to avoid outages through the use of power control while still satisfying an average power constraint, $\bar{P}$. We define our power control function as $P(\hat{H})$ such that the amplitude of the transmitted signal is scaled by $A(\hat{H}) = \sqrt{P(\hat{H})}$, and the ZMCSCG input prior to power control is $x$. Thus, with $s = A(\hat{H})x$, (1) becomes

$$Y = HA(\hat{H})x + N. \tag{4}$$

We assume that $x$ has unit variance.

In order to maximize the received SNR, the receiver performs maximal ratio combining (MRC). We define the following three terms to describe the output of the maximal ratio combiner. The effective channel seen at the output of the combiner is $\gamma = \sum_{i=1}^{M} \gamma_i$. The transmitter's estimate of $\gamma$ is $\hat{\gamma} = \sum_{i=1}^{M} \hat{\gamma}_i$. For convenience, we define $\tilde{\gamma} = \sum_{i=1}^{M} \tilde{\gamma}_i$. Note that $\gamma \neq \hat{\gamma} + \tilde{\gamma}$ in general. Next, we examine $\hat{\gamma}$:

$$\hat{\gamma} = \sum_{i=1}^{M} |\hat{h}_i|^2 = \sum_{i=1}^{M} |h_i - \tilde{h}_i|^2 = \gamma - \sum_{i=1}^{M} \left( \langle h_i, \tilde{h}_i \rangle + \langle \tilde{h}_i, h_i \rangle \right) + \tilde{\gamma}. \tag{5}$$

Direct analysis of $\hat{\gamma}$ proves unwieldy, so we define bounds $\hat{\gamma}_L$ and $\hat{\gamma}_U$ such that $\hat{\gamma}_L \le \hat{\gamma} \le \hat{\gamma}_U$. We construct our bounds by applying the Cauchy-Schwarz inequality,

$$\left| \sum_{i=1}^{M} \langle h_i, \tilde{h}_i \rangle \right|^2 \le \sum_{i=1}^{M} |h_i|^2 \sum_{i=1}^{M} |\tilde{h}_i|^2 \tag{6}$$

$$= \gamma\tilde{\gamma}. \tag{7}$$

Therefore, we can define $\hat{\gamma}_L = \left(\sqrt{\gamma} - \sqrt{\tilde{\gamma}}\right)^2$ and $\hat{\gamma}_U = \left(\sqrt{\gamma} + \sqrt{\tilde{\gamma}}\right)^2$.

Finally, we state the probability density functions of $\gamma$, $\hat{\gamma}$, and $\tilde{\gamma}$. The $\chi^2_m$ distribution applies nominally to the sum of $m$ squared iid Gaussian random variables with zero mean and unit variance.
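To account for nonunit variances, we have

$$f_{\chi^2_m}(x) = \frac{1}{\sigma^{m}\,\Gamma(m/2)}\, x^{m/2 - 1} e^{-x/\sigma^2}, \tag{8}$$

in which we call the variance $\sigma^2$ and we have $m$ degrees of freedom. Substituting in the variances $\sigma_h^2 = 1$, $\sigma_{\tilde{h}}^2 = \sigma^2$ from (3), and $\sigma_{\hat{h}}^2 = 1 + \sigma^2$, and our $2M$ degrees of freedom, we have the following PDFs:

$$f_{\gamma}(\gamma) = \frac{1}{\Gamma(M)}\, \gamma^{M-1} e^{-\gamma} \tag{9}$$

$$f_{\tilde{\gamma}}(\tilde{\gamma}) = \frac{1}{\sigma^{2M}\,\Gamma(M)}\, \tilde{\gamma}^{M-1} e^{-\tilde{\gamma}/\sigma^2} \tag{10}$$

$$f_{\hat{\gamma}}(\hat{\gamma}) = \frac{1}{(1 + \sigma^2)^{M}\,\Gamma(M)}\, \hat{\gamma}^{M-1} e^{-\hat{\gamma}/(1 + \sigma^2)}. \tag{11}$$

III. RESULTS

A. Power Control

Recalling that the receiver is genie-aided, we have the following expression for channel capacity [7]:

$$C = \log\left(1 + P(\hat{\gamma})\gamma\right). \tag{12}$$

To account for the fact that $ML$ out of every $T$ symbol intervals are consumed by training symbols, we place a factor of $(1 - ML/T)$ in front of the log term. By inverting the expression and letting $\hat{\gamma} = \gamma$, we find the minimum power that achieves a target rate, $R$:

$$P_{OPT} = \frac{e^{R/(1 - ML/T)} - 1}{\gamma}. \tag{13}$$

In [], the authors showed that such a power control scheme can achieve zero probability of outage for a sufficiently large average power constraint, $\bar{P}$. In order to minimize the probability of falling below the target rate, we would like to approximate the optimal power control scheme. However, any power control system that relies on an imperfect estimate of the channel has the possibility of providing insufficient power. To mitigate the effect of the uncertain knowledge, we propose the following suboptimal power control scheme for systems with imperfect CSIT:

$$P(\hat{\gamma}) = k\, \frac{e^{R/(1 - ML/T)} - 1}{\hat{\gamma}}. \tag{14}$$

We define $k$ to be an increasing function of the average power constraint, $\bar{P}$, such that $k \ge 1$ for all $\bar{P}$ of interest. Thus, in the presence of perfect CSIT, our power control would supply at least as much power as the optimal system. Intuitively, $k$ serves as a multiplicative safety margin against supplying insufficient power due to channel estimation errors. Though we do not present the results in this work, an additive safety margin produces essentially identical performance.

Power Constraint: We now find $E_{\hat{\gamma}}[P(\hat{\gamma})]$ as a function of $k$ as follows,

$$E_{\hat{\gamma}}[P(\hat{\gamma})] = \int_0^{\infty} P(\hat{\gamma}) f_{\hat{\gamma}}(\hat{\gamma})\, d\hat{\gamma} \tag{15}$$

$$= \frac{k\left(e^{R/(1 - ML/T)} - 1\right)}{(1 + \sigma^2)(M - 1)} \tag{16}$$

$$\approx \frac{k\left(e^{R/(1 - ML/T)} - 1\right)}{M - 1}. \tag{17}$$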

We make the last approximation because we are interested in large $\bar{P}$, where $\sigma^2 \ll 1$. Next, we set the average power equal to the constraint $\bar{P}$ and solve for $k$ as

$$k = \frac{(M - 1)\bar{P}}{e^{R/(1 - ML/T)} - 1}. \tag{18}$$

Now, we use (18) to restate (14) in terms of $M$ and $\bar{P}$ as

$$P(\hat{\gamma}) = \frac{(M - 1)\bar{P}}{\hat{\gamma}}. \tag{19}$$

Examining (19) reveals that increasing $M$ allows the transmitter to use increased power over all channel states. Intuitively, increasing $M$ decreases the probability of a deep fade, so the transmitter can redistribute the power saved by avoiding deep fades to serve as a larger safety margin against CSIT errors.

B. Outage Probability

We assume that our coherence time, $T$, is sufficiently long that the dominant source of error is fading-induced error. Thus, we require $T$ such that the error rate associated with the blocklength of the data codeword is no larger than the error rate due to fading-induced error. Therefore, if $B_{min}$ is the minimum blocklength, we require that $T - LM > B_{min}$. Our first main result describes outage events in our system.

Theorem 3.1 (Outage Events): Given a system in which the average power constraint is sufficiently large that outage probability equal to zero is achievable in the presence of perfect CSIT and CSIR, a system with imperfect CSIT and a genie-aided receiver will be in outage if and only if $P(\hat{\gamma}) < P_{OPT}$.

Proof [of Theorem 3.1]: An outage event is a state in which the target data rate, $R$, is above the rate supported by the channel. Because we are using $ML$ of our $T$ symbol intervals for training, our data rate is equal to the mutual information of the channel scaled by $(1 - ML/T)$. Thus, we have the following series of equivalent inequalities:

$$
\begin{aligned}
\left(1 - \frac{ML}{T}\right) \log\left(1 + P(\hat{\gamma})\gamma\right) &\overset{(a)}{<} R \\
\log\left(1 + P(\hat{\gamma})\gamma\right) &\overset{(b)}{<} \frac{R}{1 - ML/T} \\
\log\left(1 + P(\hat{\gamma})\gamma\right) &\overset{(c)}{<} \log\left(1 + e^{R/(1 - ML/T)} - 1\right) \\
P(\hat{\gamma})\gamma &\overset{(d)}{<} e^{R/(1 - ML/T)} - 1 \\
P(\hat{\gamma}) &\overset{(e)}{<} P_{OPT}.
\end{aligned} \tag{20}
$$

Inequality (a) is the conventional outage definition, and (b) is division of both sides by $(1 - ML/T)$, which we know to be positive. In (c), the right side is exactly equal to the right side in (b). By exponentiating and subtracting 1 from both sides, we have step (d), and (e) follows by dividing both sides by $\gamma > 0$ and substituting (13). Therefore, we have that $P(\hat{\gamma}) < P_{OPT}$ is both necessary and sufficient to declare an outage.
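The power control rule and the outage equivalence of the first theorem can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names, the argument order, and the example values (two antennas, one training symbol per antenna, a coherence time of 20, a rate of 1 nat, average power 10) are hypothetical, and `gamma` and `gamma_hat` stand for the effective channel gain and the transmitter's estimate of it.

```python
import math


def training_factor(M, L, T):
    """Fraction of the coherence interval left for data, 1 - ML/T."""
    return 1.0 - M * L / T


def p_opt(gamma, R, M, L, T):
    """Minimum power needed to support rate R on the true effective channel gamma."""
    return (math.exp(R / training_factor(M, L, T)) - 1.0) / gamma


def safety_margin(P_bar, R, M, L, T):
    """High-SNR multiplicative margin k chosen to meet the average power constraint."""
    return (M - 1) * P_bar / (math.exp(R / training_factor(M, L, T)) - 1.0)


def power_control(gamma_hat, P_bar, R, M, L, T):
    """Suboptimal power control driven by the estimate gamma_hat."""
    k = safety_margin(P_bar, R, M, L, T)
    return k * (math.exp(R / training_factor(M, L, T)) - 1.0) / gamma_hat


def in_outage_by_rate(gamma, gamma_hat, P_bar, R, M, L, T):
    """Conventional outage test: the supported rate falls below the target rate."""
    P = power_control(gamma_hat, P_bar, R, M, L, T)
    return training_factor(M, L, T) * math.log(1.0 + P * gamma) < R


def in_outage_by_power(gamma, gamma_hat, P_bar, R, M, L, T):
    """Equivalent outage test: the allocated power falls below the minimum power."""
    return power_control(gamma_hat, P_bar, R, M, L, T) < p_opt(gamma, R, M, L, T)


# Hypothetical operating point used only for illustration.
params = dict(P_bar=10.0, R=1.0, M=2, L=1, T=20)
grid = [(g, gh) for g in (0.05, 0.1, 0.5, 1.0, 2.0) for gh in (0.2, 1.0, 3.0)]
agree = all(
    in_outage_by_rate(g, gh, **params) == in_outage_by_power(g, gh, **params)
    for g, gh in grid
)
```

With these values the allocated power simplifies to `(M - 1) * P_bar / gamma_hat`, so an overestimated channel (large `gamma_hat`, small `gamma`) is exactly the case in which the transmitter supplies too little power.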
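Next, we state our second main result, which defines the probability of outage of our system at high SNR.

Theorem 3.2 (Outage Probability): At high SNR, the outage probability, $\Pi$, of a system with power control function (14) is given by

$$\Pi = \mathrm{Prob}\left(k\gamma < \tilde{\gamma}\right). \tag{21}$$

Proof [of Theorem 3.2]: We begin by stating the outage probability, $\Pi$, in terms of Theorem 3.1:

$$\Pi = \mathrm{Prob}\left(P(\hat{\gamma}) < P_{OPT}\right) = \mathrm{Prob}\left(k\gamma < \hat{\gamma}\right). \tag{22}$$

This outage probability is analytically intractable, so we use our bounds on $\hat{\gamma}$ to define $\Pi_U = \mathrm{Prob}(k\gamma < \hat{\gamma}_U)$ and $\Pi_L = \mathrm{Prob}(k\gamma < \hat{\gamma}_L)$ such that $\Pi_L \le \Pi \le \Pi_U$. Fortunately, analysis of the two bounds is far more tractable. We begin with the lower bound,

$$\Pi_L = \mathrm{Prob}\left(k\gamma < \hat{\gamma}_L\right) \tag{23}$$

$$= \mathrm{Prob}\left(k\gamma < \left(\sqrt{\gamma} - \sqrt{\tilde{\gamma}}\right)^2\right) \tag{24}$$

$$= \mathrm{Prob}\left(\sqrt{k\gamma} < \sqrt{\gamma} - \sqrt{\tilde{\gamma}}\right) + \mathrm{Prob}\left(\left(\sqrt{k} + 1\right)\sqrt{\gamma} < \sqrt{\tilde{\gamma}},\ \sqrt{\gamma} < \sqrt{\tilde{\gamma}}\right). \tag{25}$$

Recall that $k \ge 1$, so $\sqrt{k} \ge 1$, and $\sqrt{k\gamma} \ge \sqrt{\gamma} - \sqrt{\tilde{\gamma}}$. Therefore, the first probability term is zero, and

$$\Pi_L = \mathrm{Prob}\left(\left(\sqrt{k} + 1\right)\sqrt{\gamma} < \sqrt{\tilde{\gamma}},\ \sqrt{\gamma} < \sqrt{\tilde{\gamma}}\right). \tag{26}$$

We observe that $\sqrt{k} + 1 > 1$, so the first event implies the second, and we can simplify our expression further:

$$\Pi_L = \mathrm{Prob}\left(\left(\sqrt{k} + 1\right)^2 \gamma < \tilde{\gamma}\right) \tag{27}$$

$$\approx \mathrm{Prob}\left(k\gamma < \tilde{\gamma}\right). \tag{28}$$

We make the final approximation for large $k$, which is equivalent to high SNR. Now, we turn our attention to $\Pi_U$,

$$\Pi_U = \mathrm{Prob}\left(k\gamma < \hat{\gamma}_U\right) \tag{29}$$

$$= \mathrm{Prob}\left(k\gamma < \left(\sqrt{\gamma} + \sqrt{\tilde{\gamma}}\right)^2\right) \tag{30}$$

$$= \mathrm{Prob}\left(\left(\sqrt{k} - 1\right)^2 \gamma < \tilde{\gamma}\right) \tag{31}$$

$$\approx \mathrm{Prob}\left(k\gamma < \tilde{\gamma}\right). \tag{32}$$

As $k$ grows large and $\sqrt{k} \pm 1 \approx \sqrt{k}$, we can see that the upper and lower bounds on outage probability converge to $\mathrm{Prob}(k\gamma < \tilde{\gamma})$. Therefore, we have an exact expression for the probability of outage at high SNR, as desired.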
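The sandwich argument in the proof above can be checked with a small Monte Carlo experiment. The following is a sketch under explicit assumptions rather than a reproduction of the authors' evaluation: the estimation error is drawn independently of the channel (consistent with an estimate variance equal to one plus the error variance), the estimate is formed as channel minus error, and the names `outage_counts`, `k`, and `sigma2` as well as the parameter values are hypothetical.

```python
import math
import random


def zmcscg(rng, variance=1.0):
    """Draw one zero-mean circularly symmetric complex Gaussian sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def outage_counts(M, k, sigma2, trials, seed=0):
    """Count outage events for the true estimate, its two bounds, and the high-SNR form."""
    rng = random.Random(seed)
    low = exact = high = approx = 0
    for _ in range(trials):
        h = [zmcscg(rng) for _ in range(M)]              # true channel, unit variance
        err = [zmcscg(rng, sigma2) for _ in range(M)]    # estimation error (assumed independent)
        h_hat = [a - b for a, b in zip(h, err)]          # estimate = channel - error
        g = sum(abs(a) ** 2 for a in h)                  # effective channel after MRC
        g_err = sum(abs(a) ** 2 for a in err)
        g_hat = sum(abs(a) ** 2 for a in h_hat)
        g_low = (math.sqrt(g) - math.sqrt(g_err)) ** 2   # Cauchy-Schwarz lower bound
        g_high = (math.sqrt(g) + math.sqrt(g_err)) ** 2  # Cauchy-Schwarz upper bound
        low += int(k * g < g_low)
        exact += int(k * g < g_hat)
        high += int(k * g < g_high)
        approx += int(k * g < g_err)                     # event used at high SNR
    return low, exact, high, approx


# Illustrative parameters only; a small k keeps outage events frequent enough to count.
low, exact, high, approx = outage_counts(M=2, k=4.0, sigma2=0.5, trials=20000)
```

Because the bounds hold sample by sample, the three counts are ordered for any seed; raising `k` in this sketch moves `low` and `high` toward `approx`, mirroring the convergence claimed at high SNR.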

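C. Diversity Order

Diversity, as defined in (33), describes the rate at which outage probability decays as SNR grows large [13], [6]:

$$d = -\lim_{\bar{P} \to \infty} \frac{\log \Pi(\bar{P}, R)}{\log \bar{P}}. \tag{33}$$

In our third main result, we give the maximum diversity order achieved by our system.

Theorem 3.3 (Diversity Order): In a SIMO system with $M$ receive antennas that gains partial CSIT through training and has a genie-aided receiver, we have $d = 2M$.

Proof [of Theorem 3.3]: We begin by evaluating our expression for $\Pi$ from Theorem 3.2:

$$\Pi = \mathrm{Prob}\left(k\gamma < \tilde{\gamma}\right) \tag{34}$$

$$= \int_0^{\infty} f_{\gamma}(\gamma) \int_{k\gamma}^{\infty} f_{\tilde{\gamma}}(\tilde{\gamma})\, d\tilde{\gamma}\, d\gamma. \tag{35}$$

In order to keep the notation manageable, we will start with the inner integral alone. From an integral table, we have the following [14]:

$$\int x^m e^{ax}\, dx = e^{ax} \sum_{r=0}^{m} (-1)^r \frac{m!\, x^{m-r}}{(m - r)!\, a^{r+1}}. \tag{36}$$

We solve the inner integral as

$$\int_{k\gamma}^{\infty} f_{\tilde{\gamma}}(\tilde{\gamma})\, d\tilde{\gamma} = \frac{1}{\sigma^{2M}\,\Gamma(M)} \int_{k\gamma}^{\infty} \tilde{\gamma}^{M-1} e^{-\tilde{\gamma}/\sigma^2}\, d\tilde{\gamma} \tag{37}$$

$$= \frac{e^{-k\gamma/\sigma^2}}{\sigma^{2M}\,\Gamma(M)} \sum_{n=0}^{M-1} \frac{(M-1)!}{(M-1-n)!}\, (k\gamma)^{M-1-n}\, \sigma^{2(n+1)}. \tag{38}$$

Now, we evaluate the outer integral, substituting (9) for $f_{\gamma}(\gamma)$:

$$\Pi = \frac{(M-1)!}{\sigma^{2M}\,\Gamma(M)\,\Gamma(M)} \sum_{n=0}^{M-1} \frac{k^{M-1-n}\, \sigma^{2(n+1)}}{(M-1-n)!} \int_0^{\infty} \gamma^{M-1}\, \gamma^{M-1-n}\, e^{-\gamma}\, e^{-k\gamma/\sigma^2}\, d\gamma. \tag{39}$$

Next, we apply (36) to the integral in (39) and get the following, in which we define $M' = 2M - n - 2$ for clarity:

$$\int_0^{\infty} \gamma^{M'} e^{-\left(1 + k/\sigma^2\right)\gamma}\, d\gamma = \left[ -e^{-\left(1 + k/\sigma^2\right)\gamma} \sum_{m=0}^{M'} \frac{M'!\, \gamma^{M'-m}}{(M' - m)!\, \left(1 + k/\sigma^2\right)^{m+1}} \right]_0^{\infty} = \frac{M'!}{\left(1 + k/\sigma^2\right)^{M'+1}} \tag{40}$$

$$\approx (2M - n - 2)! \left(\frac{\sigma^2}{k}\right)^{2M-n-1}. \tag{41}$$

We make the final approximation because $k/\sigma^2 \gg 1$ at large $\bar{P}$. After substituting (41) into (39) and simplifying, we have

$$\Pi \approx \frac{(M-1)!\, \sigma^{2M}}{\Gamma(M)^2\, k^M} \sum_{n=0}^{M-1} \frac{(2M - n - 2)!}{(M - n - 1)!} \tag{42}$$

$$= \left(\frac{\sigma^2}{k}\right)^{M} G(M), \tag{43}$$

where we define $G(M) = \frac{(M-1)!}{\Gamma(M)^2} \sum_{n=0}^{M-1} \frac{(2M - n - 2)!}{(M - n - 1)!}$ for notational simplicity. Now, it remains to use (3) and (18) to put $\Pi$ in terms of $\bar{P}$:

$$\Pi \approx G(M) \left( \frac{e^{R/(1 - ML/T)} - 1}{(M - 1)\, L\, \bar{P}^2} \right)^{M}. \tag{44}$$

We conclude by calculating the diversity order

$$d = -\lim_{\bar{P} \to \infty} \frac{\log \Pi}{\log \bar{P}} = 2M. \tag{45}$$

Thus, we see that we indeed have $d = 2M$, as desired. Observe that, provided $M \le LM < T$, the diversity order is independent of the number of training symbols. Therefore, even a very large number of training symbols would be insufficient to provide diversity comparable to perfect CSIT.

D. Diversity-Multiplexing Tradeoff

Multiplexing gain, defined in (46), describes the rate at which the system's target data rate increases as SNR grows large [13]:

$$r = \lim_{\bar{P} \to \infty} \frac{R(\bar{P})}{\log \bar{P}}. \tag{46}$$

Our fourth main result is the diversity-multiplexing tradeoff of our system.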
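The diversity-order derivation above lends itself to a quick numerical sanity check. The sketch below is an illustration under assumptions, not part of the original analysis: `a` denotes the ratio of the safety margin to the estimation error variance, the function names are hypothetical, and the observation that the constant collapses to a binomial coefficient is our own (it follows from the hockey-stick identity and is verified in the accompanying checks rather than taken from the paper).

```python
import math


def G(M):
    """Constant multiplying (sigma^2 / k)^M in the high-SNR outage expression."""
    total = sum(
        math.factorial(2 * M - n - 2) // math.factorial(M - n - 1) for n in range(M)
    )
    return total // math.factorial(M - 1)  # exact: the sum is a multiple of (M-1)!


def outage_exact(M, a):
    """Prob(k*gamma < gamma_tilde) for a = k / sigma^2, before the large-a approximation."""
    total = 0.0
    for n in range(M):
        Mp = 2 * M - n - 2
        total += (a ** (M - 1 - n)) * math.factorial(Mp) / (
            math.factorial(M - 1 - n) * (1.0 + a) ** (Mp + 1)
        )
    return total / math.factorial(M - 1)


def outage_high_snr(M, a):
    """High-SNR approximation G(M) / a^M."""
    return G(M) / a ** M


def diversity_slope(M, L, T, R, P1, P2):
    """Negative log-log slope of the exact outage probability between powers P1 and P2."""
    def a_of(P):
        # a = k / sigma^2 with k = (M-1) P / (e^{R/(1-ML/T)} - 1) and sigma^2 = 1 / (L P)
        return (M - 1) * L * P ** 2 / (math.exp(R / (1.0 - M * L / T)) - 1.0)
    num = math.log(outage_exact(M, a_of(P2))) - math.log(outage_exact(M, a_of(P1)))
    return -num / (math.log(P2) - math.log(P1))


slope = diversity_slope(M=2, L=1, T=20, R=1.0, P1=1e2, P2=1e3)  # illustrative values
```

For two antennas the measured slope sits very close to 4, that is, twice the number of antennas, and it does not change if `L` is increased, which matches the remark that extra training symbols shift the curve without changing its slope.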
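Theorem 3.4 (Diversity-Multiplexing): Given $M$ receive antennas and $L$ training symbols per antenna per coherence interval, $T$, to estimate CSIT in a SIMO system with a genie-aided receiver, we have the following equation for diversity order as a function of multiplexing gain:

$$d(r) = M\left(2 - \frac{r}{1 - ML/T}\right), \qquad r \le 1 - \frac{ML}{T}. \tag{47}$$

Proof [of Theorem 3.4]: To study the tradeoff between diversity and multiplexing, we begin by performing the customary substitution and let $R = r \log \bar{P}$ [13]; then we have $e^{R/(1 - ML/T)} - 1 \approx \bar{P}^{\,r/(1 - ML/T)}$ for $r > 0$. We substitute into (18):

$$k \approx \frac{(M - 1)\bar{P}}{\bar{P}^{\,r/(1 - ML/T)}} \tag{48}$$

$$= (M - 1)\, \bar{P}^{\,1 - r/(1 - ML/T)}. \tag{49}$$

Now, we find the outage probability as a function of $r$ by substituting into (44):

$$\Pi \approx G(M) \left( \frac{\bar{P}^{\,r/(1 - ML/T) - 2}}{L(M - 1)} \right)^{M}. \tag{50}$$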

With our expression for $\Pi$, we find

$$d = -\lim_{\bar{P} \to \infty} \frac{\log \Pi}{\log \bar{P}} = M\left(2 - \frac{r}{1 - ML/T}\right). \tag{51}$$

Hence, we have a linear tradeoff with $d(r) = M\left(2 - \frac{r}{1 - ML/T}\right)$ for $r \le 1 - ML/T$, as desired. Our analysis reveals that increasing the number of antennas in a system can harm performance if it causes the training to become a significant fraction of the coherence time.

E. Evaluation

In this section, we use symbolic and numerical integration to evaluate our analytical results for outage probability and diversity-multiplexing tradeoff. Figure 1 shows the upper and lower bounds on probability of outage for $M = 2$ and $M = 4$. It also features approximations to the outage probabilities, and we observe the predicted convergence of the bounds at high SNR. In this example, we set the target rate $R = 1$ nat per symbol interval and used only $M$ transmitter training symbols, assuming $T \gg M$. Figure 2 features the diversity-multiplexing tradeoffs of three contrasting systems: perfect CSIT, transmitter training, and CSIR only. Clearly, transmitter training offers a significant improvement over CSIR alone, but it bears very little resemblance to the perfect CSIT case. In addition, note that proper accounting for training resources allows $r$ to approach 1 only asymptotically in $T$.

Fig. 1. Upper and lower bounds on outage probability along with approximate outage probability curves for systems with M receive antennas.

Fig. 2. Diversity-multiplexing tradeoff curves for SIMO systems with perfect CSIT, transmitter training, and CSIR only (diversity, d, versus multiplexing, r).

IV. CONCLUSIONS

In this work, we answered the question of whether it is beneficial to dedicate resources to transmitter training over a symmetric channel. We found that transmitter training offers significant benefits over CSIR alone.
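The comparison between transmitter training and CSIR alone can be made concrete by tabulating the two linear tradeoffs. The sketch below simply evaluates the tradeoff expressions stated in this paper; the function names, the error handling, and the example values (one training symbol per antenna and a coherence time of 20 symbols) are hypothetical choices for illustration.

```python
def r_max(M, L, T):
    """Largest multiplexing gain left after training overhead, 1 - ML/T."""
    return 1.0 - M * L / T


def d_training(r, M, L, T):
    """Diversity order with transmitter training at multiplexing gain r."""
    rm = r_max(M, L, T)
    if rm <= 0.0:
        raise ValueError("training consumes the whole coherence interval")
    if not 0.0 <= r <= rm:
        raise ValueError("multiplexing gain outside the achievable range")
    return M * (2.0 - r / rm)


def d_csir_only(r, M):
    """Diversity order of the CSIR-only baseline at multiplexing gain r."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("multiplexing gain outside the achievable range")
    return M * (1.0 - r)


# Illustrative sweep: L = 1 training symbol per antenna, coherence time T = 20.
table = [
    (M, r_max(M, 1, 20), d_training(0.0, M, 1, 20), d_csir_only(0.0, M))
    for M in (2, 4, 8)
]
```

In this sweep the zero-multiplexing diversity doubles relative to the baseline for every antenna count, while the largest achievable multiplexing gain shrinks as antennas are added, which is the cost of training that the analysis emphasizes.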
However, we also showed that the performance predicted by perfect CSIT models is unreachable in a practical system because neither large numbers of training symbols nor large numbers of antennas provide sufficient gains when we consider the cost of training resources. In the future, we plan to extend the results from this work to examine cases in which both the transmitter and the receiver derive their channel state information from training. We intend to continue the practice of thorough resource accounting to produce a fair comparison between bidirectional training systems and those that use a combination of training and feedback.

ACKNOWLEDGMENT

This work is supported in part by NSF ANI-3597 and CCR-3398, a Xilinx Summer Fellowship, and a grant from Nokia Corporation.

REFERENCES

[1] G. Caire, G. Taricco, and E. Biglieri, "Optimum power control over fading channels," IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1468-1489, July 1999.
[2] A. Goldsmith and P. Varaiya, "Capacity of fading channels with channel side information," IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1986-1992, November 1997.
[3] K. Mukkavilli, A. Sabharwal, E. Erkip, and B. Aazhang, "On beamforming with finite rate feedback in multiple-antenna systems," IEEE Transactions on Information Theory, vol. 49, no. 10, pp. 2562-2579, October 2003.
[4] D. Love, R. Heath, and T. Strohmer, "Grassmannian beamforming for multiple-input multiple-output wireless systems," IEEE Transactions on Information Theory, vol. 49, no. 10, pp. 2735-2747, October 2003.
[5] S. Bhashyam, A. Sabharwal, and B. Aazhang, "Feedback gain in multiple antenna systems," IEEE Transactions on Communications, vol. 50, no. 5, pp. 785-798, May 2002.
[6] A. Khoshnevis and A. Sabharwal, "On the asymptotic performance of multiple antenna channels with fast channel feedback," submitted to IEEE Transactions on Information Theory, September 2005.
[7] G. Caire and S. Shamai, "On the capacity of some channels with channel state information," IEEE Transactions on Information Theory, vol. 45, no. 6, pp. 2007-2019, September 1999.
[8] M. Medard, "The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel," IEEE Transactions on Information Theory, vol. 46, no. 3, pp. 933-946, May 2000.
[9] H. Cramér, Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press, 1946.
[10] T. Yoo and A. Goldsmith, "Capacity and optimal power allocation for fading MIMO channels with channel estimation error," submitted to IEEE Transactions on Information Theory, August 2004.
[11] A. Lapidoth and S. Shamai, "Fading channels: How perfect need 'perfect side information' be?" IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1118-1134, May 2002.
[12] E. Biglieri, G. Caire, and G. Taricco, "Limiting performance of block-fading channels with multiple antennas," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1273-1289, May 2001.
[13] L. Zheng and D. N. C. Tse, "Diversity and multiplexing: A fundamental tradeoff in multiple-antenna channels," IEEE Transactions on Information Theory, vol. 49, no. 5, pp. 1073-1096, May 2003.
[14] W. Beyer, Ed., Standard Mathematical Tables and Formulae, 29th ed. London: CRC Press, 1991.