Part I. Signal Processing and Detection




Contents

I Signal Processing and Detection

1 Fundamentals of Discrete Data Transmission
  1.1 Data Modulation and Demodulation
    1.1.1 Waveform Representation by Vectors
    1.1.2 Synthesis of the Modulated Waveform
    1.1.3 Vector-Space Interpretation of the Modulated Waveforms
    1.1.4 Demodulation
    1.1.5 MIMO Channel Basics
  1.2 Discrete Data Detection
    1.2.1 The Vector Channel Model
    1.2.2 Optimum Data Detection
    1.2.3 Decision Regions
    1.2.4 Irrelevant Components of the Channel Output
  1.3 The Additive White Gaussian Noise (AWGN) Channel
    1.3.1 Conversion from the Continuous AWGN to a Vector Channel
    1.3.2 Optimum Detection with the AWGN Channel
    1.3.3 Signal-to-Noise Ratio (SNR) Maximization with a Matched Filter
  1.4 Error Probability for the AWGN Channel
    1.4.1 Invariance to Rotation and Translation
    1.4.2 Union Bounding
    1.4.3 The Nearest Neighbor Union Bound
    1.4.4 Alternative Performance Measures
    1.4.5 Block Error Measures
  1.5 General Classes of Constellations and Modulation
    1.5.1 Fair Comparisons
    1.5.2 Cubic Constellations
    1.5.3 Orthogonal Constellations
    1.5.4 Circular Constellations - M-ary Phase Shift Keying
  1.6 Rectangular (and Hexagonal) Signal Constellations
    1.6.1 Pulse Amplitude Modulation (PAM)
    1.6.2 Quadrature Amplitude Modulation (QAM)
    1.6.3 Constellation Performance Measures
    1.6.4 Hexagonal Signal Constellations in 2 Dimensions
  1.7 Additive Self-Correlated Noise
    1.7.1 The Filtered (One-Shot) AWGN Channel
    1.7.2 Optimum Detection in the Presence of Self-Correlated Noise
    1.7.3 The Vector Self-Correlated Gaussian Noise Channel
    1.7.4 Performance of Suboptimal Detection with Self-Correlated Noise
  Chapter 1 Exercises
A Gram-Schmidt Orthonormalization Procedure
B The Q Function

Chapter 1

Fundamentals of Discrete Data Transmission

Figure 1.1 illustrates discrete data transmission, which is the transmission of one message from a finite set of messages through a communication channel. A message sender at the transmitter communicates with a message receiver. The sender selects one message from the finite set, and the transmitter sends a corresponding signal (or "waveform") that represents this message through the communication channel. The receiver decides the message sent by observing the channel output. Successive transmission of discrete data messages is known as digital communication. Based on the noisy received signal at the channel output, the receiver uses a procedure known as detection to decide which message, or sequence of messages, was sent. Optimum detection minimizes the probability of an erroneous receiver decision on which message was transmitted.

[Figure 1.1: Discrete data transmission.]

This chapter characterizes and analyzes optimum detection for a single message transmission through the channel. Dependencies between message transmissions can also be important, but the study of such inter-message dependency is deferred to later chapters. Such single-message analysis is often called one-shot analysis.

The messages are usually digital sequences of bits, which are usually not compatible with transmission of physical analog signals through a communication channel. Thus the messages are converted into analog signals that can be sent through the channel. Section 1.1 introduces both encoding and modulation to characterize such conversion of messages into analog signals by a transmitter. Encoding is the process of converting the messages from their innate form (typically bits) into vectors of real numbers that represent the messages. Modulation is a procedure for converting the encoder-output real-number vectors into analog signals for transmission through a physical channel. The last subsection of Section 1.1 introduces the increasingly relevant situation of vectored-basis-function transmission systems, particularly multiple-input multiple-output (MIMO) transmission that may coordinate the use of multiple transmission antennas or wires. This generalization of the basic modulation theory is tacitly carried thereafter in Chapter 1 and revisited more explicitly in later chapters.

Section 1.2 studies the theory of optimal detection, which depends on a probabilistic model for the communication channel. The channel distorts the transmitted signals both deterministically and with random noise. The noisy channel output will usually not equal the channel input and will be described only in terms of conditional probabilities of various channel-output signals. The channel-input signals have probabilities equal to the probabilities of the messages that they represent. The optimum detector will depend only on the probabilistic model for the channel and the probability distribution of the messages at the channel input. The general optimum detector specializes in many important practical cases of interest.

This chapter develops a theory of modulation and detection that uses a discrete vector representation for any set of continuous-time signals. This "vector-channel" approach was pioneered for educational purposes by Wozencraft and Jacobs in their classic text (Chapter 4). In fact, the first four sections of this chapter closely parallel their development (with some updating and rearrangement), before diverging in Sections 1.5 to 1.7 and in the remainder of this text.

The general model for modulation and demodulation leads to a discussion of the relationship between continuous signals and their vector-channel representation, essentially allowing easier analysis of vectors to replace the more difficult analysis of continuous signals. Section 1.2 solves the general detection problem for the discrete vector channel. Section 1.3 shows that the most common case of a continuous Gaussian-noise channel maps easily into the discrete vector model without loss of generality. Section 1.3 then finds the corresponding optimum detector with Gaussian noise. Given the optimum detector, Section 1.4 shows methods to calculate and estimate the average probability of error, $P_e$, for a vector channel with Additive White Gaussian Noise (AWGN). Sections 1.5 and 1.6 discuss several popular modulation schemes and determine bounds for their probability of error with AWGN. Section 1.6 focuses in particular on signals derived from rectangular lattices, a popular signal transmission format. Section 1.7 then generalizes results for the case of self-correlated Gaussian noise.

[Figure 1.2: Discrete data transmission with greater detail.]

1.1 Data Modulation and Demodulation

Figure 1.2 adds more detail to the basic discrete data transmission system of Figure 1.1. The messages emanate from a message source. A vector encoder converts each message into a symbol, which is a real vector $\mathbf{x}$ that represents the message. Each possible message corresponds to a distinct value of the symbol vector $\mathbf{x}$. The words "symbol" and "message" are often used interchangeably, with the tacit understanding that the symbol actually represents the message via the action of the encoder.

A message from the set of $M$ possible messages $m_i$, $i = 0, \ldots, M-1$, is sent every $T$ seconds, where $T$ is the symbol period for the discrete data transmission system. Thus, messages are sent at the symbol rate of $1/T$ messages per second. The number of messages that can be sent is often measured in bits, so that $b = \log_2(M)$ bits are sent every symbol period. Thus, the data rate is $R = b/T$ bits per second. The message is often considered to be a real integer equal to the index $i$, in which case the message is abbreviated $m$ with possible values $0, \ldots, M-1$.

The modulator converts the symbol vector $\mathbf{x}$ that represents the selected message into a continuous-time (analog) waveform that the transmitter outputs into the channel. There is a set of $M$ possible signal waveforms $\{x_i(t)\}$ that is in direct one-to-one correspondence with the set of $M$ messages. The demodulator converts continuous-time channel output signals back into a channel output vector $\mathbf{y}$, from which the detector tries to estimate $\mathbf{x}$ and thus also the message sent. The receiver then provides the messages to the message sink.

In any data transmission system, the physically transmitted signals are necessarily analog and continuous time. In general, the conversion of a discrete data signal into a continuous-time analog signal is called modulation. The inverse process of converting the modulated signal back into its original discrete form is called demodulation. Thus, the combination of encoding and modulation in the transmitter leads to the mapping:

discrete message $m_i$ $\rightarrow$ $x_i(t)$ continuous waveform.

Conversely, the combination of demodulation and detection in the receiver leads to the mapping:

continuous waveform $y(t)$ $\rightarrow$ $\hat{m}$ discrete message.

When the receiver output message is not equal to the transmitter input message, an error occurs. An optimum receiver minimizes the probability of such errors for a given communications channel and set of message waveforms.
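The relations between the number of messages, bits per symbol, and data rate can be sketched numerically. The function names and the sample values below are illustrative assumptions, not part of the original text.

```python
import math


def bits_per_symbol(num_messages: int) -> float:
    """Return b = log2(M), the bits conveyed by one of M possible messages."""
    return math.log2(num_messages)


def data_rate(num_messages: int, symbol_period: float) -> float:
    """Return R = b / T in bits per second for symbol period T (seconds)."""
    return bits_per_symbol(num_messages) / symbol_period


if __name__ == "__main__":
    # Binary messages sent once per second (illustrative values).
    print(data_rate(2, 1.0))
    # Four possible messages sent one million times per second.
    print(data_rate(4, 1e-6))
```

Doubling the number of messages adds only one bit per symbol, while halving the symbol period doubles the data rate.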


[Figure 1.3: Example of channel for which 1 volt and 0 volt binary transmission is inappropriate.]

EXAMPLE 1.1.1 (binary phase-shift keying) Figure 1.3 repeats Figure 1.1 with a specific linear time-invariant channel that has the Fourier transform indicated. This channel essentially passes signals between 100 Hz and 200 Hz, with 150 Hz having the largest gain. Binary logic familiar to most electrical engineers transmits some positive voltage level (say perhaps 1 volt) for a 1 and another voltage level (say 0 volts) for a 0 inside integrated circuits. Clearly such a constant 1/0 transmission would not pass through this DC-blocking channel, leaving 0 always at the output and making a receiver detection of the correct message difficult if not impossible. Instead the two modulated signals $x_0(t) = +\cos(2\pi t)$ and $x_1(t) = -\cos(2\pi t)$ will easily pass through this channel and be readily distinguishable at the channel output. This latter type of transmission is known as BPSK, for binary phase-shift keying. If the symbol period is 1 second and if successive transmission is used, the data rate would be 1 bit per second (1 bps). In more detail, the engineer could recognize the trivial vector encoder that converts the message bit of 0 or 1 into the real one-dimensional vectors $x_0 = +1$ and $x_1 = -1$. The modulator simply multiplies this $x_i$ value by the function $\cos(2\pi t)$.

A variety of modulation methods are applied in digital communication systems. To develop a separate analysis for each of these formats would be an enormous task. Instead, this text uses a general vector representation for modulated signals. This vector representation leads to a single method for the analysis of the performance of the data transmission (or storage) system.
This section describes the discrete vector representation of any finite or countably infinite set of continuous-time signals and the conversion between the vectors and the signals. The analysis of the detection process will simplify for an additive white Gaussian noise (AWGN) channel through the symbol-vector approach, which was pioneered by Wozencraft and Jacobs. This approach, indicated in Figure 1.2 by the real-valued vector symbols $\mathbf{x}_i$ and $\mathbf{y}$, decouples the probability-of-error analysis from the specific modulation method. Each modulation method uses a set of basis functions that link the vector $\mathbf{x}_i$ with the continuous waveform $x_i(t)$. The choice of modulation basis functions usually depends upon their spectral properties. This chapter investigates and enumerates a number of different basis functions in later sections. However, this chapter is mainly concerned with a single transmission. Successive transmissions could each be treated independently by ignoring transients at the beginning or end of any message transmission, because such transients would be negligible in time extent on such a channel.
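The encoder and modulator of the binary phase-shift keying example above can be sketched as follows. This is a minimal illustration that assumes a 1-second symbol period and a carrier of the form cos(2*pi*f*t); the function names and the `carrier_hz` parameter are hypothetical.

```python
import math


def encode(bit: int) -> float:
    """Trivial vector encoder: bit 0 -> +1.0, bit 1 -> -1.0."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return 1.0 if bit == 0 else -1.0


def modulate(symbol: float, t: float, carrier_hz: float = 1.0) -> float:
    """Scale the carrier cos(2*pi*carrier_hz*t) by the one-dimensional symbol."""
    return symbol * math.cos(2.0 * math.pi * carrier_hz * t)


def dc_level(bit: int, samples: int = 1000) -> float:
    """Average the modulated waveform over one 1-second symbol period."""
    x = encode(bit)
    return sum(modulate(x, k / samples) for k in range(samples)) / samples


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, encode(bit), dc_level(bit))
```

Both waveforms average to (numerically) zero over a symbol period, so neither relies on the DC component that the channel blocks, and the two waveforms are negatives of each other at every instant.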

[Figure 1.4: Vector space.]

1.1.1 Waveform Representation by Vectors

The reader should be familiar with the infinite-series decomposition of continuous-time signals from the basic electrical-engineering study of Fourier series in signals and systems. For the transmission and detection of a message during a finite time interval, this text considers the set of real-valued functions $\{f(t)\}$ such that $\int_0^T f^2(t)\,dt < \infty$ (technically known as the Hilbert space of continuous-time functions and abbreviated as $L_2[0,T]$). This infinite-dimensional vector space has an inner product, which permits the measure of distances and angles between two different functions $f(t)$ and $g(t)$,

$$\langle f(t), g(t) \rangle = \int_0^T f(t)\, g(t)\, dt.$$

Any "well-behaved" continuous-time function $x(t)$ defined on the interval $[0,T]$ decomposes according to some set of $N$ orthonormal basis functions $\{\varphi_i(t)\}$ as

$$x(t) = \sum_{n=1}^{N} x_n \varphi_n(t),$$

where the $\varphi_n(t)$ satisfy $\langle \varphi_n(t), \varphi_m(t) \rangle = 1$ for $n = m$ and $0$ otherwise. The continuous function $x(t)$ describes the continuous-time waveform that carries the information through the communication channel. The number of basis functions that represent all the waveforms $\{x_i(t)\}$ for a particular communication system may be infinite, i.e., $N$ may equal $\infty$. Using the set of basis functions, the function $x(t)$ maps to a set of $N$ real numbers $\{x_i\}$; these real-valued scalar coefficients assemble into an $N$-dimensional real-valued vector

$$\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}.$$

Thus, the function $x(t)$ corresponds to an $N$-dimensional point $\mathbf{x}$ in a vector space with axes defined by $\{\varphi_i(t)\}$, as illustrated for a three-dimensional point in Figure 1.4.

Similarly, a set of continuous-time functions $\{x_i(t)\}$ corresponds to a set of discrete $N$-dimensional points $\{\mathbf{x}_i\}$ known as a signal constellation. Such a geometric viewpoint advantageously enables the visualization of the distance between continuous-time functions using distances between the associated signal points in $\mathbb{R}^N$, the space of $N$-dimensional real vectors. In fact, later developments show

$$\langle x_1(t), x_2(t) \rangle = \langle \mathbf{x}_1, \mathbf{x}_2 \rangle,$$

where the right-hand side is taken as the usual Euclidean inner product in $\mathbb{R}^N$ (discussed later in Definition 1.1.6). This decomposition of continuous-time functions extends to random processes using what is known as a Karhunen-Loeve expansion. The basis functions also extend for all time, i.e., on the infinite time interval $(-\infty, \infty)$, in which case the inner product becomes $\langle f(t), g(t) \rangle = \int_{-\infty}^{\infty} f(t)\, g(t)\, dt$.

Decomposition of random processes is fundamental to demodulation and detection in the presence of noise. Modulation constructively assembles random signals for the communication system from a set of basis functions $\{\varphi_n(t)\}$ and a set of signal points $\{\mathbf{x}_i\}$. The chosen basis functions and signal points typically satisfy physical constraints of the system and determine performance in the presence of noise.

1.1.2 Synthesis of the Modulated Waveform

The description of modulation begins with the definition of a data symbol:

Definition 1.1.1 (Data Symbol) A data symbol is defined as any $N$-dimensional real vector

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}. \tag{1.1}$$

The data symbol is in lower-case boldface, indicating a vector, to distinguish it from its components, shown in lowercase Roman to indicate scalars. Unless specified otherwise, all quantities shall be real-valued in this chapter. Extensions of the definitions to complex-valued quantities occur in succeeding chapters as necessary. The synthesis of a modulated waveform uses a set of orthonormal basis functions.

Definition 1.1.2 (Orthonormal Basis Functions) A set of $N$ functions $\{\varphi_n(t)\}$ constitute an $N$-dimensional orthonormal basis if they satisfy the following property:

$$\int_{-\infty}^{\infty} \varphi_m(t)\, \varphi_n(t)\, dt = \delta_{mn} = \begin{cases} 1 & m = n \\ 0 & m \neq n. \end{cases} \tag{1.2}$$

The discrete-time function $\delta_{mn}$ will be called the discrete delta function.
($\delta_{mn}$ is also called a Kronecker delta.)

The construction of a modulated waveform $x(t)$ appears in Figure 1.5:

Definition 1.1.3 (Modulated Waveform) A modulated waveform, corresponding to the data symbol $\mathbf{x}$, for the orthonormal basis $\varphi_n(t)$ is defined as

$$x(t) = \sum_{n=1}^{N} x_n \varphi_n(t). \tag{1.3}$$

Thus, the modulated signal $x(t)$ is formed by multiplying each of the components of the vector $\mathbf{x}$ by the corresponding basis function and summing the continuous-time waveforms, as shown in Figure 1.5. There are many possible choices for the basis functions $\varphi_n(t)$, and correspondingly many possible modulated waveforms $x(t)$ for the same vector $\mathbf{x}$. The specific choice of basis functions used in a communication system depends on physical limitations of the system.

In practice, a modulator can construct a modulated waveform from any set of data symbols, leading to the concept of a signal constellation.
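The synthesis rule, multiplying each symbol component by its basis function and summing, can be sketched numerically. The cosine/sine basis pair, the symbol period, and the midpoint-rule integration below are assumptions chosen for illustration; any orthonormal set would serve.

```python
import math

T = 1.0        # symbol period in seconds (assumed value)
STEPS = 2000   # midpoint-rule resolution (assumed value)


def phi1(t: float) -> float:
    return math.sqrt(2.0 / T) * math.cos(2.0 * math.pi * t / T)


def phi2(t: float) -> float:
    return math.sqrt(2.0 / T) * math.sin(2.0 * math.pi * t / T)


BASIS = (phi1, phi2)


def inner(f, g) -> float:
    """Midpoint-rule approximation of the integral of f(t) g(t) over [0, T]."""
    dt = T / STEPS
    return sum(f((k + 0.5) * dt) * g((k + 0.5) * dt) for k in range(STEPS)) * dt


def modulate(x):
    """Return the waveform x(t) = sum_n x_n * phi_n(t) as a function of t."""
    return lambda t: sum(x_n * phi(t) for x_n, phi in zip(x, BASIS))


# Gram matrix of the basis: close to the identity when the basis is orthonormal.
gram = [[inner(f, g) for g in BASIS] for f in BASIS]

# Waveform for the (arbitrary) data symbol x = [3, -1].
x_t = modulate([3.0, -1.0])

if __name__ == "__main__":
    print(gram)
    print(x_t(0.0), x_t(T / 4))
```

The Gram matrix check is the numerical counterpart of the orthonormality property: diagonal entries near 1 and off-diagonal entries near 0.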

[Figure 1.5: Modulator.]

Definition 1.1.4 (Signal Constellation) A signal constellation is a set of $M$ vectors, $\{\mathbf{x}_i\}$, $i = 0, \ldots, M-1$. The corresponding set of modulated waveforms $\{x_i(t)\}$, $i = 0, \ldots, M-1$, is a signal set.

Each distinct point in the signal constellation corresponds to a different modulated waveform, but all the waveforms share the same set of basis functions. The component of the $i$-th vector $\mathbf{x}_i$ along the $n$-th basis function $\varphi_n(t)$ is denoted $x_{in}$. The occurrence of a particular data symbol in the constellation determines the probability of the $i$-th vector (and thus of the $i$-th waveform), $p_{\mathbf{x}}(i)$.

The power available in any physical communication system limits the average amount of energy required to transmit each successive data symbol. Thus, an important concept for a signal constellation (set) is its average energy:

Definition 1.1.5 (Average Energy) The average energy of a signal constellation is defined by

$$\mathcal{E}_{\mathbf{x}} = E\left[ \|\mathbf{x}\|^2 \right] = \sum_{i=0}^{M-1} \|\mathbf{x}_i\|^2 \, p_{\mathbf{x}}(i), \tag{1.4}$$

where $\|\mathbf{x}_i\|^2$ is the squared length of the vector $\mathbf{x}_i$, $\|\mathbf{x}_i\|^2 = \sum_{n=1}^{N} x_{in}^2$, and $E$ denotes expected or mean value. (This definition assumes there are only $M$ possible waveforms and $\sum_{i=0}^{M-1} p_{\mathbf{x}}(i) = 1$.)

The average energy is also closely related to the concept of average power, which is

$$P_{\mathbf{x}} = \frac{\mathcal{E}_{\mathbf{x}}}{T}, \tag{1.5}$$

corresponding to the amount of energy per symbol period.

The minimization of $\mathcal{E}_{\mathbf{x}}$ places signal-constellation points near the origin; however, the distance between points shall relate to the probability of correctly detecting the symbols in the presence of noise. The geometric problem of optimally arranging points in a vector space with minimum average energy while maintaining at least a minimum distance between each pair of points is the well-studied sphere-packing problem. (This geometric viewpoint of communication appeared first in Shannon's seminal 1948 work, A Mathematical Theory of Communication, Bell System Technical Journal.)
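The average-energy and average-power definitions translate directly into a short computation. The two constellations below (a two-point, two-dimensional set and a four-level one-dimensional set with levels -3, -1, 1, 3) and the equal-probability default are assumptions chosen for illustration.

```python
def average_energy(constellation, probabilities=None) -> float:
    """Return sum_i ||x_i||^2 * p(i); points are equally likely by default."""
    m = len(constellation)
    if probabilities is None:
        probabilities = [1.0 / m] * m
    if abs(sum(probabilities) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return sum(
        p * sum(component * component for component in point)
        for point, p in zip(constellation, probabilities)
    )


def average_power(constellation, symbol_period: float, probabilities=None) -> float:
    """Return the average energy divided by the symbol period T."""
    return average_energy(constellation, probabilities) / symbol_period


# Hypothetical example constellations.
two_point = [(1.0, -1.0), (-1.0, 1.0)]
four_level = [(-3.0,), (-1.0,), (1.0,), (3.0,)]

if __name__ == "__main__":
    print(average_energy(two_point))
    print(average_energy(four_level))
    print(average_power(four_level, symbol_period=0.5))
```

Passing a non-uniform `probabilities` list weights each point's squared length by how often it is sent, as in Equation (1.4).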

[Figure 1.6: BPSK basis functions and waveforms.]

The following example illustrates the utility of the basis-function concept:

EXAMPLE 1.1.2 A commonly used and previously discussed transmission method is Binary Phase-Shift Keying (BPSK), used in some satellite and deep-space transmissions as well as a number of simple transmission systems. A more general form of the basis functions, which are parameterized by the variable $T$, is

$$\varphi_1(t) = \sqrt{\frac{2}{T}} \cos\left[ \frac{2\pi t}{T} + \frac{\pi}{4} \right] \quad \text{and} \quad \varphi_2(t) = \sqrt{\frac{2}{T}} \cos\left[ \frac{2\pi t}{T} - \frac{\pi}{4} \right]$$

for $0 \le t \le T$ and $0$ elsewhere. These two basis functions ($N = 2$), $\varphi_1(t)$ and $\varphi_2(t)$, are shown in Figure 1.6. The two basis functions are orthogonal to each other and both have unit energy, thus satisfying the orthonormality condition. The two possible modulated waveforms transmitted during the interval $[0,T]$ also appear in Figure 1.6, where $x_0(t) = \varphi_1(t) - \varphi_2(t)$ and $x_1(t) = \varphi_2(t) - \varphi_1(t)$. Thus, the data symbols associated with the continuous waveforms are $\mathbf{x}_0 = [1 \;\; -1]'$ and $\mathbf{x}_1 = [-1 \;\; 1]'$ (a prime denotes transpose). The signal constellation appears in Figure 1.7. The resulting waveforms are $x_0(t) = -\frac{2}{\sqrt{T}} \sin\left( \frac{2\pi t}{T} \right)$ and $x_1(t) = \frac{2}{\sqrt{T}} \sin\left( \frac{2\pi t}{T} \right)$. This type of modulation is called binary phase-shift keying because the two waveforms are shifted in phase from each other. Since only two possible waveforms are transmitted during each $T$-second time interval, the information rate is $\log_2(2) = 1$ bit per $T$ seconds. Thus, to transmit at 1 Mbps, $T$ must equal 1 µs. (Additional scaling may be used to adjust the BPSK transmit power/energy level to some desired value, but this simply scales all possible constellation points and transmit signals by the same constant value.)

Another set of basis functions is known as "FM code" (FM is "Frequency Modulation") in the storage industry and also as "Manchester Encoding" in data communications. This method is used to write (modulate) in many commercial disk storage products.
It is also used in a quite different area, Ethernet, in what is called 10Base-T Ethernet (the lowest Ethernet speed commonly used in local area networks for the internet). The basis functions are approximated in Figure 1.8; in practice, the sharp edges are somewhat smoother depending on the specific implementation. The two basis functions again satisfy the orthonormality condition. The data rate equals one bit per $T$ seconds; for a data transfer rate into the disk of 1 GByte/s or 8 Gbps, $T = 1/(8\ \text{GHz}) = 125$ ps; by contrast, at the much lower data rate of 10 Mbps in Ethernet, $T = 100$ ns. However, both modulation methods have the same signal constellation. Thus, for the FM/Manchester example, only two signal-constellation points are used, $\mathbf{x}_0 = [1 \;\; -1]'$ and $\mathbf{x}_1 = [-1 \;\; 1]'$, as shown in Figure 1.7, although the basis functions differ from the previous example. The resulting modulated waveforms appear in Figure 1.8 and correspond to the write currents that are applied to the head in the storage system. (Additional scaling may be used to adjust either the FM or Ethernet transmit power/energy level to some desired value, but this simply scales all possible constellation points and transmit signals by the same constant value.)

[Figure 1.7: BPSK and FM/Manchester signal constellation.]

[Figure 1.8: Manchester/FM ("Ethernet") basis functions and waveforms.]

The common vector-space representation (i.e., signal constellation) of the Ethernet and BPSK examples allows the performance of a detector to be analyzed for either system in the same way, despite the gross differences in the overall systems.

In either of the systems in Example 1.1.2, a more compact representation of the signals with only one basis function is possible. (As an exercise, the reader should conjecture what this basis function could be and what the associated signal constellation would be.) Appendix A considers the construction of a minimal set of basis functions for a given set of modulated waveforms.

Two more examples briefly illustrate vector components $x_n$ that are not necessarily binary-valued.

EXAMPLE 1.1.3 (Short-Haul non-coherent Fiber Ethernet 802.3bm - 2B1Q) (see footnote 3) This transmission system uses $M = 4$ waveforms while the number of basis functions is $N = 1$. Thus, the system transmits 2 bits of information per $T$ seconds of channel use. The basis function is roughly approximated (see footnote 4) by $\varphi_1(t) = \sqrt{\frac{1}{T}}\, \text{sinc}\left( \frac{t}{T} \right)$, where $1/T = 53.125$ GHz and $\text{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$. This basis function is not time limited to the interval $[0,T]$.
The associated signal constellation appears in Figure 1.9. Longer-distance fiber transmission (up to km) may occur at 1/2 the symbol rate (26.5625 GHz), and so at roughly 50 Gbps, in other related IEEE 802.3 Ethernet standards. 2 bits are transmitted using one 4-level (or "quaternary") symbol every $T$ seconds, hence the name 2B1Q.

[Figure 1.9: 2B1Q signal constellation.]

By contrast, telephone companies also often transmit the much lower data rate 1.544 Mbps T1 service symmetrically on twisted pairs between the fiber termination point (often called an Optical Line Terminal) and a small business (such a signal often carries twenty-four 64 kbps digital voice signals plus overhead signaling information of 8 kbps). A method known as HDSL (High-bit-rate Digital Subscriber Lines) uses 2B1Q with $1/T = 392$ kHz, and thus transmits a data rate of 784 kbps on each of two phone lines for a total of 1.568 Mbps (1.544 Mbps plus 24 kbps of additional HDSL management overhead). The range of this system is about 2 miles of twisted pair, making it more cost-effective and quicker to use the existing twisted pairs than to attempt to drill through walls, dig up streets, etc., to get digital service to the small businesses. A more recent version of this will use various levels of PAM to get 1.544 Mbps, and higher symmetric speeds, on a single twisted pair at different lengths that may be shorter than 2 miles. This is known as Symmetric HDSL or just SDSL. The two very different transmission systems use the same constellation and can be analyzed identically.

Footnote 3: IEEE 802.3bm is a standard that contains specifications for short-length non-coherent transmission at (roughly) 100 Gbps on each of up to 8-16 parallel channels (8 wavelengths with each having two polarizations) on up to roughly 500 m of fiber. IEEE 802.3 standards also use other constellations for alternatives on longer lengths of fiber.

Footnote 4: Actually $\sqrt{1/T}\, \text{sinc}(t/T)$, or some other "Nyquist" pulse shape, is used; see Chapter 3 on Intersymbol Interference.

EXAMPLE 1.1.4 (V.32 - 32CR) (see footnote 5) Consider a signal set with 32 waveforms ($M = 32$) and with 2 basis functions ($N = 2$) for transmission of 32 signals per channel use. The ITU 32CR-compatible 9600 bps voiceband modems use basis functions that are equivalent to $\varphi_1(t) = \sqrt{\frac{2}{T}} \cos\frac{\pi t}{T}$ and $\varphi_2(t) = \sqrt{\frac{2}{T}} \sin\frac{\pi t}{T}$ for $0 \le t \le T$ and $0$ elsewhere. A raw bit rate of 12.0 kbps (see footnote 6) is achieved with a symbol rate of $1/T = 2400$ Hz. The signal constellation is shown in Figure 1.10; the 32 points are arranged in a rotated cross pattern, called 32CR or 32 cross. 5 bits are transformed into 1 of 32 possible 2-dimensional symbols, hence the extension in the name V.32.

Footnote 5: The ITU has published a set of modem standards numbered V.XX. These are older standards, but are still used on hundreds of millions of phone lines globally, particularly in developing countries where higher-speed xDSL services are not yet available. Essentially these voiceband modems use the analog end-to-end plain-old-telephone-service (POTS) connection for digital transmission. It is interesting to note that these standards, now almost a half-century old, can be analyzed in the same way as very modern standards at much higher speeds on wired connections like cable modems.

Footnote 6: The actual user information rate is usually 9600 bps, with the extra bits used for error-correction purposes as shown in Chapter 8.

The last two examples also emphasize another tacit advantage of the vector representation, namely that the details of the rates and carrier frequencies in the modulation format are implicit in the normalization of the basis functions, and they do not appear in the description of the signal constellation.

1.1.3 Vector-Space Interpretation of the Modulated Waveforms

A concept that arises frequently in transmission analysis is the inner product of two time functions and/or of two $N$-dimensional vectors:

Definition 1.1.6 (Inner Product) The inner product of two (real) functions of time $u(t)$ and $v(t)$ is defined by

$$\langle u(t), v(t) \rangle = \int_{-\infty}^{\infty} u(t)\, v(t)\, dt. \tag{1.6}$$

The inner product of two (real) vectors $\mathbf{u}$ and $\mathbf{v}$ is defined by

$$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^* \mathbf{v} = \sum_{n=1}^{N} u_n v_n, \tag{1.7}$$

where $*$ denotes vector transpose (and conjugate vector transpose in Chapter 2 and beyond).
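Both inner products in the definition above can be evaluated numerically for waveforms built from the same coefficient vectors. The sketch below is illustrative only: the two basis sets (a cosine/sine pair and a pair of half-period pulses), the coefficient vectors, and the midpoint-rule integration over [0, T] are assumptions, and the waveforms are taken to be zero outside [0, T].

```python
import math

T = 1.0        # symbol period in seconds (assumed value)
STEPS = 2000   # midpoint-rule resolution (assumed; even, so T/2 is a cell edge)


def sinusoid_basis():
    a = math.sqrt(2.0 / T)
    return [
        lambda t: a * math.cos(2.0 * math.pi * t / T),
        lambda t: a * math.sin(2.0 * math.pi * t / T),
    ]


def half_pulse_basis():
    a = math.sqrt(2.0 / T)
    return [
        lambda t: a if t < T / 2 else 0.0,   # pulse on the first half period
        lambda t: a if t >= T / 2 else 0.0,  # pulse on the second half period
    ]


def waveform(coefficients, basis):
    """Return u(t) = sum_n u_n * phi_n(t) for the given basis."""
    return lambda t: sum(c * phi(t) for c, phi in zip(coefficients, basis))


def inner_functions(u_t, v_t) -> float:
    """Midpoint-rule approximation of the integral of u(t) v(t) over [0, T]."""
    dt = T / STEPS
    return sum(u_t((k + 0.5) * dt) * v_t((k + 0.5) * dt) for k in range(STEPS)) * dt


def inner_vectors(u, v) -> float:
    """Euclidean inner product sum_n u_n * v_n."""
    return sum(a * b for a, b in zip(u, v))


u, v = [2.0, 1.0], [1.0, 2.0]
from_vectors = inner_vectors(u, v)
from_waveforms = [
    inner_functions(waveform(u, basis), waveform(v, basis))
    for basis in (sinusoid_basis(), half_pulse_basis())
]

if __name__ == "__main__":
    print(from_vectors, from_waveforms)
```

The waveform inner product matches the vector inner product for both basis sets, even though the waveforms themselves look nothing alike.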

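[Figure 1.10: 32 Cross signal constellation.]

The two inner products in the above definition are equal under the conditions in the following theorem:

Theorem 1.1.1 (Invariance of the Inner Product) If there exists a set of basis functions $\varphi_n(t)$, $n = 1, \ldots, N$, for some $N$ such that $u(t) = \sum_{n=1}^{N} u_n \varphi_n(t)$ and $v(t) = \sum_{n=1}^{N} v_n \varphi_n(t)$, then

$$\langle u(t), v(t) \rangle = \langle \mathbf{u}, \mathbf{v} \rangle, \tag{1.8}$$

where

$$\mathbf{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix} \quad \text{and} \quad \mathbf{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_N \end{bmatrix}. \tag{1.9}$$

The proof follows from

$$\langle u(t), v(t) \rangle = \int_{-\infty}^{\infty} u(t)\, v(t)\, dt = \int_{-\infty}^{\infty} \sum_{n=1}^{N} \sum_{m=1}^{N} u_n v_m \varphi_n(t) \varphi_m(t)\, dt \tag{1.10}$$

$$= \sum_{n=1}^{N} \sum_{m=1}^{N} u_n v_m \int_{-\infty}^{\infty} \varphi_n(t) \varphi_m(t)\, dt = \sum_{m=1}^{N} \sum_{n=1}^{N} u_n v_m \delta_{nm} = \sum_{n=1}^{N} u_n v_n \tag{1.11}$$

$$= \langle \mathbf{u}, \mathbf{v} \rangle. \quad \text{QED.} \tag{1.12}$$

Thus the inner product is invariant to the choice of basis functions and only depends on the components of the time functions along each of the basis functions. While the inner product is invariant to the choice of basis functions, the component values of the data symbols depend on the basis functions. For example, for the V.32 example, one could recognize that the integral

$$\frac{2}{T} \int_0^T \left[ 2\cos\left( \frac{\pi t}{T} \right) + \sin\left( \frac{\pi t}{T} \right) \right] \left[ \cos\left( \frac{\pi t}{T} \right) + 2\sin\left( \frac{\pi t}{T} \right) \right] dt = 2 + 2 = 4.$$

Parseval's Identity is a special case (with $\mathbf{x} = \mathbf{u} = \mathbf{v}$) of the invariance of the inner product.

Theorem 1.1.2 (Parseval's Identity) The following relation holds true for any modulated waveform:

$$\mathcal{E}_{\mathbf{x}} = E\left[ \|\mathbf{x}\|^2 \right] = E\left[ \int_{-\infty}^{\infty} x^2(t)\, dt \right]. \tag{1.13}$$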

The proof follows from the previous Theorem 1.1.1 with $\mathbf{u} = \mathbf{v} = \mathbf{x}$:

$$E\left[ \langle u(t), v(t) \rangle \right] = E\left[ \langle \mathbf{x}, \mathbf{x} \rangle \right] \tag{1.14}$$

$$= E\left[ \sum_{n=1}^{N} x_n x_n \right] \tag{1.15}$$

$$= E\left[ \|\mathbf{x}\|^2 \right] \tag{1.16}$$

$$= \mathcal{E}_{\mathbf{x}}. \quad \text{QED.} \tag{1.17}$$

Parseval's Identity implies that the average energy of a signal constellation is invariant to the choice of basis functions, as long as they satisfy the orthonormality condition of Equation (1.2). As another V.32 example, one could recognize that the energy of the $[2, 1]$ point is

$$\frac{2}{T} \int_0^T \left[ 2\cos\left( \frac{\pi t}{T} \right) + \sin\left( \frac{\pi t}{T} \right) \right]^2 dt = 2^2 + 1^2 = 5.$$

The individual basis functions themselves have a trivial vector representation; namely, $\varphi_n(t)$ is represented by $\boldsymbol{\varphi}_n = [0 \; 0 \; \ldots \; 1 \; \ldots \; 0]^*$, where the 1 occurs in the $n$-th position. Thus, the data symbol $\mathbf{x}_i$ has a representation in terms of the unit basis vectors $\boldsymbol{\varphi}_n$ that is

$$\mathbf{x}_i = \sum_{n=1}^{N} x_{in} \boldsymbol{\varphi}_n. \tag{1.18}$$

The data-symbol component $x_{in}$ can be determined as

$$x_{in} = \langle \mathbf{x}_i, \boldsymbol{\varphi}_n \rangle, \tag{1.19}$$

which, using the invariance of the inner product, becomes

$$x_{in} = \langle x_i(t), \varphi_n(t) \rangle = \int_{-\infty}^{\infty} x_i(t)\, \varphi_n(t)\, dt, \quad n = 1, \ldots, N. \tag{1.20}$$

Thus any set of modulated waveforms $\{x_i(t)\}$ can be interpreted as a vector signal constellation, with the components of any particular vector $\mathbf{x}_i$ given by Equation (1.20). In effect, $x_{in}$ is the projection of the $i$-th modulated waveform on the $n$-th basis function. The Gram-Schmidt procedure can be used to determine the minimum number of basis functions needed to represent any signal in the signal set, as discussed in Appendix A of this chapter.

1.1.4 Demodulation

As in (1.20), the data symbol vector $\mathbf{x}$ can be recovered, component by component, by computing the inner product of $x(t)$ with each of the $N$ basis functions. This recovery is called correlative demodulation because the modulated signal, $x(t)$, is correlated with each of the basis functions to determine $\mathbf{x}$, as is illustrated in Figure 1.11. The modulated signal, $x(t)$, is first multiplied by each of the basis functions in parallel, and the outputs of the multipliers are then passed into a bank of $N$ integrators to produce the components of the data symbol vector $\mathbf{x}$.
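Correlative demodulation can be sketched numerically as a bank of multiply-and-integrate operations. The sketch below assumes a noiseless waveform, a pair of phase-offset cosine basis functions of the kind used in the BPSK example, and midpoint-rule integration; the function names are hypothetical.

```python
import math

T = 1.0        # symbol period in seconds (assumed value)
STEPS = 2000   # midpoint-rule resolution (assumed value)


def phi1(t: float) -> float:
    return math.sqrt(2.0 / T) * math.cos(2.0 * math.pi * t / T + math.pi / 4.0)


def phi2(t: float) -> float:
    return math.sqrt(2.0 / T) * math.cos(2.0 * math.pi * t / T - math.pi / 4.0)


BASIS = (phi1, phi2)


def correlate(x_t, phi) -> float:
    """Multiply x(t) by phi(t) and integrate over [0, T] (midpoint rule)."""
    dt = T / STEPS
    return sum(x_t((k + 0.5) * dt) * phi((k + 0.5) * dt) for k in range(STEPS)) * dt


def demodulate(x_t):
    """Bank of correlators: one recovered component per basis function."""
    return [correlate(x_t, phi) for phi in BASIS]


def x0_t(t: float) -> float:
    """Waveform for the data symbol [1, -1]."""
    return phi1(t) - phi2(t)


def x1_t(t: float) -> float:
    """Waveform for the data symbol [-1, 1]."""
    return phi2(t) - phi1(t)


recovered0 = demodulate(x0_t)
recovered1 = demodulate(x1_t)
energy0 = correlate(x0_t, x0_t)  # waveform energy, equal to the squared symbol length

if __name__ == "__main__":
    print(recovered0, recovered1, energy0)
```

The recovered components are close to [1, -1] and [-1, 1], and the waveform energy is close to 2, the squared length of either symbol, as Parseval's Identity predicts.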
Practical realization of the multipliers and integrators may be difficult. Any physically implementable set of basis functions can only exist over a finite interval in time, call it $T$, the symbol period (see footnote 7). Then the computation of $x_n$ alternately becomes

$$x_n = \int_0^T x(t)\, \varphi_n(t)\, dt. \tag{1.21}$$

The computation in (1.21) is more easily implemented by noting that it is equal to

$$x(t) * \varphi_n(T - t) \big|_{t=T}, \tag{1.22}$$

where $*$ indicates convolution. The component of the modulated waveform $x(t)$ along the $n$-th basis function is equivalently the convolution (filter) of the waveform $x(t)$ with a filter $\varphi_n(T - t)$ at output sample time $T$. Such matched-filter demodulation is "matched" to the corresponding modulator basis function. Matched-filter demodulation is illustrated in Figure 1.12.

[Figure 1.11: The correlative demodulator.]

[Figure 1.12: The matched-filter demodulator.]

Figure 1.12 illustrates a conversion between the data symbol and the corresponding modulated waveform such that the modulated waveform can be represented by a finite (or countably infinite as $N \to \infty$) set of components along an orthonormal set of basis functions. The coming sections use this concept to analyze the performance of a particular modulation scheme on the AWGN channel.

1.1.5 MIMO Channel Basics

Multiple-Input-Multiple-Output (MIMO) channels also are vector channels, as illustrated in Figure 1.13. The simplest MIMO cases do not use the adder in the modulator. Rather, waveforms are sent on several parallel channels. MIMO basis functions need only be normalized, and not necessarily orthogonal on the different parallel channels, because the infrastructure itself ensures the orthogonality (as indicated by the parallel dashed lines through the MIMO channel). This text denotes the dimensionality $L_x$ instead of $N$, even though $N$ and $L_x$ are interchangeable in this chapter. It will later be possible that there are $N$ orthonormal basis functions used on each of the $L_x$ channels, leading to an overall dimensionality of $N_{total} = L_x \cdot N$. Such larger dimensionality, and the ways of dealing with it, are deferred to Chapter 4 and beyond, so for now $L_x$ and $N$ are interchangeable.

[Figure 1.13: The MIMO channel.]

For the MIMO channel the channel input becomes

$$\mathbf{x}(t) = \sum_{n=1}^{N} \varphi_n(t)\, \mathbf{x}_n, \tag{1.23}$$

where the $\varphi_n(t)$ are a common set of basis functions used on all the parallel channels (see footnote 8).
In effect the elements of the vector $\mathbf{x}_n$ in Equation (1.23) above need not be nonzero in only one component, although this chapter basically views $L_x = N$, with each of the parallel channels being equivalent to a dimension (which in effect zeros all components of $\mathbf{x}_n$ except the $n$-th).

⁷ The restriction to a finite time interval is later removed with the introduction of Nyquist pulse shapes in Chapter 3, and the term symbol period will be correspondingly reinterpreted.
⁸ If different parallel channels had different functions, the total set would be the union of all sets as long as orthonormality is retained in creating the set of larger functions for modulation across all channels.

An example could be a system that has $L_x$ highly directional transmit antennas that each point at another set of $L_x$ highly directional receive antennas. In effect, each transmit antenna has an input component $x_{i,n}$ of a transmit vector $\mathbf{x}_n$, carried on the $n$-th normalized basis function $\varphi_n(t)$, that passes only to the corresponding $i$-th output antenna. The receivers as a set have corresponding components $y_n$, which can be aggregated into a vector $\mathbf{y}$. Similarly, $L_x$ parallel wires could be used between common end points to increase speed. For instance, the IEEE 802.3 1 Gbps Ethernet standard uses PAM on each of 4 parallel twisted pairs that each carry 250 Mbps of individual throughput (the actual speed is 312.5 Mbps because an extra 20% is used for overhead). These sets of $L_x = 4$ wires are often called cat-5 cables, connecting with the familiar RJ45 connectors for Ethernet (if one looks closely, 8 wires or 4 twisted pairs are in those connectors). Yet another example occurs in the above-mentioned IEEE 802.3 fiber standards for 40 and 100 Gbps, where 4 wavelengths on the same fiber (with no interference between them) each carry 1/4 of the overall data rate. Sometimes there is leakage between the channels, known as crosstalk, which is similar to the intersymbol interference addressed in Chapter 3, but crosstalk is better described as intra-symbol interference. This topic is addressed in Chapters 4 and 5, as well as 12, 13, and 14. Important here is that the MIMO channel also fits into the vector-channel analysis that is common to all forms of transmission in this book. The inner product of vector functions simply generalizes to (a superscript of $*$ denotes transpose here)
$$ \langle \mathbf{f}(t), \mathbf{g}(t) \rangle = \int_0^T \mathbf{f}^*(t)\,\mathbf{g}(t)\,dt, \tag{1.24} $$
basically a sum of integrals instead of the single integral used previously. Orthonormality still applies. Inner products of the components on the (now) vector basis functions again equal the sum-of-integral inner products.
This entire section could be reread with the basis vector functions replacing the scalar basis functions, and the modulated signal being a vector of transmitted time-domain waveforms $\mathbf{x}(t)$ that results in a vector of channel output waveforms $\mathbf{y}(t)$.
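The vector-function inner product described above can be checked numerically. The following is a minimal sketch, not taken from the text: the two vector basis functions `phi1` and `phi2` on two parallel channels, the component values 0.7 and -1.3, and the midpoint-rule integration are all illustrative assumptions.

```python
import math

def inner_product(f, g, T=1.0, steps=1000):
    """Midpoint-rule approximation of the integral over [0, T] of f(t)^* g(t).

    f and g map a time t to a list with one real entry per parallel channel,
    so f(t)^* g(t) is the sum of the per-channel products (a sum of integrals).
    """
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += sum(a * b for a, b in zip(f(t), g(t))) * dt
    return total

T = 1.0
W = 2.0 * math.pi / T  # one carrier cycle per symbol period (assumed)

def phi1(t):
    # Hypothetical vector basis function on two parallel channels.
    return [math.cos(W * t) / math.sqrt(T), math.sin(W * t) / math.sqrt(T)]

def phi2(t):
    # Second hypothetical vector basis function, orthogonal to phi1.
    return [math.sin(W * t) / math.sqrt(T), -math.cos(W * t) / math.sqrt(T)]

def x_of_t(t):
    # Modulated vector waveform with components 0.7 and -1.3 (assumed values).
    return [0.7 * a - 1.3 * b for a, b in zip(phi1(t), phi2(t))]

# Gram matrix of the basis: should be close to the 2 x 2 identity.
gram = [[inner_product(p, q, T) for q in (phi1, phi2)] for p in (phi1, phi2)]
# Demodulated components: should be close to [0.7, -1.3].
x_hat = [inner_product(x_of_t, p, T) for p in (phi1, phi2)]
print(gram)
print(x_hat)
```

Each recovered component is the sum over channels of a per-channel correlation, which is the only change relative to the scalar case.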

1.2 Discrete Data Detection

Figure 1.14: Vector channel model.

In practice, the channel output waveform $y(t)$ is not equal to the modulated signal $x(t)$. In many cases, the essential information of the channel output $y(t)$ is captured by a finite set of vector components, i.e. a vector $\mathbf{y}$ generated by the demodulation described in Section 1.1. Specific important examples appear later in this chapter, but presently the analysis shall presume the existence of the vector $\mathbf{y}$ and proceed to study the detector for the channel. The detector decides which of the discrete channel input vectors $\mathbf{x}_i$, $i = 0, \ldots, M-1$, was transmitted based on the observation of the channel output vector $\mathbf{y}$. MIMO vector channels also fit precisely into the framework of this section, with $L_x$ in place of $N$.

1.2.1 The Vector Channel Model

The vector channel model appears in Figure 1.14. This model suppresses all continuous-time waveforms, and the channel produces a discrete vector output given a discrete vector input. The detector chooses a message $m_i$ from among the set of $M$ possible messages $\{m_i\}$, $i = 0, \ldots, M-1$, transmitted over the vector channel. The encoder formats the messages for transmission over the vector channel by translating the message $m_i$ into $\mathbf{x}_i$, an $N$-dimensional real data symbol chosen from a signal constellation. The encoders of this text are one-to-one mappings between the message set and the signal-constellation vectors. The channel-input vector $\mathbf{x}$ corresponds to a channel-output vector $\mathbf{y}$, an $N$-dimensional real vector. (Thus, the transformation $y(t) \to \mathbf{y}$ is here assumed to occur within the channel.) The conditional probability of the output vector $\mathbf{y}$ given the input vector $\mathbf{x}$, $p_{\mathbf{y}|\mathbf{x}}$, completely describes the discrete version of the channel. The decision device then translates the output vector $\mathbf{y}$ into an estimate $\hat{\mathbf{x}}$ of the transmitted symbol.
A decoder (which is part of the decision device) reverses the process of the encoder and converts the detector output $\hat{\mathbf{x}}$ into the message decision $\hat{m}$. The particular message vector corresponding to $m_i$ is $\mathbf{x}_i$, and its $n$-th component is $x_{in}$. The $n$-th component of $\mathbf{y}$ is denoted $y_n$, $n = 1, \ldots, N$. In the vector channel, $\mathbf{x}$ is a random vector, with discrete probability mass function $p_{\mathbf{x}}(i)$, $i = 0, \ldots, M-1$. The output random vector $\mathbf{y}$ may have a continuous probability density or a discrete probability mass function $p_{\mathbf{y}}(\mathbf{v})$, where $\mathbf{v}$ is a dummy variable spanning all the possible $N$-dimensional outputs for $\mathbf{y}$. This density is a function of the input and channel transition probability density functions:
$$ p_{\mathbf{y}}(\mathbf{v}) = \sum_{i=0}^{M-1} p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i). \tag{1.25} $$
The average energy of the channel input symbols is
$$ \mathcal{E}_{\mathbf{x}} = \sum_{i=0}^{M-1} \|\mathbf{x}_i\|^2\, p_{\mathbf{x}}(i). \tag{1.26} $$
The corresponding average energy for the channel-output vector is
$$ \mathcal{E}_{\mathbf{y}} = \sum_{\mathbf{v}} \|\mathbf{v}\|^2\, p_{\mathbf{y}}(\mathbf{v}). \tag{1.27} $$

An integral replaces⁹ the sum in (1.27) for the case of a continuous density function $p_{\mathbf{y}}(\mathbf{v})$. As an example, consider the simple additive noise channel $\mathbf{y} = \mathbf{x} + \mathbf{n}$. In this case $p_{\mathbf{y}|\mathbf{x}} = p_{\mathbf{n}}(\mathbf{y} - \mathbf{x})$, where $p_{\mathbf{n}}(\cdot)$ is the noise density, when $\mathbf{n}$ is independent of the input $\mathbf{x}$.

1.2.2 Optimum Data Detection

For the channel of Figure 1.14, the probability of error is defined as the probability that the decoded message $\hat{m}$ is not equal to the message that was transmitted:

Definition 1.2.1 (Probability of Error) The Probability of Error is defined as
$$ P_e = P\{\hat{m} \neq m\}. \tag{1.28} $$
The corresponding probability of being correct is therefore
$$ P_c = 1 - P_e = 1 - P\{\hat{m} \neq m\} = P\{\hat{m} = m\}. \tag{1.29} $$
The optimum data detector chooses $\hat{m}$ to minimize $P_e$, or equivalently, to maximize $P_c$. The probability of being correct is a function of the particular transmitted message, $m_i$.

The MAP Detector

The probability of the decision $\hat{m} = m_i$ being correct, given the channel output vector $\mathbf{y} = \mathbf{v}$, is
$$ P_c(\hat{m} = m_i, \mathbf{y} = \mathbf{v}) = P_{m|\mathbf{y}}(m_i|\mathbf{v})\, p_{\mathbf{y}}(\mathbf{v}) = P_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v})\, p_{\mathbf{y}}(\mathbf{v}). \tag{1.30} $$
Thus the optimum decision device observes the particular received output $\mathbf{y} = \mathbf{v}$ and, as a function of that output, chooses $\hat{m} = m_i$, $i = 0, \ldots, M-1$, to maximize the probability of a correct decision in (1.30). Because $p_{\mathbf{y}}(\mathbf{v})$ does not depend on $i$, this amounts to maximizing $P_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v})$, which is referred to as the à posteriori probability for the vector channel. Thus, the optimum detector for the vector channel in Figure 1.14 is called the Maximum à Posteriori (MAP) detector:

Definition 1.2.2 (MAP Detector) The Maximum à Posteriori Detector is defined as the detector that chooses the index $i$ to maximize the à posteriori probability $p_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v})$ given a received vector $\mathbf{y} = \mathbf{v}$.

The MAP detector thus simply chooses the index $i$ with the highest conditional probability $p_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v})$. For every possible received vector $\mathbf{y}$ the designer of the detector can calculate the corresponding best index $i$, which depends on the input distribution $p_{\mathbf{x}}(i)$.
The à posteriori probabilities can be rewritten in terms of the à priori probabilities $p_{\mathbf{x}}$ and the channel transition probabilities $p_{\mathbf{y}|\mathbf{x}}$ by recalling the identity¹⁰
$$ p_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v})\, p_{\mathbf{y}}(\mathbf{v}) = p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i). \tag{1.31} $$
Thus,
$$ P_{\mathbf{x}|\mathbf{y}}(i|\mathbf{v}) = \frac{p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i)}{p_{\mathbf{y}}(\mathbf{v})} = \frac{p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i)}{\sum_{j=0}^{M-1} p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|j)\, p_{\mathbf{x}}(j)}, \tag{1.32} $$
for $p_{\mathbf{y}}(\mathbf{v}) \neq 0$. If $p_{\mathbf{y}}(\mathbf{v}) = 0$, then that particular output does not contribute to $P_e$ and therefore is not of further concern. When maximizing (1.32) over $i$, the denominator $p_{\mathbf{y}}(\mathbf{v})$ is a constant that is ignored. Thus, Rule 1.2.1 below summarizes the MAP detector rule in terms of the known probability densities of the channel ($p_{\mathbf{y}|\mathbf{x}}$) and of the input vector ($p_{\mathbf{x}}$):

Rule 1.2.1 (MAP Detection Rule)
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i) \geq p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|j)\, p_{\mathbf{x}}(j) \quad \forall\, j \neq i \tag{1.33} $$
If equality holds in (1.33), then the decision can be assigned to either message $m_i$ or $m_j$ without changing the minimized probability of error.

⁹ The replacement of a continuous probability density function by a discrete probability mass function is, in strictest mathematical terms, not advisable; however, we do so here, as this particular substitution prevents a preponderance of additional notation, and it has long been conventional in the data transmission literature. The reader is thus forewarned to keep the continuous or discrete nature of the probability density in mind in the analysis of any particular vector channel.
¹⁰ The more general form of this identity is called Bayes' Theorem, [].
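As a concrete illustration of the MAP rule, the sketch below applies it to a small discrete channel. The transition probabilities and priors are invented for illustration (they do not come from the text), and the helper names `posteriors`, `map_detect`, and `error_probability` are hypothetical.

```python
# Hypothetical channel: M = 2 inputs (rows), 3 possible outputs (columns).
P_Y_GIVEN_X = [
    [0.7, 0.2, 0.1],  # p(y = v | x = 0)
    [0.1, 0.3, 0.6],  # p(y = v | x = 1)
]
P_X = [0.7, 0.3]      # assumed nonuniform a priori probabilities

def posteriors(v, p_y_given_x, p_x):
    """A posteriori probabilities of each input given output v (Bayes)."""
    joint = [p_y_given_x[i][v] * p_x[i] for i in range(len(p_x))]
    p_y = sum(joint)
    if p_y == 0.0:
        return None  # this output never occurs and cannot affect P_e
    return [j / p_y for j in joint]

def map_detect(v, p_y_given_x, p_x):
    """Index i maximizing p(v | i) p_x(i); the denominator p_y(v) is ignored."""
    return max(range(len(p_x)), key=lambda i: p_y_given_x[i][v] * p_x[i])

def error_probability(detector, p_y_given_x, p_x):
    """Exact P_e = 1 - P_c for a deterministic detector on this channel."""
    p_correct = 0.0
    for v in range(len(p_y_given_x[0])):
        i_hat = detector(v)
        p_correct += p_y_given_x[i_hat][v] * p_x[i_hat]
    return 1.0 - p_correct

uniform = [1.0 / len(P_X)] * len(P_X)
map_decisions = [map_detect(v, P_Y_GIVEN_X, P_X) for v in range(3)]
ml_decisions = [map_detect(v, P_Y_GIVEN_X, uniform) for v in range(3)]
pe_map = error_probability(lambda v: map_detect(v, P_Y_GIVEN_X, P_X),
                           P_Y_GIVEN_X, P_X)
pe_ml = error_probability(lambda v: map_detect(v, P_Y_GIVEN_X, uniform),
                          P_Y_GIVEN_X, P_X)
print(map_decisions, ml_decisions, pe_map, pe_ml)
```

With these assumed numbers the detector that ignores the priors decides differently for the middle output and incurs a larger error probability than the MAP detector.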

The Maximum Likelihood (ML) Detector

If all transmitted messages are of equal probability, that is if
$$ p_{\mathbf{x}}(i) = \frac{1}{M} \quad \forall\, i = 0, \ldots, M-1, \tag{1.34} $$
then the MAP Detection Rule becomes the Maximum Likelihood Detection Rule:

Rule 1.2.2 (ML Detection Rule)
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i) \geq p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|j) \quad \forall\, j \neq i. \tag{1.35} $$
If equality holds in (1.35), then the decision can be assigned to either message $m_i$ or $m_j$ without changing the probability of error.

As with the MAP detector, the ML detector also chooses an index $i$ for each possible received vector $\mathbf{y} = \mathbf{v}$, but this index now only depends on the channel transition probabilities and is independent of the input distribution (by assumption). The ML detector essentially cancels the $1/M$ factor on both sides of (1.33) to get (1.35). This type of detector only minimizes $P_e$ when the input data symbols have equal probability of occurrence. As this requirement is often met in practice, ML detection is often used. Even when the input distribution is not uniform, ML detection is still often employed as a detection rule, because the input distribution may be unknown and thus assumed to be uniform. The Minimax Theorem sometimes justifies this uniform assumption:

Theorem 1.2.1 (Minimax Theorem) The ML detector minimizes the maximum possible average probability of error when the input distribution is unknown if the conditional probability of error $P_{e,ML/m=m_i}$ is independent of $i$.

Proof: First, if $P_{e,ML/i}$ is independent of $i$, then
$$ P_{e,ML} = \sum_{i=0}^{M-1} p_{\mathbf{x}}(i)\, P_{e,ML/i} = P_{e,ML/i}. $$
And so,
$$ \max_{\{p_{\mathbf{x}}\}} P_{e,ML} = \max_{\{p_{\mathbf{x}}\}} \sum_{i=0}^{M-1} p_{\mathbf{x}}(i)\, P_{e,ML/i} = P_{e,ML} \sum_{i=0}^{M-1} p_{\mathbf{x}}(i) = P_{e,ML}. $$
Now, let $R$ be any receiver other than the ML receiver. Then,
$$ \max_{\{p_{\mathbf{x}}\}} P_{e,R} = \max_{\{p_{\mathbf{x}}\}} \sum_{i=0}^{M-1} p_{\mathbf{x}}(i)\, P_{e,R/i} $$
$$ \geq \sum_{i=0}^{M-1} \frac{1}{M}\, P_{e,R/i} \quad \text{(since } \max_{\{p_{\mathbf{x}}\}} P_{e,R} \geq P_{e,R} \text{ for given } \{p_{\mathbf{x}}\}\text{)} $$
$$ \geq \sum_{i=0}^{M-1} \frac{1}{M}\, P_{e,ML/i} \quad \text{(since the ML minimizes } P_e \text{ when } p_{\mathbf{x}}(i) = \tfrac{1}{M} \text{ for } i = 0, \ldots, M-1\text{)} $$
$$ = P_{e,ML}. $$

So,
$$ \max_{\{p_{\mathbf{x}}\}} P_{e,R} \geq P_{e,ML} = \max_{\{p_{\mathbf{x}}\}} P_{e,ML}. $$
The ML receiver minimizes the maximum $P_e$ over all possible receivers. QED.

The condition of symmetry imposed by the above Theorem is not always satisfied in practical situations; but applications in which the inputs are nonuniform in distribution and the ML conditional error probabilities are also not symmetric are rare. Thus, ML receivers have come to be of nearly ubiquitous use in place of MAP receivers.

1.2.3 Decision Regions

In the case of either the MAP Rule in (1.33) or the ML Rule in (1.35), each and every possible value for the channel output $\mathbf{y}$ maps into one of the $M$ possible transmitted messages. Thus, the vector space for $\mathbf{y}$ is partitioned into $M$ regions corresponding to the $M$ possible decisions. Simple communication systems have well-defined boundaries (to be shown later), so the decision regions often coincide with intuition. Nevertheless, in some well-designed communications systems, the decoding function and the regions can be more difficult to visualize.

Definition 1.2.3 (Decision Region) The decision region using a MAP detector for each message $m_i$, $i = 0, \ldots, M-1$, is defined as
$$ \mathcal{D}_i = \{\mathbf{v} \mid p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i)\, p_{\mathbf{x}}(i) \geq p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|j)\, p_{\mathbf{x}}(j) \ \ \forall\, j \neq i\}. \tag{1.36} $$
With uniformly distributed input messages, the decision regions reduce to
$$ \mathcal{D}_i = \{\mathbf{v} \mid p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i) \geq p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|j) \ \ \forall\, j \neq i\}. \tag{1.37} $$

Figure 1.15: Decision regions.

In Figure 1.15, each of the four different two-dimensional transmitted vectors $\mathbf{x}_i$ (corresponding to the messages $m_i$) has a surrounding decision region in which any received value for $\mathbf{y} = \mathbf{v}$ is mapped to the message $m_i$. In general, the regions need not be connected, and although such situations are rare in practice, they can occur (see Problem.). Section 1.3 illustrates several examples of decision regions for the AWGN channel.
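The remark that decision regions need not be connected can be illustrated with a small sketch. The channel below is a hypothetical one chosen only for this purpose (it is not the example from the chapter exercises): the message selects the standard deviation of a zero-mean scalar Gaussian output, and the messages are assumed equally likely so that the ML rule applies.

```python
import math

SIGMAS = [1.0, 2.0]  # assumed: message 0 -> std 1.0, message 1 -> std 2.0

def gaussian_pdf(v, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-v * v / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def ml_decide(v):
    """ML rule: pick the message whose conditional density is largest at v."""
    return max(range(len(SIGMAS)), key=lambda i: gaussian_pdf(v, SIGMAS[i]))

# Equating the two densities gives v^2 = (8/3) ln 2 for these sigmas, so
# message 0 owns the interval |v| < threshold and message 1 owns the two
# disjoint rays |v| > threshold.
threshold = math.sqrt((8.0 / 3.0) * math.log(2.0))

for v in (-3.0, -1.4, 0.0, 1.3, 1.4, 3.0):
    print(v, ml_decide(v))
print(threshold)
```

Here the decision region for message 1 consists of two separate pieces of the real line, one on each side of the region for message 0.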

1.2.4 Irrelevant Components of the Channel Output

The discrete channel-output vector $\mathbf{y}$ may contain information that does not help determine which of the $M$ messages has been transmitted. These irrelevant components may be discarded without loss of performance, i.e. the input detected and the associated probability of error remain unchanged. Let us presume the $L$-dimensional channel output $\mathbf{y}$ can be separated into two sets of dimensions, those which do carry useful information, $\mathbf{y}_1$, and those which do not carry useful information, $\mathbf{y}_2$. That is,
$$ \mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}. \tag{1.38} $$
Theorem 1.2.2 summarizes the condition on $\mathbf{y}_2$ that guarantees irrelevance []:

Theorem 1.2.2 (Theorem on Irrelevance) If
$$ p_{\mathbf{x}|(\mathbf{y}_1,\mathbf{y}_2)} = p_{\mathbf{x}|\mathbf{y}_1} \tag{1.39} $$
or equivalently for the ML receiver,
$$ p_{\mathbf{y}_2|(\mathbf{y}_1,\mathbf{x})} = p_{\mathbf{y}_2|\mathbf{y}_1} \tag{1.40} $$
then $\mathbf{y}_2$ is not needed in the optimum receiver, that is, $\mathbf{y}_2$ is irrelevant.

Proof: For a MAP receiver, the value of $\mathbf{y}_2$ clearly does not affect the maximization of $p_{\mathbf{x}|(\mathbf{y}_1,\mathbf{y}_2)}$ if $p_{\mathbf{x}|(\mathbf{y}_1,\mathbf{y}_2)} = p_{\mathbf{x}|\mathbf{y}_1}$, and thus $\mathbf{y}_2$ is irrelevant to the optimum receiver's decision. Equation (1.39) can be written as
$$ \frac{p_{(\mathbf{x},\mathbf{y}_1,\mathbf{y}_2)}}{p_{(\mathbf{y}_1,\mathbf{y}_2)}} = \frac{p_{(\mathbf{x},\mathbf{y}_1)}}{p_{\mathbf{y}_1}} \tag{1.41} $$
or equivalently via cross multiplication
$$ \frac{p_{(\mathbf{x},\mathbf{y}_1,\mathbf{y}_2)}}{p_{(\mathbf{x},\mathbf{y}_1)}} = \frac{p_{(\mathbf{y}_1,\mathbf{y}_2)}}{p_{\mathbf{y}_1}}, \tag{1.42} $$
which is the same as (1.40). QED.

The reverse of the theorem of irrelevance is not necessarily true, as can be shown by counterexamples. Two examples (due to Wozencraft and Jacobs, []) reinforce the concept of irrelevance. In these examples, the two noise signals $\mathbf{n}_1$ and $\mathbf{n}_2$ are independent and a uniformly distributed input is assumed:

EXAMPLE 1.2.1 (Extra Irrelevant Noise) Suppose $\mathbf{y}_1$ is the noisy channel output shown in Figure 1.16. In the first example, $p_{\mathbf{y}_2|\mathbf{y}_1,\mathbf{x}} = p_{\mathbf{n}_2} = p_{\mathbf{y}_2|\mathbf{y}_1}$, thus satisfying the condition for $\mathbf{y}_2$ to be ignored, as might be obvious upon casual inspection. The extra independent noise signal $\mathbf{n}_2$ tells the receiver nothing, given $\mathbf{y}_1$, about the transmitted message $\mathbf{x}$. In the second example, the irrelevance of $\mathbf{y}_2$ given $\mathbf{y}_1$ is not quite as obvious, as the signal is present in both the received channel output components.
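Nevertheless, $p_{\mathbf{y}_2|\mathbf{y}_1,\mathbf{x}} = p_{\mathbf{n}_2}(\mathbf{v}_2 - \mathbf{v}_1) = p_{\mathbf{y}_2|\mathbf{y}_1}$.

Of course, in some cases the output component $\mathbf{y}_2$ should not be discarded. A classic example is the following case of noise cancelation.

EXAMPLE 1.2.2 (Noise Cancelation) Suppose $\mathbf{y}_1$ is the noisy channel output shown in Figure 1.17. While $\mathbf{y}_2$ may appear to contain only useless noise, it is in fact possible to reduce the effect of $\mathbf{n}_1$ in $\mathbf{y}_1$ by constructing an estimate of $\mathbf{n}_1$ using $\mathbf{y}_2$. Correspondingly, $p_{\mathbf{y}_2|\mathbf{y}_1,\mathbf{x}} = p_{\mathbf{n}_2}(\mathbf{v}_2 - (\mathbf{v}_1 - \mathbf{x}_i)) \neq p_{\mathbf{y}_2|\mathbf{y}_1}$.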
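The benefit of keeping the second output in the noise-cancelation example can be seen in a short simulation. This is a sketch under stated assumptions rather than the text's own construction: the first output is taken as the input plus a first noise, the second output as the sum of the two noises, both noises are assumed Gaussian with the standard deviations chosen below, and the input is assumed binary antipodal and equally likely. Under those assumptions, the ML decision on both outputs reduces to the sign of the first output minus a scaled copy of the second.

```python
import random

def simulate(trials=20000, sigma1=1.0, sigma2=0.5, seed=1):
    """Return (P_e using y1 only, P_e using y1 and y2) by Monte Carlo."""
    rng = random.Random(seed)
    # Weight of the linear MMSE estimate of n1 from y2 = n1 + n2 (assumed model).
    w = sigma1 ** 2 / (sigma1 ** 2 + sigma2 ** 2)
    err_y1_only = 0
    err_both = 0
    for _ in range(trials):
        x = rng.choice((-1.0, 1.0))
        n1 = rng.gauss(0.0, sigma1)
        n2 = rng.gauss(0.0, sigma2)
        y1 = x + n1
        y2 = n1 + n2
        decision_y1 = 1.0 if y1 >= 0.0 else -1.0            # discards y2
        decision_both = 1.0 if y1 - w * y2 >= 0.0 else -1.0  # cancels part of n1
        err_y1_only += decision_y1 != x
        err_both += decision_both != x
    return err_y1_only / trials, err_both / trials

pe_y1_only, pe_both = simulate()
print(pe_y1_only, pe_both)
```

With the assumed noise levels, the detector that discards the second output makes errors roughly ten times as often as the one that uses it, which is why the second output is not irrelevant here.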

Figure 1.16: Extra irrelevant noise.
Figure 1.17: Noise can be partially canceled.

Figure 1.18: Reversibility theorem illustration.

Reversibility

An important result in digital communication is the Reversibility Theorem, which will be used several times over the course of this book. This theorem is, in effect, a special case of the Theorem on Irrelevance:

Theorem 1.2.3 (Reversibility Theorem) The application of an invertible transformation on the channel output vector $\mathbf{y}$ does not affect the performance of the MAP detector.

Proof: Using the Theorem on Irrelevance, if the channel output is $\mathbf{y}_2$ and the result of the invertible transformation is $\mathbf{y}_1 = G(\mathbf{y}_2)$, with inverse $\mathbf{y}_2 = G^{-1}(\mathbf{y}_1)$, then $[\mathbf{y}_1\ \mathbf{y}_2] = [\mathbf{y}_1\ G^{-1}(\mathbf{y}_1)]$. Then, $p_{\mathbf{x}/(\mathbf{y}_1,\mathbf{y}_2)} = p_{\mathbf{x}/\mathbf{y}_1}$, which is the definition of irrelevance. Thus, either of $\mathbf{y}_1$ or $\mathbf{y}_2$ is sufficient to detect $\mathbf{x}$ optimally. QED.

Equivalently, Figure 1.18 illustrates the reversibility theorem by constructing a MAP receiver for the output of the invertible transformation $\mathbf{y}_1$ as the cascade of the inverse filter $G^{-1}$ and the MAP receiver for the input of the invertible transformation $\mathbf{y}_2$.
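The reversibility theorem can be checked numerically on a small case. The sketch below makes several assumptions that are not in the text: a two-dimensional channel output `y` equal to one of four equally likely symbols plus white Gaussian noise, an arbitrary invertible matrix `G`, and a transformed output `z = G y`. It compares a detector designed directly for `z` against the cascade of the inverse transformation and the detector for `y`.

```python
import random

G = [[2.0, 1.0], [0.5, 1.5]]  # hypothetical invertible transformation
DET = G[0][0] * G[1][1] - G[0][1] * G[1][0]
G_INV = [[G[1][1] / DET, -G[0][1] / DET],
         [-G[1][0] / DET, G[0][0] / DET]]
SYMBOLS = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]

def matvec(a, v):
    """2 x 2 matrix times 2-vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

def detect_y(y):
    """ML detector for y = x + n with white Gaussian n: nearest symbol."""
    return min(range(len(SYMBOLS)),
               key=lambda i: (y[0] - SYMBOLS[i][0]) ** 2 + (y[1] - SYMBOLS[i][1]) ** 2)

def detect_z_cascade(z):
    """Undo G, then reuse the detector designed for y."""
    return detect_y(matvec(G_INV, z))

def detect_z_direct(z):
    """ML detector designed for z = G y, whose noise has covariance
    proportional to G G^T: minimize the Mahalanobis distance to G x_i."""
    c00 = G[0][0] ** 2 + G[0][1] ** 2
    c01 = G[0][0] * G[1][0] + G[0][1] * G[1][1]
    c11 = G[1][0] ** 2 + G[1][1] ** 2
    det_c = c00 * c11 - c01 * c01

    def metric(i):
        gx = matvec(G, SYMBOLS[i])
        e0, e1 = z[0] - gx[0], z[1] - gx[1]
        return (c11 * e0 * e0 - 2.0 * c01 * e0 * e1 + c00 * e1 * e1) / det_c

    return min(range(len(SYMBOLS)), key=metric)

rng = random.Random(3)
mismatches = 0
for _ in range(2000):
    x = rng.choice(SYMBOLS)
    y = (x[0] + rng.gauss(0.0, 0.8), x[1] + rng.gauss(0.0, 0.8))
    z = matvec(G, y)
    mismatches += detect_z_cascade(z) != detect_z_direct(z)
print(mismatches)
```

The two detectors are algebraically the same rule, so they are expected to agree on every sample; the transformation changes how the detector is written, not what it decides.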

Figure 1.19: AWGN channel.

1.3 The Additive White Gaussian Noise (AWGN) Channel

Perhaps the most important, and certainly the most analyzed, digital communication channel is the AWGN channel shown in Figure 1.19. This channel passes the sum of the modulated signal $x(t)$ and an uncorrelated Gaussian noise $n(t)$ to the output. The Gaussian noise is assumed to be uncorrelated with itself (or "white") for any non-zero time offset $\tau$, that is
$$ E[n(t)\,n(t-\tau)] = \frac{N_0}{2}\,\delta(\tau), \tag{1.43} $$
and zero mean, $E[n(t)] = 0$. With these definitions, the Gaussian noise is also strict sense stationary (See Annex C of Chapter for a discussion of stationarity types). The analysis of the AWGN channel is a foundation for the analysis of more complicated channel models in later chapters.

The assumption of white Gaussian noise is valid in the very common situation where the noise is predominantly determined by front-end analog receiver thermal noise. Such noise has a power spectral density given by the Boltzmann equation:
$$ N(f) = \frac{hf}{e^{hf/kT} - 1} \approx kT \quad \text{for small } f < 10^{12}, \tag{1.44} $$
where Boltzmann's constant is $k = 1.38 \times 10^{-23}$ Joules/degree Kelvin, Planck's constant is $h = 6.63 \times 10^{-34}$ Watt-s$^2$, and $T$ is the temperature on the Kelvin (absolute) scale. This power spectral density is approximately -174 dBm/Hz ($10^{-17.4}$ mW/Hz) at room temperature (larger in practice). The Gaussian assumption is a consequence of the fact that many small noise sources contribute to this noise, thus invoking the Central Limit Theorem.

For the MIMO case, white noise is simply the noise above added to each output dimension (equal variance on all paths), and the noise is independent on each dimension from all other dimensions. Colored noise is considered in Section 1.7.

1.3.1 Conversion from the Continuous AWGN to a Vector Channel

In the absence of additive noise in Figure 1.19, $y(t) = x(t)$, and the demodulation process in Subsection 1.1.3 would exactly recover the transmitted signal.
This section shows that for the AWGN channel, this demodulation process provides sufficient information to determine optimally the transmitted signal. The resulting components $y_l = \langle y(t), \varphi_l(t) \rangle$, $l = 1, \ldots, N$, comprise a vector channel output, $\mathbf{y} = [y_1, \ldots, y_N]$, that is equivalent for detection purposes to $y(t)$. (All proofs in this section generalize easily to the case where the scalar $x$, $y$, and $\varphi$ are generalized to vectors with the more general definition of inner product at the end of Subsection 1.1.5.) The analysis can thus convert the continuous channel $y(t) = x(t) + n(t)$ to a discrete vector channel model,
$$ \mathbf{y} = \mathbf{x} + \mathbf{n}, \tag{1.45} $$

where $\mathbf{n} = [n_1\ n_2\ \ldots\ n_N]$ and $n_l = \langle n(t), \varphi_l(t) \rangle$. The vector channel output is the sum of the vector equivalent of the modulated signal and the vector equivalent of the demodulated noise. Nevertheless, the exact noise sample function may not be reconstructed from $\mathbf{n}$,
$$ n(t) \neq \sum_{l=1}^{N} n_l\,\varphi_l(t) = \hat{n}(t), \tag{1.46} $$
or equivalently,
$$ y(t) \neq \sum_{l=1}^{N} y_l\,\varphi_l(t) = \hat{y}(t). \tag{1.47} $$
There may exist a component of $n(t)$ that is orthogonal to the space spanned by the basis functions $\{\varphi_1(t) \ldots \varphi_N(t)\}$. This unrepresented noise component is
$$ \tilde{n}(t) = n(t) - \hat{n}(t) = y(t) - \hat{y}(t). \tag{1.48} $$
A lemma quickly follows:

Lemma 1.3.1 (Uncorrelated noise samples) The noise samples in the demodulated noise vector are independent for AWGN and of equal variance $\frac{N_0}{2}$.

Proof: Write
$$ E[n_k n_l] = E\left[ \iint n(t)\,n(s)\,\varphi_k(t)\,\varphi_l(s)\,dt\,ds \right] \tag{1.49} $$
$$ = \frac{N_0}{2} \int \varphi_k(t)\,\varphi_l(t)\,dt \tag{1.50} $$
$$ = \frac{N_0}{2}\,\delta_{kl}. \quad \text{QED.} \tag{1.51} $$

The development of the MAP detector could have replaced $\mathbf{y}$ by $y(t)$ everywhere, and the development would have proceeded identically with the tacit inclusion of the time variable $t$ in the probability densities (and also assuming stationarity of $y(t)$ as a random process). The Theorem of Irrelevance would hold with $[\mathbf{y}_1\ \mathbf{y}_2]$ replaced by $[\hat{y}(t)\ \tilde{n}(s)]$, as long as the relation (1.40) holds for any pair of time instants $t$ and $s$. In a non-mathematical sense, the unrepresented noise is useless to the receiver, so there is nothing of value lost in the vector demodulator, even though some of the channel output noise is not represented. The following algebra demonstrates that $\tilde{n}(s)$ is irrelevant:

First,
$$ E[\tilde{n}(s)\,\hat{n}(t)] = E\left[ \tilde{n}(s) \sum_{l=1}^{N} n_l\,\varphi_l(t) \right] = \sum_{l=1}^{N} \varphi_l(t)\,E[\tilde{n}(s)\,n_l], \tag{1.52} $$
and,
$$ E[\tilde{n}(s)\,n_l] = E[(n(s) - \hat{n}(s))\,n_l] \tag{1.53} $$
$$ = E\left[ \int n(s)\,\varphi_l(\tau)\,n(\tau)\,d\tau \right] - E\left[ \sum_{k=1}^{N} n_k\,n_l\,\varphi_k(s) \right] \tag{1.54} $$
$$ = \frac{N_0}{2} \int \delta(s-\tau)\,\varphi_l(\tau)\,d\tau - \frac{N_0}{2}\,\varphi_l(s) \tag{1.55} $$
$$ = \frac{N_0}{2}\,[\varphi_l(s) - \varphi_l(s)] = 0. \tag{1.56} $$
Second,
$$ p_{\mathbf{x}|\hat{y}(t),\tilde{n}(s)} = \frac{p_{\mathbf{x},\hat{y}(t),\tilde{n}(s)}}{p_{\hat{y}(t),\tilde{n}(s)}} \tag{1.57} $$

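$$ = \frac{p_{\mathbf{x},\hat{y}(t)}\;p_{\tilde{n}(s)}}{p_{\hat{y}(t)}\;p_{\tilde{n}(s)}} \tag{1.58} $$
$$ = \frac{p_{\mathbf{x},\hat{y}(t)}}{p_{\hat{y}(t)}} \tag{1.59} $$
$$ = p_{\mathbf{x}|\hat{y}(t)}. \tag{1.60} $$
The factorization in the second step uses the independence of $\tilde{n}(s)$ from $\mathbf{x}$ and $\hat{y}(t)$, which holds because $\tilde{n}(s)$ is Gaussian and uncorrelated with $\hat{n}(t)$. Equation (1.60) satisfies the theorem of irrelevance, and thus the receiver need only base its decision on $\hat{y}(t)$, or equivalently, only on the received vector $\mathbf{y}$. The vector AWGN channel is equivalent to the continuous-time AWGN channel.

Rule 1.3.1 (The Vector AWGN Channel) The vector AWGN channel is given by
$$ \mathbf{y} = \mathbf{x} + \mathbf{n} \tag{1.61} $$
and is equivalent to the channel illustrated in Figure 1.19. The noise vector $\mathbf{n}$ is an $N$-dimensional Gaussian random vector with zero mean, equal-variance, uncorrelated components in each dimension. The noise distribution is
$$ p_{\mathbf{n}}(\mathbf{u}) = (\pi N_0)^{-\frac{N}{2}}\, e^{-\frac{1}{N_0}\|\mathbf{u}\|^2} = (2\pi\sigma^2)^{-\frac{N}{2}}\, e^{-\frac{1}{2\sigma^2}\|\mathbf{u}\|^2}. \tag{1.62} $$

Application of $y(t)$ to either the correlative demodulator of Figure 1.11 or to the matched-filter demodulator of Figure 1.12 generates the desired vector channel output $\mathbf{y}$ at the demodulator output. The following section specifies the decision process that produces an estimate of the input message, given the output $\mathbf{y}$, for the AWGN channel.

1.3.2 Optimum Detection with the AWGN Channel

For the vector AWGN channel in (1.61),
$$ p_{\mathbf{y}|\mathbf{x}}(\mathbf{v}|i) = p_{\mathbf{n}}(\mathbf{v} - \mathbf{x}_i), \tag{1.63} $$
where $p_{\mathbf{n}}$ is the vector noise distribution in (1.62). Thus for AWGN the MAP Decision Rule becomes
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ e^{-\frac{1}{N_0}\|\mathbf{v} - \mathbf{x}_i\|^2}\, p_{\mathbf{x}}(i) \geq e^{-\frac{1}{N_0}\|\mathbf{v} - \mathbf{x}_j\|^2}\, p_{\mathbf{x}}(j) \quad \forall\, j \neq i, \tag{1.64} $$
where the common factor of $(\pi N_0)^{-\frac{N}{2}}$ has been canceled from each side of (1.64). As noted earlier, if equality holds in (1.64), then the decision can be assigned to any of the corresponding messages without change in minimized probability of error. The log of (1.64) is the preferred form of the MAP Decision Rule for the AWGN channel:

Rule 1.3.2 (AWGN MAP Detection Rule)
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ \|\mathbf{v} - \mathbf{x}_i\|^2 - N_0 \ln\{p_{\mathbf{x}}(i)\} \leq \|\mathbf{v} - \mathbf{x}_j\|^2 - N_0 \ln\{p_{\mathbf{x}}(j)\} \quad \forall\, j \neq i \tag{1.65} $$
If the channel input messages are equally likely, the ln terms on both sides of (1.65) cancel, yielding the AWGN ML Detection Rule:

Rule 1.3.3 (AWGN ML Detection Rule)
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ \|\mathbf{v} - \mathbf{x}_i\|^2 \leq \|\mathbf{v} - \mathbf{x}_j\|^2 \quad \forall\, j \neq i. \tag{1.66} $$
The ML detector for the AWGN channel in (1.66) has the intuitively appealing physical interpretation that the decision $\hat{m} = m_i$ corresponds to choosing the data symbol $\mathbf{x}_i$ that is closest, in terms of the Euclidean distance, to the received vector channel output $\mathbf{y} = \mathbf{v}$. Without noise, the received vector is $\mathbf{y} = \mathbf{x}_i$, the transmitted symbol, but the additive Gaussian noise results in a received symbol most likely in the neighborhood of $\mathbf{x}_i$. The Gaussian shape of the noise implies the probability of a received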

point decreases as the distance from the transmitted point increases.

Figure 1.20: Binary ML detector.
Figure 1.21: Binary MAP detector.

As an example consider the decision regions for binary data transmission over the AWGN channel illustrated in Figure 1.20. The ML receiver decides $x_1$ if $y = v \geq 0$ and $x_0$ if $y = v < 0$. (One might have guessed this answer without need for theory.) With $d$ defined as the distance $\|x_1 - x_0\|$, the decision regions are offset in the MAP detector by $\frac{\sigma^2}{d} \ln\left\{\frac{p_{\mathbf{x}}(j)}{p_{\mathbf{x}}(i)}\right\}$, with the decision boundary shifting towards the data symbol of lesser probability, as illustrated in Figure 1.21. Unlike the ML detector, the MAP detector accounts for the à priori message probabilities. The decision region for the more likely symbol is extended by shifting the boundary towards the less likely symbol.

Figure 1.22 illustrates the decision region for a two-dimensional example of the QPSK signal set, which uses the same basis functions as the V.32 example (Example 1.1.4). The points in the signal constellation are all assumed to be equally likely.

General Receiver Implementation

While the decision regions in the above examples appear simple to implement, in a digital system, the implementation may be more complex. This section investigates general receiver structures and the detector implementation. The MAP detector minimizes the quantity (the quantity $\mathbf{y}$ now replaces $\mathbf{v}$, averting strict mathematical notation, because probability density functions are used less often in the subsequent analysis):
$$ \|\mathbf{y} - \mathbf{x}_i\|^2 - N_0 \ln\{p_{\mathbf{x}}(i)\} \tag{1.67} $$
over the $M$ possible messages, indexed by $i$. The quantity in (1.67) expands to
$$ \|\mathbf{y}\|^2 - 2\langle \mathbf{y}, \mathbf{x}_i \rangle + \|\mathbf{x}_i\|^2 - N_0 \ln\{p_{\mathbf{x}}(i)\}. \tag{1.68} $$
Minimization of (1.68) can ignore the $\|\mathbf{y}\|^2$ term. The MAP decision rule then becomes
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ \langle \mathbf{y}, \mathbf{x}_i \rangle + c_i \geq \langle \mathbf{y}, \mathbf{x}_j \rangle + c_j \quad \forall\, j \neq i, \tag{1.69} $$
where $c_i$ is the constant (independent of $\mathbf{y}$)
$$ c_i = \frac{N_0}{2} \ln\{p_{\mathbf{x}}(i)\} - \frac{\|\mathbf{x}_i\|^2}{2}. \tag{1.70} $$
A system design can precompute the constants $\{c_i\}$ from the transmitted symbols $\{\mathbf{x}_i\}$ and their probabilities $p_{\mathbf{x}}(i)$. The detector thus only needs to implement the $M$ inner products, $\langle \mathbf{y}, \mathbf{x}_i \rangle$, $i = 0, \ldots, M-1$. When all the data symbols have the same energy ($\mathcal{E}_{\mathbf{x}} = \|\mathbf{x}_i\|^2\ \forall\, i$) and are equally probable (i.e. MAP

= ML), then the constant $c_i$ is independent of $i$ and can be eliminated from (1.69). The ML detector thus chooses the $\mathbf{x}_i$ that maximizes the inner product (or correlation) of the received value for $\mathbf{y} = \mathbf{v}$ with $\mathbf{x}_i$ over $i$.

Figure 1.22: QPSK decision regions.
Figure 1.23: Basis detector ($y(t)$ would be just $L_x = N$ parallel signals $\mathbf{y}(t)$ in the MIMO case).

There exist two common implementations of the MAP receiver in (1.69). The first, shown in Figure 1.23, called a basis detector, computes $\mathbf{y}$ using a matched filter demodulator. This MAP receiver computes the $M$ inner products of (1.69) digitally (an $M \times N$ matrix multiply with $\mathbf{y}$), adds the constant $c_i$ of (1.70), and picks the index $i$ with maximum result. Finally, a decoder translates the index $i$ into the desired message $m_i$. Often in practice, the signal constellation is such (see Section 1.6 for examples) that the max and decode function reduces to simple truncation of each component in the received vector $\mathbf{y}$.

The second form of the demodulator eliminates the matrix multiply in Figure 1.23 by recalling the inner product equivalences between the discrete vectors $\mathbf{x}_i$, $\mathbf{y}$ and the continuous time functions $x_i(t)$ and $y(t)$. That is
$$ \langle \mathbf{y}, \mathbf{x}_i \rangle = \int_0^T y(t)\,x_i(t)\,dt = \langle y(t), x_i(t) \rangle. \tag{1.71} $$
Equivalently,
$$ \langle \mathbf{y}, \mathbf{x}_i \rangle = y(t) * x_i(T - t)\,\big|_{t=T} \tag{1.72} $$

where $*$ indicates convolution. This type of detector is called a signal detector and appears in Figure 1.24.

Figure 1.24: Signal detector. ($y(t)$ would be just $L_x = N$ parallel signals $\mathbf{y}(t)$ in the MIMO case, but there would be several filters for each parallel-channel output; signal detectors are rarely used with MIMO.)
Figure 1.25: SNR maximization by matched filter (SNR is max when $h(t) = x(T_s - t)$).

EXAMPLE 1.3.1 (Pattern recognition as a signal detector) Pattern recognition is a digital signal processing procedure that is used to detect whether a certain signal is present. An example occurs when an aircraft takes electronic pictures of the ground and the corresponding electrical signal is analyzed to determine the presence of certain objects. This is a communication channel in disguise where the two inputs are the usual terrain of the ground and the terrain of the ground including the object to be detected. A signal detector consisting of two filters that are essentially the time reverse of each of the possible input signals, with a comparison of the outputs (after adding any necessary constants), allows detection of the presence of the object or pattern. There are many other examples of pattern recognition in voice/command recognition or authentication, written character scanning, and so on.

The above example/discussion illustrates that many of the principles of digital communication theory are common to other fields of digital signal processing and science.

1.3.3 Signal-to-Noise Ratio (SNR) Maximization with a Matched Filter

SNR is a good measure for a system's performance, describing the ratio of signal power (message) to unwanted noise power. The SNR at the output of a filter is defined as the ratio of the modulated signal's energy to the mean-square value of the noise.
The SNR can be defined for both continuous- and discrete-time processes; the discrete SNR is the SNR of the samples of the received and filtered waveform. The matched filters shown in Figure 1.24 satisfy the SNR maximization property, which the following theorem summarizes:

Theorem 1.3.1 (SNR Maximization) For the system shown in Figure 1.25, the filter $h(t)$ that maximizes the signal-to-noise ratio at sample time $T_s$ is given by the matched filter $h(t) = x(T_s - t)$.

Proof: Compute the SNR at sample time $t = T_s$ as follows.
$$ \text{Signal Energy} = \left[ x(t) * h(t)\,\big|_{t=T_s} \right]^2 \tag{1.73} $$
$$ = \left[ \int x(t)\,h(T_s - t)\,dt \right]^2 = \left[ \langle x(t), h(T_s - t) \rangle \right]^2. \tag{1.74} $$
The sampled noise at the matched filter output has energy or mean-square
$$ \text{Noise Energy} = E\left[ \int n(t)\,h(T_s - t)\,dt \int n(s)\,h(T_s - s)\,ds \right] \tag{1.75} $$
$$ = \iint \frac{N_0}{2}\,\delta(t - s)\,h(T_s - t)\,h(T_s - s)\,dt\,ds \tag{1.76} $$
$$ = \frac{N_0}{2} \int h^2(T_s - t)\,dt \tag{1.77} $$
$$ = \frac{N_0}{2}\,\|h\|^2. \tag{1.79} $$
The signal-to-noise ratio, defined as the ratio of the signal power in (1.74) to the noise power in (1.79), equals
$$ \text{SNR} = \frac{2}{N_0} \cdot \frac{\left[ \langle x(t), h(T_s - t) \rangle \right]^2}{\|h\|^2}. \tag{1.80} $$
The Cauchy-Schwarz Inequality states that
$$ \left[ \langle x(t), h(T_s - t) \rangle \right]^2 \leq \|x\|^2\,\|h\|^2 \tag{1.81} $$
with equality if and only if $x(t) = k\,h(T_s - t)$, where $k$ is some arbitrary constant. Thus, by inspection, (1.80) is maximized over all choices for $h(t)$ when $h(t) = x(T_s - t)$. The filter $h(t)$ is matched to $x(t)$, and the corresponding maximum SNR (for any $k$) is
$$ \text{SNR}_{max} = \frac{2}{N_0}\,\|x\|^2. \tag{1.82} $$

An example of the use of the SNR maximization property of the matched filter occurs in time-delay estimation, which is used for instance in radar:

EXAMPLE 1.3.2 (Time-delay estimation) Radar systems emit electromagnetic pulses and measure reflection of those pulses off objects within range of the radar. The distance of the object is determined by the delay of the reflected energy, with longer delay corresponding to longer distance. By processing the received signal at the radar with a filter matched to the radar pulse shape, the signal level measured in the presence of a presumably fixed background white noise will appear largest relative to the noise. Thus, the ability to determine the exact time instant at which the maximum pulse returned is improved by the use of the matched filter, allowing more accurate estimation of the position of the object.
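The SNR maximization property can be checked numerically by comparing the matched filter against other candidate filters. This is a minimal sketch with assumed ingredients: a half-sine pulse, a noise level of 0.1 for the parameter `n0`, a midpoint-rule approximation of the integrals, and two arbitrary mismatched filters (a rectangle and a ramp). The list `g` holds samples of the time-reversed filter, so that the matched case is simply `g` equal to the pulse.

```python
import math

def snr(x, g, dt, n0):
    """SNR at the sampling instant for pulse samples x and time-reversed
    filter samples g, with white noise of two-sided density n0/2."""
    correlation = sum(a * b for a, b in zip(x, g)) * dt  # <x, g>
    g_energy = sum(b * b for b in g) * dt                # ||g||^2 = ||h||^2
    return correlation ** 2 / (0.5 * n0 * g_energy)

steps = 1000
T = 1.0
dt = T / steps
n0 = 0.1  # assumed noise level
t = [(k + 0.5) * dt for k in range(steps)]

x = [math.sin(math.pi * tk / T) for tk in t]  # hypothetical half-sine pulse
matched = x[:]                                 # time-reversed filter equals x
rectangle = [1.0] * steps                      # mismatched candidate
ramp = t[:]                                    # another mismatched candidate

snr_matched = snr(x, matched, dt, n0)
snr_rectangle = snr(x, rectangle, dt, n0)
snr_ramp = snr(x, ramp, dt, n0)
bound = 2.0 * sum(a * a for a in x) * dt / n0  # (2 / N0) ||x||^2

print(snr_matched, snr_rectangle, snr_ramp, bound)
```

Scaling the matched filter by any nonzero constant leaves its SNR unchanged, since the constant appears squared in both the numerator and the denominator.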
For MIMO, this becomes a vector filter $\mathbf{h}(t)$, and the integrals in this development become sums of integrals consistent with the more general MIMO inner-product definition in Subsection 1.1.5.
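Returning to the AWGN detection rules of this section, the MAP metric (squared distance minus the noise level times the log of the prior) and the resulting shift of the binary decision boundary can be sketched as follows. The symbol values, priors, and noise level are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def awgn_map_detect(v, symbols, priors, n0):
    """Index i minimizing ||v - x_i||^2 - N0 * ln p_x(i).

    Assumes every prior is strictly positive (ln 0 is undefined).
    """
    def metric(i):
        dist2 = sum((a - b) ** 2 for a, b in zip(v, symbols[i]))
        return dist2 - n0 * math.log(priors[i])
    return min(range(len(symbols)), key=metric)

def awgn_ml_detect(v, symbols):
    """Index of the symbol closest to v in Euclidean distance."""
    return min(range(len(symbols)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, symbols[i])))

symbols = [(-1.0,), (1.0,)]  # assumed binary antipodal symbols x_0, x_1
priors = [0.9, 0.1]          # assumed: x_0 is much more likely
n0 = 1.0                     # so sigma^2 = N0 / 2 = 0.5
d = 2.0                      # distance between the two symbols

# Boundary offset (sigma^2 / d) * ln(p_x(0) / p_x(1)), measured from the
# midpoint 0 toward the less likely symbol x_1.
boundary = (0.5 * n0 / d) * math.log(priors[0] / priors[1])

for v in (0.3, 0.6):
    print(v, awgn_ml_detect((v,), symbols), awgn_map_detect((v,), symbols, priors, n0))
print(boundary)
```

A received value of 0.3 lies on the positive side of the ML boundary at zero but on the negative side of the shifted MAP boundary, so the two detectors disagree there.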

1.4 Error Probability for the AWGN Channel

This section discusses the computation of the average probability of error of decoding the transmitted message incorrectly on an AWGN channel. From the previous section, the AWGN channel is equivalent to a vector channel with output given by
$$ \mathbf{y} = \mathbf{x} + \mathbf{n}. \tag{1.83} $$
The computation of $P_e$ often assumes that the inputs $\mathbf{x}_i$ are equally likely, or $p_{\mathbf{x}}(i) = \frac{1}{M}$. Under this assumption, the optimum detector is the ML detector, which has decision rule
$$ \hat{m} \Rightarrow m_i \ \text{ if } \ \|\mathbf{v} - \mathbf{x}_i\|^2 \leq \|\mathbf{v} - \mathbf{x}_j\|^2 \quad \forall\, j \neq i. \tag{1.84} $$
The $P_e$ associated with this rule depends on the signal constellation $\{\mathbf{x}_i\}$ and the noise variance $\frac{N_0}{2}$. Two general invariance theorems in Subsection 1.4.1 facilitate the computation of $P_e$. The exact $P_e$,
$$ P_e = \frac{1}{M} \sum_{i=0}^{M-1} P_{e/i} \tag{1.85} $$
$$ = 1 - \frac{1}{M} \sum_{i=0}^{M-1} P_{c/i}, \tag{1.86} $$
may be difficult to compute, so convenient and accurate bounding procedures in Subsections 1.4.2 through 1.4.4 can alternately approximate $P_e$.

1.4.1 Invariance to Rotation and Translation

The orientation of the signal constellation with respect to the coordinate axes and to the origin does not affect the $P_e$. This result follows because (1) the error depends only on relative distances between points in the signal constellation, and (2) AWGN is spherically symmetric in all directions. First, the probability of error for the ML receiver is invariant to any rotation of the signal constellation, as summarized in the following theorem:

Theorem 1.4.1 (Rotational Invariance) If all the data symbols in a signal constellation are rotated by an orthogonal transformation, that is $\tilde{\mathbf{x}}_i \leftarrow Q\mathbf{x}_i$ for all $i = 0, \ldots, M-1$ (where $Q$ is an $N \times N$ matrix such that $QQ^* = Q^*Q = I$), then the probability of error of the ML receiver remains unchanged on an AWGN channel.

Proof: The AWGN remains statistically equivalent after rotation by $Q^*$. In particular consider $\tilde{\mathbf{n}} = Q^*\mathbf{n}$, a rotated Gaussian random vector. ($\tilde{\mathbf{n}}$ is Gaussian since a linear combination of Gaussian random variables remains a Gaussian random variable.)
A Gaussian random vector is completely specified by its mean and covariance matrix. The mean is $E[\tilde{n}] = 0$ since $E[n_i] = 0$ for every component $n_i$ of $n$. The covariance matrix is $E[\tilde{n}\tilde{n}^*] = Q^* E[nn^*] Q = \frac{N_0}{2} I$. Thus, $\tilde{n}$ is statistically equivalent to $n$. The channel output for the rotated signal constellation is now $\tilde{y} = \tilde{x} + n$, as illustrated in Figure 1.26. The corresponding decision rule is based on the distance from the received signal sample $\tilde{y} = \tilde{v}$ to the rotated constellation points $\tilde{x}_i$:

\begin{align}
\|\tilde{v} - \tilde{x}_i\|^2 &= (\tilde{v} - \tilde{x}_i)^* (\tilde{v} - \tilde{x}_i) \tag{1.87} \\
  &= (v - x_i)^* Q^* Q (v - x_i) \tag{1.88} \\
  &= \|v - x_i\|^2 , \tag{1.89}
\end{align}

where $v = Q^* \tilde{v}$ is a sample of $y = Q^* \tilde{y} = x + Q^* n$. Since $\tilde{n} = Q^* n$ has the same distribution as $n$, and the distances measured in (1.87) through (1.89) are the same as in the original unrotated signal constellation, the ML detector for the rotated constellation is the same as the ML detector for the original (unrotated) constellation in terms of all distances and noise variances. Thus, the probability of error must be identical. QED.
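The distance-preservation step of this proof can be checked numerically. The following is a minimal sketch, not part of the original development: the QPSK coordinates $(\pm 1, \pm 1)$, the received vector, and the helper names `rotate`, `sq_dist`, and `ml_detect` are all illustrative assumptions, as is the sign convention chosen for the rotation matrix.

```python
import math

def rotate(point, theta):
    """Apply the 2x2 orthogonal matrix Q = [[cos, sin], [-sin, cos]] to a point.

    The sign convention is an assumption made for this sketch.
    """
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x + s * y, -s * x + c * y)

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def ml_detect(v, constellation):
    """Minimum-distance decision: index of the constellation point closest to v."""
    return min(range(len(constellation)), key=lambda i: sq_dist(v, constellation[i]))

# Hypothetical QPSK constellation and a 45-degree rotation.
qpsk = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
theta = math.pi / 4
rotated = [rotate(x, theta) for x in qpsk]

v = (0.3, -1.7)            # an arbitrary received vector (assumed value)
v_rot = rotate(v, theta)   # the same vector expressed in the rotated coordinates

dists = [sq_dist(v, x) for x in qpsk]
dists_rot = [sq_dist(v_rot, x) for x in rotated]
same_decision = ml_detect(v, qpsk) == ml_detect(v_rot, rotated)
energy = sum(sq_dist(x, (0.0, 0.0)) for x in qpsk) / len(qpsk)
energy_rot = sum(sq_dist(x, (0.0, 0.0)) for x in rotated) / len(rotated)
```

Every distance, the minimum-distance decision, and the average energy agree before and after rotation (up to floating-point rounding), which is all the ML detector on the AWGN depends on.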

Figure 1.26: Rotational invariance with AWGN. The rotated channel $\tilde{y} = Qx + n$ is equivalent to the unrotated constellation with backward-rotated noise, $y = Q^*\tilde{y} = x + \tilde{n}$, where $\tilde{n} = Q^* n$ is statistically equivalent to $n$.

Figure 1.27: QPSK rotated by 45°.

The QPSK constellation shown earlier, for which $N = 2$, provides an example. With $Q$ a 45° rotation matrix,

\[ Q = \begin{bmatrix} \cos\frac{\pi}{4} & \sin\frac{\pi}{4} \\ -\sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix} , \tag{1.90} \]

the rotated constellation and decision regions are as shown in Figure 1.27. From Figure 1.27, the rotation clearly has not changed the detection problem and has only changed the labeling of the axes, effectively giving another equivalent set of orthonormal basis functions. Since rotation does not change the squared length of any of the data symbols, the average energy remains unchanged. The invariance does depend on the noise components being uncorrelated with one another and of equal variance, as in the white-noise model assumed earlier; for other noise correlations (i.e., $n(t)$ not white, see Section 1.7) rotational invariance does not hold. Rotational invariance is summarized in Figure 1.28. Each of the three diagrams shown in Figures 1.27 and 1.28 has identical $P_e$ when used with identical AWGN.

The probability of error is also invariant to translation by a constant vector amount for the AWGN, because again $P_e$ depends only on relative distances and the noise remains unchanged.

Theorem 1.4.2 (Translational Invariance) If all the data symbols in a signal constellation are translated by a constant vector amount, that is, $\tilde{x}_i \leftarrow x_i - a$ for all $i = 0, \ldots, M-1$, then the probability of error of the ML detector remains unchanged on an AWGN channel.

Figure 1.28: Rotational invariance summary (contours of equal probability density magnitude; axes and decision regions coincident; axes, constellation and decision regions rotated).

Proof: Note that the constant vector $a$ is common to both $y$ and $x$, and thus cancels in the distance $\|(v - a) - (x_i - a)\|^2 = \|v - x_i\|^2$, so the decision rule (1.84) remains unchanged. QED.

An important use of the Theorem of Translational Invariance is the minimum energy translate of a signal constellation:

Definition 1.4.1 (Minimum Energy Translate) The minimum energy translate of a signal constellation is defined as that constellation obtained by subtracting the constant vector $E\{x\}$ from each data symbol in the constellation.

To show that the minimum energy translate has the minimum energy among all possible translations of the signal constellation, write the average energy of the translated signal constellation as

\begin{align}
E_{x-a} &= \sum_{i=0}^{M-1} \|x_i - a\|^2 \, p_x(i) \tag{1.91} \\
  &= \sum_{i=0}^{M-1} \left[ \|x_i\|^2 - 2\langle x_i, a \rangle + \|a\|^2 \right] p_x(i) = E_x + \|a\|^2 - 2\langle E\{x\}, a \rangle . \tag{1.92}
\end{align}

Completing the square in (1.92) gives $E_{x-a} = E_x - \|E\{x\}\|^2 + \|a - E\{x\}\|^2$, so the energy $E_{x-a}$ is minimized over all possible translates $a$ if and only if $a = E\{x\}$, and

\[ \min_a E_{x-a} = \sum_{i=0}^{M-1} \|x_i - E\{x\}\|^2 \, p_x(i) = E_x - \|E\{x\}\|^2 . \tag{1.93} \]

Thus, as transmitter energy (or power) is often a quantity to be conserved, the engineer can always translate the signal constellation by $E\{x\}$ to minimize the required energy without affecting performance. (However, there may be practical reasons, such as complexity and synchronization, for which this translation is avoided in some designs.)
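The energy saving of the minimum energy translate can be illustrated with a short numerical sketch. The four-point constellation below and the function names are hypothetical choices made only for illustration; the code simply evaluates the average-energy definition before and after translation.

```python
def mean_vector(points, probs):
    """E{x}: the probability-weighted mean of the constellation points."""
    n = len(points[0])
    return tuple(sum(p * x[k] for x, p in zip(points, probs)) for k in range(n))

def average_energy(points, probs):
    """E_x: the probability-weighted mean of the squared norms."""
    return sum(p * sum(c * c for c in x) for x, p in zip(points, probs))

def translate(points, a):
    """Subtract the constant vector a from every constellation point."""
    return [tuple(c - ak for c, ak in zip(x, a)) for x in points]

# Hypothetical offset square constellation with equally likely points.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
probs = [0.25] * 4

mean = mean_vector(points, probs)                              # E{x}
e_original = average_energy(points, probs)                     # E_x
e_min = average_energy(translate(points, mean), probs)         # translate by E{x}
e_other = average_energy(translate(points, (0.5, 1.0)), probs) # some other translate
predicted = e_original - sum(m * m for m in mean)              # E_x - ||E{x}||^2
```

For this example the translate by $E\{x\}$ halves the average energy, matches the closed-form value $E_x - \|E\{x\}\|^2$, and is smaller than the energy of the other translate tried.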

1.4.2 Union Bounding

Specific examples of calculating $P_e$ appear in the next two subsections. This subsection illustrates the calculation for binary signaling in $N$ dimensions for use in probability-of-error bounds.

Suppose a system has two signals in $N$ dimensions, as illustrated earlier for $N = 1$ dimension, with an AWGN channel. Then the probability of error for the ML detector is the probability that the component of the noise vector $n$ along the line connecting the two data symbols is greater than half the distance along this line. In this case, the noisy received vector $y$ lies in the incorrect decision region, resulting in an error. Since the noise is white Gaussian, its projection in any dimension, in particular along the line connecting the two data symbols, has variance $\sigma^2 = N_0/2$, as was discussed in the proof of Theorem 1.4.1. Thus,

\[ P_e = P\left\{ \langle n, \varphi \rangle \ge \frac{d}{2} \right\} , \tag{1.94} \]

where $\varphi$ is a unit-norm vector along the line between $x_0$ and $x_1$ and $d = \|x_0 - x_1\|$. This error probability is

\begin{align}
P_e &= \int_{d/2}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{u^2}{2\sigma^2}} \, du \nonumber \\
    &= \int_{d/(2\sigma)}^{\infty} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{u^2}{2}} \, du \nonumber \\
    &= Q\left[ \frac{d}{2\sigma} \right] . \tag{1.95}
\end{align}

The Q-function is defined in Appendix B of this chapter. As $\sigma^2 = N_0/2$, (1.95) can also be written

\[ P_e = Q\left[ \frac{d}{\sqrt{2 N_0}} \right] . \tag{1.96} \]

Minimum Distance

Every signal constellation has an important characteristic known as the minimum distance:

Definition 1.4.2 (Minimum Distance, $d_{\min}$) The minimum distance, $d_{\min}(x)$, is defined as the minimum distance between any two data symbols in a signal constellation $x = \{x_i\}_{i=0,\ldots,M-1}$. The argument $(x)$ is often dropped when the specific signal constellation is obvious from the context, thus leaving

\[ d_{\min} = \min_{i \ne j} \|x_i - x_j\| \quad \forall\, i, j . \tag{1.97} \]

Equation (1.95) is useful in the proof of the following theorem for the probability of error of an ML detector for any signal constellation with $M$ data symbols:

Theorem 1.4.3 (Union Bound) The probability of error for the ML detector on the AWGN channel, with an $M$-point signal constellation with minimum distance $d_{\min}$, is bounded by

\[ P_e \le (M-1) \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.98} \]

The proof of the Union Bound defines an error event $\varepsilon_{ij}$ as the event where the ML detector chooses $\hat{x} = x_j$ while $x_i$ is the correct transmitted data symbol. The conditional probability of error given that $x_i$ was transmitted is then

\[ P_{e/i} = P\{ \varepsilon_{i0} \cup \varepsilon_{i1} \cup \cdots \cup \varepsilon_{i,i-1} \cup \varepsilon_{i,i+1} \cup \cdots \cup \varepsilon_{i,M-1} \} = P\Big\{ \bigcup_{j=0,\, j \ne i}^{M-1} \varepsilon_{ij} \Big\} . \tag{1.99} \]

Figure 1.29: Probability of error regions.

Figure 1.30: NNUB PSK constellation.

Because the error events in (1.99) are mutually exclusive (meaning if one occurs, the others cannot), the probability of the union is the sum of the probabilities,

\[ P_{e/i} = \sum_{j=0,\, j \ne i}^{M-1} P\{\varepsilon_{ij}\} \le \sum_{j=0,\, j \ne i}^{M-1} P_2(x_i, x_j) , \tag{1.100} \]

where

\[ P_2(x_i, x_j) = P\{ y \text{ is closer to } x_j \text{ than to } x_i \} , \tag{1.101} \]

because

\[ P\{\varepsilon_{ij}\} \le P_2(x_i, x_j) . \tag{1.102} \]

As illustrated in Figure 1.29, $P\{\varepsilon_{ij}\}$ is the probability that the received vector $y$ lies in the shaded decision region for $x_j$ given that the symbol $x_i$ was transmitted. The incorrect decision region for the probability $P_2(x_i, x_j)$ includes part (shaded red in Figure 1.29) of the region for $P\{\varepsilon_{ik}\}$, which explains the inequality in Equation (1.102). Thus, the union bound overestimates $P_{e/i}$ by integrating pairwise over overlapping half-planes. Using the result in (1.95),

\[ P_2(x_i, x_j) = Q\left[ \frac{\|x_i - x_j\|}{2\sigma} \right] . \tag{1.103} \]

Figure 1.31: 8 Phase Shift Keying.

Substitution of (1.103) into (1.100) results in

\[ P_{e/i} \le \sum_{j=0,\, j \ne i}^{M-1} Q\left[ \frac{\|x_i - x_j\|}{2\sigma} \right] , \tag{1.104} \]

and thus averaging over all transmitted symbols

\[ P_e \le \sum_{i=0}^{M-1} \sum_{j=0,\, j \ne i}^{M-1} Q\left[ \frac{\|x_i - x_j\|}{2\sigma} \right] p_x(i) . \tag{1.105} \]

$Q(x)$ is monotonically decreasing in $x$, and thus since $d_{\min} \le \|x_i - x_j\|$,

\[ Q\left[ \frac{\|x_i - x_j\|}{2\sigma} \right] \le Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.106} \]

Substitution of (1.106) into (1.105), and recognizing that $d_{\min}$ is not a function of the indices $i$ or $j$, one finds the desired result

\[ P_e \le \sum_{i=0}^{M-1} (M-1) \, Q\left[ \frac{d_{\min}}{2\sigma} \right] p_x(i) = (M-1) \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.107} \]

QED.

Since the constellation contains $M$ points, the factor $M-1$ equals the maximum number of neighboring constellation points that can be at distance $d_{\min}$ from any particular constellation point.

Examples

The union bound can be tight (or exact) in some cases, but it is not always a good approximation to the actual $P_e$, especially when $M$ is large. Two examples for $M = 8$ show situations where the union bound is a poor approximation to the actual probability of error. These two examples also naturally lead to the nearest neighbor bound of the next subsection.

EXAMPLE 1.4.1 (8PSK) The constellation in Figure 1.31 is often called eight phase or 8PSK. For the maximum likelihood detector, the 8 decision regions correspond to sectors bounded by straight lines emanating from the origin, as shown in Figure 1.30. The union bound for 8PSK equals

\[ P_e \le 7 \, Q\left[ \frac{\sqrt{E_x} \sin(\frac{\pi}{8})}{\sigma} \right] , \tag{1.108} \]

and $d_{\min} = 2\sqrt{E_x} \sin(\frac{\pi}{8})$.

Figure 1.32: 8PSK $P_e$ bounding.

Figure 1.32 magnifies the detection region for one of the 8 data symbols. By symmetry the analysis would proceed identically no matter which point is chosen, so $P_{e/i} = P_e$. An error can occur if the component of the additive white Gaussian noise along either of the two directions shown is greater than $d_{\min}/2$. These two events are not mutually exclusive, although the variance of the noise along either vector (with unit vectors along each defined as $\varphi_1$ and $\varphi_2$) is $\sigma^2$. Thus,

\begin{align}
P_e &= P\left\{ \left( \langle n, \varphi_1 \rangle > \frac{d_{\min}}{2} \right) \cup \left( \langle n, \varphi_2 \rangle > \frac{d_{\min}}{2} \right) \right\} \tag{1.109} \\
  &\le P\left\{ n_1 > \frac{d_{\min}}{2} \right\} + P\left\{ n_2 > \frac{d_{\min}}{2} \right\} \tag{1.110} \\
  &= 2 \, Q\left[ \frac{d_{\min}}{2\sigma} \right] , \tag{1.111}
\end{align}

where $n_1 = \langle n, \varphi_1 \rangle$ and $n_2 = \langle n, \varphi_2 \rangle$. This is a tighter union bound on the probability of error. Also,

\[ P\left\{ n_1 > \frac{d_{\min}}{2} \right\} \le P_e , \tag{1.112} \]

yielding a lower bound on $P_e$ that is within a factor of 2 of the upper bound in (1.111), so the upper bound is tight. This bound is graphically illustrated in Figure 1.30. The bound in (1.111) overestimates $P_e$ by integrating the two half-planes, which overlap as depicted in the doubly shaded region of Figure 1.29. The lower bound of (1.112) integrates over only one half-plane, which does not completely cover the shaded region. The multiplier in front of the Q-function in (1.111) equals the number of nearest neighbors for any one data symbol in the 8PSK constellation.

The following second example illustrates problems in applying the union bound to a 2-dimensional signal constellation with 8 or more signal points on a rectangular grid (or lattice):

EXAMPLE 1.4.2 (8AMPM) Figure 1.33 illustrates an 8-point signal constellation called 8AMPM (amplitude-modulated phase modulation), or 8 Square. The union bound for $P_e$ yields

\[ P_e \le 7 \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.113} \]

By rotational invariance the rotated 8AMPM constellation shown in Figure 1.34 has the same $P_e$ as the unrotated constellation. The decision boundaries shown are pessimistic at the corners of the constellation, so the $P_e$ derived from them will be an upper bound. For notational brevity, let $Q = Q[d_{\min}/(2\sigma)]$.
The probability of a correct decision for 8AMPM is

\begin{align}
P_c &= \sum_{i=0}^{7} P_{c/i} \, p_x(i) = \sum_{i \ne 1,4} P_{c/i} \cdot \frac{1}{8} + \sum_{i = 1,4} P_{c/i} \cdot \frac{1}{8} \tag{1.114} \\
  &> \frac{6}{8} (1 - Q)(1 - 2Q) + \frac{2}{8} (1 - 2Q)^2 \tag{1.115} \\
  &= \frac{3}{4} \left( 1 - 3Q + 2Q^2 \right) + \frac{1}{4} \left( 1 - 4Q + 4Q^2 \right) \tag{1.116} \\
  &= 1 - 3.25\,Q + 2.5\,Q^2 . \tag{1.117}
\end{align}

Thus $P_e$ is upper bounded by

\[ P_e = 1 - P_c < 3.25 \, Q\left[ \frac{d_{\min}}{2\sigma} \right] , \tag{1.118} \]

which is tighter than the union bound in (1.113).

Figure 1.33: 8AMPM signal constellation.

Figure 1.34: 8AMPM rotated by 45° with decision regions.

As $M$ increases for constellations like 8AMPM, the accuracy of the union bound degrades, since the union bound calculates $P_e$ from pairwise error events and thus redundantly includes the probabilities of overlapping half-planes. It is desirable to produce a tighter bound. The multiplier on the Q-function in (1.118) is the average number of nearest neighbors (or decision boundaries), $\frac{3}{4} \cdot 3 + \frac{1}{4} \cdot 4 = 3.25$, for the constellation. This rule of thumb, the Nearest Neighbor Union Bound (NNUB), often used by practicing data transmission engineers, is formalized in the next subsection.

1.4.3 The Nearest Neighbor Union Bound

The Nearest Neighbor Union Bound (NNUB) provides a tighter bound on the probability of error for a signal constellation by lowering the multiplier of the Q-function. The factor $(M-1)$ in the original union bound is often too large for accurate performance prediction, as in the preceding subsection's two examples. The NNUB requires more computation; however, it is easily approximated.

The development of this bound uses the average number of nearest neighbors:

Definition 1.4.3 (Average Number of Nearest Neighbors) The average number of neighbors, $N_e$, for a signal constellation is defined as

\[ N_e = \sum_{i=0}^{M-1} N_i \, p_x(i) , \tag{1.119} \]

where $N_i$ is the number of neighboring constellation points of the point $x_i$, that is, the number of other signal constellation points sharing a common decision region boundary with $x_i$. Often, $N_e$ is approximated by

\[ N_e \approx \sum_{i=0}^{M-1} \tilde{N}_i \, p_x(i) , \tag{1.120} \]

where $\tilde{N}_i$ is the number of points at minimum distance from $x_i$, whence the often used name nearest neighbors. This approximation is often very tight and facilitates computation of $N_e$ when signal constellations are complicated (i.e., coding is used; see Chapters 6, 7, and 8).
Thus, $N_e$ also measures the average number of sides of the decision regions surrounding any point in the constellation. These decision boundaries can be at different distances from any given point and thus might best not be called nearest. $N_e$ is used in the following theorem:

Theorem 1.4.4 (Nearest Neighbor Union Bound) The probability of error for the ML detector on the AWGN channel, with an $M$-point signal constellation with minimum distance $d_{\min}$, is bounded by

\[ P_e \le N_e \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.121} \]

In the case that $N_e$ is approximated by counting only nearest neighbors, the NNUB becomes an approximation to the probability of symbol error, and not necessarily an upper bound.

Proof: Note that for each signal point, the distance to each decision-region boundary must be at least $d_{\min}/2$. The probability of error for point $x_i$, $P_{e/i}$, is upper bounded by the union bound as

\[ P_{e/i} \le N_i \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.122} \]

Thus,

\[ P_e = \sum_{i=0}^{M-1} P_{e/i} \, p_x(i) \le Q\left[ \frac{d_{\min}}{2\sigma} \right] \sum_{i=0}^{M-1} N_i \, p_x(i) = N_e \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.123} \]

QED.

The previous Examples 1.4.1 and 1.4.2 show that the Q-function multiplier in each case is exactly $N_e$ for that constellation.

As signal set design becomes more complicated in Chapters 7 and 8, the number of nearest neighbors is commonly taken as only those neighbors that are also at minimum distance, and $N_e$ is then approximated by (1.120). With this approximation, the $P_e$ expression in the NNUB consequently becomes only an approximation rather than a strict upper bound.

1.4.4 Alternative Performance Measures

The optimum receiver design minimizes the symbol error probability $P_e$. Other closely related measures of performance can also be used.

An important measure used in practical system design is the Bit Error Rate. Most digital communication systems encode the message set $\{m_i\}$ into bits. Thus engineers are interested in the average number of bit errors expected. The bit error rate depends on the specific binary labeling applied to the signal points in the constellation. The quantity $n_b(i,j)$ denotes the number of bit errors corresponding to a symbol error when the detector incorrectly chooses $m_j$ instead of $m_i$, while $P\{\varepsilon_{ij}\}$ denotes the probability of this symbol error.

Definition 1.4.4 (Bit Error Rate) The bit error rate is

\[ P_b = \sum_{i=0}^{M-1} \sum_{j \ne i} p_x(i) \, P\{\varepsilon_{ij}\} \, n_b(i,j) , \tag{1.124} \]

where $n_b(i,j)$ is the number of bit errors for the particular choice of encoder when symbol $i$ is erroneously detected as symbol $j$. This quantity, despite the label using $P$, is not strictly a probability.
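The bit error rate will always be approximated for the AWGN in this text by restricting the inner sum to the $N_i$ neighbors of each point:

\begin{align}
P_b &\approx \sum_{i=0}^{M-1} \sum_{j=1}^{N_i} p_x(i) \, P\{\varepsilon_{ij}\} \, n_b(i,j) \tag{1.125} \\
  &\le Q\left[ \frac{d_{\min}}{2\sigma} \right] \sum_{i=0}^{M-1} p_x(i) \sum_{j=1}^{N_i} n_b(i,j) , \nonumber
\end{align}

so that

\[ P_b \lesssim Q\left[ \frac{d_{\min}}{2\sigma} \right] \sum_{i=0}^{M-1} p_x(i) \, n_b(i) = N_b \, Q\left[ \frac{d_{\min}}{2\sigma} \right] , \tag{1.126} \]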

where

\[ n_b(i) = \sum_{j=1}^{N_i} n_b(i,j) , \tag{1.127} \]

and the Average Total Bit Errors per Error Event, $N_b$, is defined as

\[ N_b = \sum_{i=0}^{M-1} p_x(i) \, n_b(i) . \tag{1.128} \]

An expression similar to the NNUB for $P_b$ is

\[ P_b \approx N_b \, Q\left[ \frac{d_{\min}}{2\sigma} \right] , \tag{1.129} \]

where the approximation comes from Equation (1.125), which is an approximation because of the reduction in the number of terms included in the sum over other points. The accuracy of this approximation is good as long as the terms corresponding to distant neighbors are small in comparison to those of the nearest neighbors, which is a reasonable assumption for good constellation designs.

The bit error rate is sometimes a more uniform measure of performance because it is independent of $M$ and $N$. On the other hand, $P_e$ is a block error probability (with block length $N$) and can correspond to more than one bit in error (if $M > 2$) over $N$ dimensions. Both $P_e$ and $P_b$ depend on the same distance-to-noise ratio (the argument of the Q-function).

While the notation for $P_b$ is commonly expressed with a $P$, the bit error rate is not a probability and could exceed unity in value in aberrant cases. A better measure, which is a probability, normalizes the bit error rate by the number of bits per symbol. Normalization of $P_b$ produces a probability measure because it is the average number of bit errors divided by the number of bits over which those errors occur; this probability is the desired probability of bit error:

Lemma 1.4.1 (Probability of bit error $\bar{P}_b$) The probability of bit error is defined by

\[ \bar{P}_b = \frac{P_b}{b} . \tag{1.130} \]

The corresponding average total number of bit errors per bit is

\[ \bar{N}_b = \frac{N_b}{b} . \tag{1.131} \]

The bit error rate can exceed one, but the probability of bit error never exceeds one.
Furthermore, comparison of values of $P_e$ between systems of different dimensionality is not fair. For instance, compare a 2B1Q system operating at $P_e = 10^{-7}$ against a multi-dimensional design consisting of 10 successive 2B1Q dimensions decoded jointly as a single symbol, also with $P_e = 10^{-7}$: the latter system really has $10^{-8}$ errors per dimension and so is better. A fairer measure of symbol error probability normalizes by the dimensionality (or number of bits per symbol) of the system, so that systems with different block lengths can be compared.

Definition 1.4.5 (Normalized Error Probability $\bar{P}_e$) The normalized error probability is defined by

\[ \bar{P}_e = \frac{P_e}{N} . \tag{1.132} \]

The normalized average number of nearest neighbors is:

Definition 1.4.6 (Normalized Number of Nearest Neighbors) The normalized number of nearest neighbors, $\bar{N}_e$, for a signal constellation is defined as

\[ \bar{N}_e = \sum_{i=0}^{M-1} \frac{N_i}{N} \, p_x(i) = \frac{N_e}{N} . \tag{1.133} \]

Thus, the NNUB is

\[ \bar{P}_e \le \bar{N}_e \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.134} \]
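The gap between the union bound and the NNUB can be seen numerically. The sketch below is illustrative only: it builds an 8PSK constellation with an assumed energy $E_x = 1$ and an assumed noise standard deviation $\sigma = 0.1$, approximates $N_e$ by counting points at minimum distance, and evaluates both bounds. The function names are hypothetical, and the Q-function is computed from the standard identity $Q(x) = \frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$.

```python
import math
from itertools import combinations

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def psk_constellation(m, energy):
    """M equally spaced points on a circle of radius sqrt(energy)."""
    radius = math.sqrt(energy)
    return [(radius * math.cos(2.0 * math.pi * k / m),
             radius * math.sin(2.0 * math.pi * k / m)) for k in range(m)]

def min_distance(points):
    """d_min: the smallest distance between any two distinct points."""
    return min(math.dist(a, b) for a, b in combinations(points, 2))

def avg_nearest_neighbors(points, probs, rel_tol=1e-9):
    """Approximate N_e by counting, for each point, the points at distance d_min."""
    d_min = min_distance(points)
    total = 0.0
    for i, (x_i, p_i) in enumerate(zip(points, probs)):
        count = sum(1 for j, x_j in enumerate(points)
                    if j != i and math.dist(x_i, x_j) <= d_min * (1.0 + rel_tol))
        total += p_i * count
    return total

def union_bound(points, sigma):
    """(M - 1) * Q[d_min / (2 sigma)]."""
    return (len(points) - 1) * q_function(min_distance(points) / (2.0 * sigma))

def nnub(points, probs, sigma):
    """N_e * Q[d_min / (2 sigma)], with N_e from the nearest-neighbor count."""
    return avg_nearest_neighbors(points, probs) * q_function(
        min_distance(points) / (2.0 * sigma))

# Assumed operating point: unit-energy 8PSK, sigma = 0.1, equally likely symbols.
points = psk_constellation(8, energy=1.0)
probs = [1.0 / 8] * 8
sigma = 0.1

d_min = min_distance(points)
n_e = avg_nearest_neighbors(points, probs)
ub = union_bound(points, sigma)
nn = nnub(points, probs, sigma)
lower = q_function(d_min / (2.0 * sigma))   # single half-plane lower bound
nn_per_dim = nn / 2                         # normalized by N = 2 dimensions
```

For 8PSK every point has two nearest neighbors, so the NNUB multiplier is 2 instead of the union bound's 7, and the true $P_e$ is bracketed between `lower` and `nn`.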

EXAMPLE 1.4.3 (8AMPM) The average number of bit errors per error event for 8AMPM, using the octal labeling indicated by the subscripts in Figure 1.33, is computed by summing the per-symbol totals $n_b(i)$ over the eight points:

\[ N_b = \sum_{i=0}^{7} \frac{1}{8} \, n_b(i) = \frac{44}{8} = 5.5 . \tag{1.138} \]

Then

\[ P_b \approx 5.5 \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.139} \]

Also,

\[ \bar{N}_e = \frac{3.25}{2} = 1.625 , \tag{1.140} \]

so that

\[ \bar{P}_e \le 1.625 \, Q\left[ \frac{d_{\min}}{2\sigma} \right] , \tag{1.141} \]

and

\[ \bar{P}_b \approx \frac{5.5}{3} \, Q\left[ \frac{d_{\min}}{2\sigma} \right] . \tag{1.142} \]

Thus the probability of bit error is somewhat higher than the normalized symbol error rate. Careful assignment of bits to symbols can reduce the bit error rate slightly.

1.4.5 Block Error Measures

Higher-level engineering of communication systems may desire knowledge of message errors within packets of several messages cascaded into a larger message. An entire packet may be somewhat useless if any part of it is in error. Thus, the concept of a symbol from this perspective of analysis may be the entire packet of messages. The probability of block or packet error is truly identical to the probability of symbol error already analyzed, as long as the entire packet is considered as a single symbol.

If a packet contains $B$ bits, each of which independently has probability of bit error $\bar{P}_b$, then the probability of packet (or block) error is often approximated by $B \cdot \bar{P}_b$. For example, if the probability of packet error is $10^{-7}$ and there are 125 bytes per packet, or 1000 bits per packet, then the probability of bit error would be $10^{-10}$. Low packet error rate is thus a more stringent criterion on the detector performance than is low probability of bit error. Nonetheless, analysis can proceed exactly as in this section. As $B$ increases, the approximation $P_e = B \cdot \bar{P}_b$ can become inaccurate, as shown below.

An errored second is often used in telecommunications as a measure of performance. An errored second is any second in which any bit error occurs. Obviously, fewer errored seconds are better.
A given fixed number of errored seconds translates into an increasingly lower probability of bit error as the data rate of the channel increases. An error-free second is a second in which no error occurs. If a second contains $B$ independent bits, then the exact probability of an error-free second is

\[ P_{efs} = (1 - \bar{P}_b)^B , \tag{1.143} \]

while the exact probability of an errored second is

\[ P_e = 1 - P_{efs} = \sum_{i=1}^{B} \binom{B}{i} (1 - \bar{P}_b)^{B-i} \, \bar{P}_b^{\,i} . \tag{1.144} \]

Dependency between bits and bit errors will change the exact nature of the above formulae, but is usually ignored in calculations. More common in telecommunications is the derived concept of percentage error-free seconds, which is the percentage of seconds that are error free. Thus, if a detector has $\bar{P}_b = 10^{-7}$ and the data rate is 10 Mbps, then one might naively guess from $P_e = B \cdot \bar{P}_b = 1$ that almost every second contains errors, so that the percentage of error-free seconds is very low. To be exact, $P_{efs} = (1 - 10^{-7})^{10^7} = 0.368$, so the link has 36.8% error-free seconds, and about 63% of the seconds have errors.

Typically, large telecommunications networks strive for five nines reliability, which translates into 99.999% error-free seconds. At 10 Mbps, this means that the detector has $\bar{P}_b = 1 - e^{10^{-7} \ln(0.99999)} \approx 1.0 \times 10^{-12}$. At lower data rates, five nines is less stringent on the channel error probability. Data networks today are often designed for higher bit error rates and operate at 10 Mbps with external error detection and retransmission protocols. Retransmission may not be acceptable for continuous signals like voice or video, so five nines reliability is often not possible on the data network. This has become a key issue in the convergence of telecommunications networks (designed with five nines reliability, normally for voice transmission, over the last 5 decades) and data networks (designed with higher error rates for data transmission over the last 3 decades). (Often, though, the data network's probability of error is much better than the specification, so systems may work fine without the designers truly understanding why. However, voice (VoIP) and video (IPTV) signals on data networks often exhibit quality issues, which result from the packet error rate being too high and from the failure of retransmission approaches to restore the quality.)
In any case, the probabilities of symbol and bit error are fundamental to all other measures of network performance and can be used by the serious communication engineer to evaluate carefully a system's performance.
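The error-free-second arithmetic above is easy to reproduce. The following minimal sketch assumes independent bit errors, as the text does; the function names are hypothetical. It uses `math.log1p` and `math.expm1` so that probabilities very close to 0 or 1 are not lost to floating-point rounding.

```python
import math

def prob_error_free_second(p_bit, bits_per_second):
    """Exact P_efs = (1 - p_bit)^B for B independent bits, computed via logarithms."""
    return math.exp(bits_per_second * math.log1p(-p_bit))

def required_bit_error_prob(target_efs, bits_per_second):
    """Invert P_efs = (1 - p)^B for p: p = 1 - exp(ln(target_efs) / B)."""
    return -math.expm1(math.log(target_efs) / bits_per_second)

B = 10_000_000                                        # 10 Mbps for one second
naive_estimate = B * 1e-7                             # the B * P_b approximation
p_efs = prob_error_free_second(1e-7, B)               # close to exp(-1)
p_errored = 1.0 - p_efs                               # fraction of errored seconds
p_five_nines = required_bit_error_prob(0.99999, B)    # bit error probability needed
```

The naive estimate $B \cdot \bar{P}_b$ equals 1 here, while the exact probability of an errored second is about 0.63, which illustrates how the linear approximation breaks down once $B \cdot \bar{P}_b$ is no longer small.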

1.5 General Classes of Constellations and Modulation

This section describes three classes of modulation that abound in digital data transmission. Each of these three classes represents a different geometric approach to constellation construction. Three successive subsections examine the choice of basis functions for each modulation class and develop corresponding general expressions for the average probability of error $P_e$ for these modulation classes when used on the AWGN channel. Subsection 1.5.2 discusses cubic constellations (Section 1.6 also investigates some important extensions to the cubic constellations). Subsection 1.5.3 examines orthogonal constellations, while Subsection 1.5.4 studies circular constellations.

To compare constellations, some constellation measures are developed first. The cost of modulation depends upon transmitted power (energy per unit time). A unit of time translates to a number of dimensions, given a certain system bandwidth, so the energy per dimension is essentially a measure of power. Given a wider bandwidth, the same unit of time and power will correspond to proportionately more dimensions, but a lower power spectral density. While somewhat loosely defined, a system with symbol period $T$ and bandwidth [3] $W$ has a number of dimensions available for signal constellation construction

\[ N = 2WT \ \text{dimensions}. \tag{1.145} \]

The reasons for this approximation will become increasingly apparent, but all the methods of this section will follow this simple rule when the reasonable and obvious definition of bandwidth is applied. Systems in practice all follow this rule (or have fewer dimensions than this practical maximum), even though it may be theoretically possible to construct signal sets with slightly more dimensions. The number of dimensions in any case is a combined measure of the system resources of bandwidth and time; performance measures and energy are thus often normalized by $N$ for fair comparison.
The data rate concept thus generalizes to the number of bits per dimension:

Definition 1.5.1 (Average Number of Bits Per Dimension) The average number of bits per dimension, $\bar{b}$, for a signal constellation $x$, is

\[ \bar{b} = \frac{b}{N} . \tag{1.146} \]

The related quantity, data rate, is

\[ R = \frac{b}{T} . \tag{1.147} \]

Using (1.145), one can compute that

\[ 2\bar{b} = \frac{R}{W} , \tag{1.148} \]

the spectral efficiency of a modulation method, which is often used by transmission engineers to describe an efficiency of transmission (how much data rate per unit of bandwidth). Spectral efficiency is often described in terms of the unit bits/second/Hz, which is really a measure of double the number of bits/dimension. Engineers often abbreviate the term bits/second/Hz to bits/Hz, which is an (unfortunately) often used and confusing term because the units are incorrect. Nonetheless, experienced engineers automatically translate the verbal abbreviation bits/Hz to the correct units and interpretation, bits-per-second/Hz, or simply double the number of bits/dimension.

The concept of power also generalizes to energy per dimension:

Definition 1.5.2 (Average Energy Per Dimension) The average energy per dimension, $\bar{E}_x$, for a signal constellation $x$, is

\[ \bar{E}_x = \frac{E_x}{N} . \tag{1.149} \]

A related quantity is the average power,

\[ P_x = \frac{E_x}{T} . \tag{1.150} \]

[3] It is theoretically not possible to have finite bandwidth and finite time extent, but in practice this can be approximated closely.

Clearly $N$ cannot exceed the actual number of dimensions in the constellation, but the constellation may require fewer dimensions for a complete representation. For example, a two-dimensional constellation shown in an earlier figure can be described using only one basis vector simply by rotating the constellation by 45 degrees. The average power, which was also defined earlier, is a scaled quantity, but consistently defined for all constellations. In particular, the normalization of basis functions often absorbs gain into the signal constellation definition that may tacitly conceal complicated calculations based on transmission-channel impedance, load matching, and various non-trivially calculated analog effects. These effects can also be absorbed into bandlimited channel models, as is done in later chapters.

The energy per dimension allows the comparison of constellations with different dimensionality. The smaller the $\bar{E}_x$ for a given $\bar{P}_e$ and $\bar{b}$, the better the design. The concatenation of two successively transmitted $N$-dimensional signals, taken from the same $N$-dimensional signal constellation, as a single $2N$-dimensional signal causes the resulting $2N$-dimensional constellation, formed as a Cartesian product of the constituent $N$-dimensional constellations, to have the same average energy per dimension as the $N$-dimensional constellation. Thus, simple concatenation of a signal set with itself does not improve the design. However, careful packing of signals in signal sets of increasingly larger dimension can lead to a reduction in the energy per dimension required to transmit a given set of messages, which will be of interest in this section and throughout this text.

The average power is the usual measure of energy per unit time and is useful when sizing the power requirements of a modulator or in determining scale constants for analog filter/driver circuits in the actual implementation. The power can be set equal to the square of the voltage over the load resistance.
The noise energy per dimension for an $N$-dimensional AWGN channel is

\[ \bar{\sigma}^2 = \frac{\sum_{l=1}^{N} \sigma^2}{N} = \sigma^2 = \frac{N_0}{2} . \tag{1.151} \]

While AWGN is inherently infinite dimensional, by the theorem of irrelevance, a computation of probability of error need only consider the noise components in the $N$ dimensions of the signal constellation. For AWGN channels, the signal-to-noise ratio (SNR) is used often by this text to characterize the channel:

Definition 1.5.3 (SNR) The SNR is

\[ \text{SNR} = \frac{\bar{E}_x}{\sigma^2} . \tag{1.152} \]

As shown in Section 1.4, the performance of a constellation in the presence of AWGN depends on the minimum distance between any two vectors in the constellation. Increasing the distance between points in a particular constellation increases the average energy per dimension of the constellation. The Constellation Figure of Merit [4] combines the energy per dimension and the minimum distance measures:

Definition 1.5.4 (Constellation Figure of Merit - CFM) The constellation figure of merit, $\zeta_x$, for a signal constellation $x$, is

\[ \zeta_x = \frac{\left( \frac{d_{\min}}{2} \right)^2}{\bar{E}_x} , \tag{1.153} \]

a unit-less quantity, defined only when $\bar{b} \ge 1$.

The CFM $\zeta_x$ measures the quality of any constellation used with an AWGN channel. A higher CFM $\zeta_x$ generally results in better performance. The CFM should only be used to compare systems with equal numbers of bits per dimension $\bar{b} = b/N$, but can be used to compare systems of different dimensionality.

A different measure, known as the energy per bit, measures performance in systems with low average bit rate of $\bar{b} \le 1$ (see Chapter 10).

[4] G. D. Forney, Jr., 8/89 IEEE Journal on Selected Areas in Communication.

Definition 1.5.5 (Energy Per Bit) The energy per bit, $E_b$, in a signal constellation $\{x\}$ is

\[ E_b = \frac{E_x}{b} = \frac{\bar{E}_x}{\bar{b}} . \tag{1.154} \]

This measure is only defined when $\bar{b} \le 1$ and has no meaning in other contexts.

Definition 1.5.6 (Margin) The margin of a transmission system is the amount by which the argument of the Q-function can be reduced while retaining the probability of error below a specified maximum that is associated with the margin.

Margins are often quoted in transmission design, as they give a level of confidence to designers that unforeseen noise increases or signal attenuation will not cause the system performance to become unacceptable.

EXAMPLE 1.5.1 (Margin in DSL) Digital Subscriber Line systems deliver 100's of kilobits to 10's of megabits of data over telephone lines and use sophisticated adaptive modulation systems described later in Chapters 4 and 5. The two modems are located at the ends of the telephone line, at the telephone-company central office and at the customer's premises. Despite this sophistication, they ultimately also have a probability of error specified by a relation of the form $N_e Q(d_{\min}/2\sigma)$. Because noise sources can be unpredictable on telephone lines, which tend to sense everything from other phone lines' signals to radio signals to refrigerator doors and fluorescent and other lights, and because additional customer-location wiring to the modem can be poor grade or long, a margin of at least 6 dB is mandated at the data rate of service offered if the customer is to be allowed service. This 6 dB essentially allows performance to be degraded by a combined factor of 4 in increased noise before costly manual maintenance or repair service would be necessary.

1.5.1 Fair Comparisons

A fair comparison of two transmission systems requires consideration of the following 5 parameters:

1. data rate $R = b/T$,
2. power $E_x/T$,
3. total bandwidth $W$,
4. total time or symbol period $T$, and
5. probability of error $\bar{P}_b$ (or $P_e$).

For MIMO systems, the number of parallel channels should also be held the same.
Any comparison thus holds 4 of the 5 parameters constant while varying the 5th. However, a simplification can be achieved with dimensionality as the normalizer instead of $W$ and $T$. In this case, a fair comparison uses

1. bits per dimension $\bar{b}$,
2. energy per dimension $\bar{E}_x$, and
3. probability of error per dimension $\bar{P}_e$.

Any two of these 3 can be held constant and the 3rd compared. Transmission history is replete with examples of engineers who should have known better not keeping 4 of the 5 parameters, or 2 of the simpler 3, constant before comparing the remaining one. This is one of the advantages of the concept of normalizing to a number of dimensions. The constellation figure of merit presumes $\bar{b}$ fixed and then looks at the ratio of $d_{\min}^2$ to $\bar{E}_x$, essentially holding $\bar{E}_x$ fixed and looking at $\bar{P}_e$ (equivalent to $d_{\min}$ if nearest neighbors are ignored on the AWGN). The normalization essentially prevents an excess of symbol period or bandwidth from letting one modulation method look better than another, tacitly including the third and fourth parameters (bandwidth and symbol period) of the list of parameters in a comparison.
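The normalized measures used for fair comparison can be computed directly from a list of constellation points. The sketch below is illustrative: the function names and the two example constellations (binary antipodal points at $\pm 1$ and their two-dimensional Cartesian product) are assumptions chosen to show that concatenating a constellation with itself leaves $\bar{b}$, $\bar{E}_x$, and the constellation figure of merit unchanged. It also converts a margin in dB to a power ratio.

```python
import math
from itertools import combinations

def constellation_measures(points, probs=None):
    """Return bits/dimension, energy/dimension, d_min and CFM for a constellation."""
    m = len(points)
    n = len(points[0])                      # number of dimensions N
    if probs is None:
        probs = [1.0 / m] * m               # equally likely symbols by default
    bits = math.log2(m)                     # b = log2(M)
    energy = sum(p * sum(c * c for c in x) for x, p in zip(points, probs))
    d_min = min(math.dist(a, b) for a, b in combinations(points, 2))
    e_bar = energy / n                      # average energy per dimension
    return {
        "b_bar": bits / n,                  # bits per dimension
        "e_bar": e_bar,
        "d_min": d_min,
        "cfm": (d_min / 2.0) ** 2 / e_bar,  # (d_min / 2)^2 / E_bar
    }

def db_to_ratio(db):
    """Convert a power ratio expressed in decibels to a linear ratio."""
    return 10.0 ** (db / 10.0)

# Hypothetical example constellations.
antipodal = [(1.0,), (-1.0,)]                                   # N = 1
square = [(a, b) for a in (1.0, -1.0) for b in (1.0, -1.0)]     # N = 2 product

m1 = constellation_measures(antipodal)
m2 = constellation_measures(square)
margin_factor = db_to_ratio(6.0)            # the 6 dB margin as a linear factor
```

Both constellations have one bit per dimension, unit energy per dimension, and a figure of merit of 1, so on a per-dimension basis they are the same design; a 6 dB margin corresponds to a factor of very nearly 4.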


[Figure 1.35: Cubic constellations for N = 1, 2, and 3.]

1.5.2 Cubic Constellations

Cubic constellations are commonly used on simple data communication channels. Some examples of cubic signal constellations are shown in Figure 1.35 for N = 1, 2, and 3. The construction of a cubic constellation directly maps a sequence of $N = b$ bits into the components of the basis vectors in a corresponding N-dimensional signal constellation. For example, the bit stream may be grouped into a sequence of two-bit vectors, one vector per two-dimensional symbol. The resulting constellation is uniformly scaled in all dimensions, and may be translated or rotated in the N-dimensional space it occupies.

The simplest cubic constellation appears in Figure 1.35, where $N = b = \bar{b} = 1$. This constellation is known as binary signaling, since only two possible signals are transmitted using one basis function $\varphi_1(t)$. Several examples of binary signaling are described next.

Binary Antipodal Signaling

In binary antipodal signaling, the two possible values for $x = x_1$ are equal in magnitude but opposite in sign, e.g. $x_1 = \pm d/2$. As for all binary signaling methods, the average probability of error is

    $P_e = P_b = Q\left[\frac{d_{min}}{2\sigma}\right]$.    (1.155)

The CFM for binary antipodal signaling equals $\zeta_x = (d/2)^2 / [(d/2)^2] = 1$.

Particular types of binary antipodal signaling differ only in the choice of the basis function $\varphi_1(t)$. In practice, these basis functions may include Nyquist pulse shaping waveforms to avoid intersymbol interference. Chapter 3 further discusses waveform shaping. Besides the time-domain shaping, the basis function $\varphi_1(t)$ shapes the power spectral density of the resultant modulated waveform. Thus, different basis functions may require different bandwidths.

Binary Phase Shift Keying

Binary Phase Shift Keying (BPSK) uses a sinusoid to modulate the sequence of data symbols $\{\pm\sqrt{E_x}\}$:

    $\varphi_1(t) = \begin{cases} \sqrt{\frac{2}{T}} \sin\frac{2\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.156)

This representation uses the minimum number of basis functions, $N = 1$, to represent BPSK, rather than $N = 2$ as in the earlier example.

Bipolar (NRZ) Transmission

Bipolar signaling, also known as baseband binary or Non-Return-to-Zero (NRZ) signaling, uses a square pulse to modulate the sequence of data symbols $\{\pm\sqrt{E_x}\}$:

    $\varphi_1(t) = \begin{cases} \frac{1}{\sqrt{T}} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.157)

Manchester Coding (Bi-Phase Level)

Manchester Coding, also known as biphase level (BPL) or, in magnetic and optical recording, as frequency modulation, uses a sequence of two opposite-phase square pulses to modulate each data symbol. In NRZ signaling, long runs of the same bit result in a constant output signal with no transitions until the bit changes. Since timing recovery circuits usually require some transitions, Manchester or BPL guarantees that a transition occurs in the middle of each bit (or symbol) period. The basis function is:

    $\varphi_1(t) = \begin{cases} \frac{1}{\sqrt{T}} & 0 \le t < T/2 \\ -\frac{1}{\sqrt{T}} & T/2 \le t < T \\ 0 & \text{elsewhere} \end{cases}$    (1.158)

The power spectral density of the modulated signal is related to the Fourier transform $\Phi_1(f)$ of the pulse $\varphi_1(t)$. The Fourier transform of the NRZ square pulse is a sinc function with zero crossings spaced at $1/T$ Hz. The basis function for BPL in Equation (1.158) requires approximately twice the bandwidth of the basis function for NRZ in Equation (1.157), because the biphase pulse is built from two half-width pulses, whose sinc transform has zero crossings spaced at $2/T$ Hz. Similarly, BPSK requires double the bandwidth of NRZ. Both BPSK and BPL are referred to as rate 1/2 transmission schemes, because for the same bandwidth they permit only half the transmission rate of NRZ.

On-Off Keying (OOK)

On-Off Keying, used in direct-detection optical data transmission as well as in gate-to-gate transmission in most digital circuits, uses the same basis function as bipolar transmission:

    $\varphi_1(t) = \begin{cases} \frac{1}{\sqrt{T}} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.159)

Unlike bipolar transmission, however, one of the levels for $x_1$ is zero, while the other is nonzero ($\sqrt{2E_x}$). Because of the asymmetry, this method includes a DC offset, i.e.
a nonzero mean value. The CFM is $\zeta_x = 0.5$, and thus OOK is 3 dB inferior to any type of binary antipodal transmission. The comparison between signal constellations is $10\log_{10}[\zeta_{x,OOK}/\zeta_{x,NRZ}] = 10\log_{10}(0.5) = -3$ dB. As for any binary signaling method, OOK has

    $P_e = P_b = Q\left[\frac{d_{min}}{2\sigma}\right]$.    (1.160)

Vertices of a Hypercube (Block Binary)

Binary signaling in one dimension generalizes to the corners of a hypercube in N dimensions, hence the name cubic constellations. The hypercubic constellations all transmit an average of $\bar{b} = 1$ bit per dimension. For two dimensions, the most common signal set is QPSK.
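The 3 dB gap between binary antipodal signaling and OOK can be checked numerically. The sketch below assumes the CFM is computed as $(d_{min}/2)^2/\bar{E}_x$, as the binary antipodal calculation above implies; the energy and noise values are arbitrary illustrative choices.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binary_pe(d_min: float, sigma: float) -> float:
    """Error probability of any binary constellation on the AWGN channel."""
    return q_function(d_min / (2.0 * sigma))


def cfm(d_min: float, energy_per_dim: float) -> float:
    """Constellation figure of merit, assumed here to be (d_min/2)^2 / energy per dimension."""
    return (d_min / 2.0) ** 2 / energy_per_dim


ex = 1.0      # average energy, same for both constellations (assumed value)
sigma = 0.25  # noise standard deviation per dimension (assumed value)

d_antipodal = 2.0 * math.sqrt(ex)  # points at +sqrt(Ex) and -sqrt(Ex)
d_ook = math.sqrt(2.0 * ex)        # points at 0 and sqrt(2 Ex)

cfm_antipodal = cfm(d_antipodal, ex)
cfm_ook = cfm(d_ook, ex)
loss_db = 10.0 * math.log10(cfm_ook / cfm_antipodal)

print(f"CFM antipodal = {cfm_antipodal:.2f}, CFM OOK = {cfm_ook:.2f}, loss = {loss_db:.2f} dB")
print(f"Pe antipodal = {binary_pe(d_antipodal, sigma):.3e}, Pe OOK = {binary_pe(d_ook, sigma):.3e}")
```

Both constellations use one dimension and the same average energy, so the comparison holds two of the three normalized parameters fixed.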

Quadrature Phase Shift Keying (QPSK)

The basis functions for the two-dimensional QPSK constellation are

    $\varphi_1(t) = \begin{cases} \sqrt{\frac{2}{T}} \cos\frac{2\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.161)

    $\varphi_2(t) = \begin{cases} \sqrt{\frac{2}{T}} \sin\frac{2\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere.} \end{cases}$    (1.162)

The transmitted signal is a linear combination of both an inphase (cos) component and a quadrature (sin) component. The four possible data symbols are

    $[x_1 \; x_2] \in \left\{ \sqrt{\tfrac{E_x}{2}}[-1 \; -1],\; \sqrt{\tfrac{E_x}{2}}[-1 \; +1],\; \sqrt{\tfrac{E_x}{2}}[+1 \; -1],\; \sqrt{\tfrac{E_x}{2}}[+1 \; +1] \right\}$.    (1.163)

The additional basis function does not require any extra bandwidth with respect to BPSK, and the average energy $E_x$ remains unchanged. While the squared minimum distance $d_{min}^2$ has decreased by a factor of two, the number of dimensions has doubled, thus the CFM for QPSK is $\zeta_x = 1$ again, as with BPSK.

For performance evaluation, it is easier to compute the average probability of a correct decision $P_c$ rather than $P_e$ for maximum likelihood detection on the AWGN channel with equally probable signals. By symmetry of the signal constellation, $P_{c/i}$ is identical for $i = 0, ..., 3$:

    $P_c = \sum_{i=0}^{3} P_{c/i}\, p_x(i) = P_{c/i}$
    $\quad = \left(1 - Q\left[\frac{d_{min}}{2\sigma}\right]\right)\left(1 - Q\left[\frac{d_{min}}{2\sigma}\right]\right)$    (1.164)
    $\quad = 1 - 2Q\left[\frac{d_{min}}{2\sigma}\right] + \left(Q\left[\frac{d_{min}}{2\sigma}\right]\right)^2$.    (1.165)

To prove the step from the first to the second line, note that the noise in the two dimensions is independent. The probability of a correct decision requires both noise components to fall within the decision region, which gives the product in (1.164). Thus

    $P_e = 1 - P_c$    (1.166)
    $\quad = 2Q\left[\frac{d_{min}}{2\sigma}\right] - \left(Q\left[\frac{d_{min}}{2\sigma}\right]\right)^2 < 2Q\left[\frac{d_{min}}{2\sigma}\right]$,    (1.167)

where $d_{min} = \sqrt{2E_x} = 2\bar{E}_x^{1/2}$. For reasonable error rates, the $\left(Q[d_{min}/2\sigma]\right)^2$ term in (1.167) is negligible, and the bound on the right, which is also the NNUB, is tight. With a reasonable mapping of bits to data symbols (e.g. the Gray code $0 \to -1$ and $1 \to +1$), the probability of a bit error is $P_b = \bar{P}_e$ for QPSK. $P_e$ for QPSK is twice $P_e$ for BPSK, but $\bar{P}_e$ is the same for both systems. Comparing $\bar{P}_e$ is usually more informative.

Block Binary

For hypercubic signal constellations in three or more dimensions, $N \ge 3$, the signal points are the vertices of a hypercube centered on the origin.
In this case, the probability of error generalizes to

    $P_e = 1 - \left(1 - Q\left[\frac{d_{min}}{2\sigma}\right]\right)^N < N\, Q\left[\frac{d_{min}}{2\sigma}\right]$,    (1.168)

where $d_{min} = 2\bar{E}_x^{1/2}$. The basis functions are usually given by $\varphi_n(t) = \varphi(t - nT)$, where $\varphi(t)$ is the square pulse given in (1.157). The transmission of one symbol with the hypercubic constellation requires a time interval of length $NT$. Alternatively, scaling of the basis functions in time can retain a symbol period of length $T$, but the narrower pulse will require N times the bandwidth of the $T$-width pulses. For this case again $\zeta_x = 1$. As $N \to \infty$, $P_e \to 1$: while the probability of any single dimension being correct remains constant and less than one, the probability of all dimensions being correct decreases as N increases. Ignoring the higher-order terms $Q^i$, $i \ge 2$, the average probability of error per dimension is approximately $\bar{P}_e \approx Q(d_{min}/(2\sigma))$, which equals $\bar{P}_e$ for binary antipodal signaling. This example illustrates that increasing dimensionality does not always reduce the probability of error unless the signal constellation has been carefully designed. As block binary constellations are just a concatenation of several binary transmissions, the receiver can equivalently decode each of the independent dimensions separately. However, with a careful selection of the transmitted signal constellation, it is possible to drive the probability of both a message error $P_e$ and a bit error $P_b$ to zero with increasing dimensionality N, as long as the average number of transmitted bits per unit time does not exceed a fundamental rate known as the capacity of the communication channel (Chapter 8).

1.5.3 Orthogonal Constellations

In orthogonal signal sets, the dimensionality increases linearly with the number of points $M = \alpha N$ in the signal constellation, which results in a decrease in the number of bits per dimension $\bar{b} = \frac{\log_2(M)}{N} = \frac{\log_2(\alpha N)}{N}$.

[Figure 1.36: Block orthogonal constellations for N = 2 and 3.]

Block Orthogonal

Block orthogonal signal constellations have a dimension, or basis function, for each signal point.
The block orthogonal signal set thus consists of $M = N$ orthogonal signals $x_i(t)$, that is

    $\langle x_i(t), x_j(t) \rangle = E_x \delta_{ij}$.    (1.169)

Block orthogonal signal constellations appear in Figure 1.36 for N = 2 and 3. The signal constellation vectors are, in general,

    $x_i = [0 \; \ldots \; 0 \; \sqrt{E_x} \; 0 \; \ldots \; 0] = \sqrt{E_x}\, \varphi_{i+1}$.    (1.170)

The CFM should not be used on block orthogonal signal sets because $\bar{b} < 1$.
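Returning briefly to the block binary (hypercube) result, the exact expression $1 - (1 - Q)^N$ and its bound $NQ$ can be compared directly. This is a minimal sketch with assumed values of $d_{min}$ and $\sigma$; it also shows that the per-dimension error stays close to the binary antipodal value $Q(d_{min}/2\sigma)$ while the symbol error grows with N.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def block_binary_pe(n_dims: int, d_min: float, sigma: float):
    """Return (exact symbol error probability, union-style bound N*Q)
    for the vertices of an N-dimensional hypercube."""
    q = q_function(d_min / (2.0 * sigma))
    # 1 - (1 - q)^N, computed with log1p/expm1 to stay accurate for small q.
    exact = -math.expm1(n_dims * math.log1p(-q))
    bound = n_dims * q
    return exact, bound


# Assumed operating point: d_min / (2 sigma) = 3.
for n_dims in (1, 2, 3, 8):
    exact, bound = block_binary_pe(n_dims, d_min=2.0, sigma=1.0 / 3.0)
    print(f"N={n_dims}: Pe={exact:.4e}  N*Q={bound:.4e}  Pe/N={exact / n_dims:.4e}")
```

The last column barely moves with N, which is the sense in which adding dimensions to a block binary constellation does not improve the per-dimension error rate.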

As examples of block orthogonal signaling, consider the following two-dimensional signal sets.

Return to Zero (RZ) Signaling

RZ uses the following two basis functions for the two-dimensional signal constellation shown in Figure 1.36:

    $\varphi_1(t) = \begin{cases} \frac{1}{\sqrt{T}} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.171)

    $\varphi_2(t) = \begin{cases} \frac{1}{\sqrt{T}} & 0 \le t < T/2 \\ -\frac{1}{\sqrt{T}} & T/2 \le t < T \\ 0 & \text{elsewhere} \end{cases}$    (1.172)

Return to zero indicates that the transmitted voltage (i.e. the real value of the signal waveform) always returns to the same value at the beginning of any symbol interval. As for any binary signal constellation,

    $P_b = P_e = Q\left[\frac{d_{min}}{2\sigma}\right] = Q\left[\sqrt{\frac{E_x}{2\sigma^2}}\right]$.    (1.173)

RZ is 3 dB inferior to binary antipodal signaling, and uses twice the bandwidth of NRZ.

Frequency Shift Keying (FSK)

Frequency shift keying uses the following two basis functions for the two-dimensional signal constellation shown in Figure 1.36:

    $\varphi_1(t) = \begin{cases} \sqrt{\frac{2}{T}} \sin\frac{\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.174)

    $\varphi_2(t) = \begin{cases} \sqrt{\frac{2}{T}} \sin\frac{2\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.175)

The term frequency-shift indicates that the sequence of 1's and 0's in the transmitted data shifts between two different frequencies, $1/(2T)$ and $1/T$. As for any binary signal constellation,

    $P_b = P_e = Q\left[\frac{d_{min}}{2\sigma}\right] = Q\left[\sqrt{\frac{E_x}{2\sigma^2}}\right]$.    (1.176)

FSK is also 3 dB inferior to binary antipodal signaling.

FSK can be extended to higher-dimensional signal sets, $N > 2$, by adding the following basis functions ($i \ge 3$):

    $\varphi_i(t) = \begin{cases} \sqrt{\frac{2}{T}} \sin\frac{i\pi t}{T} & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases}$    (1.177)

The bandwidth necessary to realize the additional basis functions grows linearly with N for this FSK extension.

$P_e$ Computation for Block Orthogonal

The computation of $P_e$ for block orthogonal signaling returns to the signal detector discussed earlier in the chapter. Because all the signals are equally likely and of equal energy, the constants $c_i$ can be omitted (because they are all the same constant $c_i = c$). In this case, the MAP receiver becomes

    $\hat{m} \Rightarrow m_i$ if $\langle y, x_i \rangle \ge \langle y, x_j \rangle \;\; \forall\, j \ne i$.    (1.178)

By the symmetry of the block orthogonal signal constellation, $P_{e/i} = P_e$ or $P_{c/i} = P_c$ for all i. For convenience, the analysis calculates $P_c = P_{c/i=0}$, in which case the elements of y are

    $y_0 = \sqrt{E_x} + n_0$    (1.179)
    $y_i = n_i \quad \forall\, i \ne 0$.    (1.180)

If a decision is made that message 0 was sent, then $\langle y, x_0 \rangle \ge \langle y, x_i \rangle$, or equivalently $y_0 \ge y_i \;\forall\, i \ne 0$. The probability of this decision being correct is

    $P_{c/0} = P\{ y_0 \ge y_i \;\forall\, i \ne 0 \mid \text{0 was sent} \}$.    (1.181)

If $y_0$ takes on a particular value $v$, then since $y_i = n_i \;\forall\, i \ne 0$ and since all the noise components are independent,

    $P_{c/0, y_0 = v} = P\{ n_i \le v, \;\forall\, i \ne 0 \}$    (1.182)
    $\quad = \prod_{i=1}^{N-1} P\{ n_i \le v \}$    (1.183)
    $\quad = [1 - Q(v/\sigma)]^{N-1}$.    (1.184)

The last equation uses the fact that the $n_i$ are independent, identically distributed Gaussian random variables $N(0, \sigma^2)$. Finally, recalling that $y_0$ is also a Gaussian random variable $N(\sqrt{E_x}, \sigma^2)$,

    $P_c = P_{c/0} = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(v - \sqrt{E_x})^2}\, [1 - Q(v/\sigma)]^{N-1}\, dv$,    (1.185)

yielding

    $P_e = 1 - \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(v - \sqrt{E_x})^2}\, [1 - Q(v/\sigma)]^{N-1}\, dv$.    (1.186)

This function must be evaluated numerically using a computer. A simpler calculation yields the NNUB, which also coincides with the union bound because the number of nearest neighbors, $M - 1$, equals the total number of neighbors to any point for block orthogonal signaling. The NNUB is given by

    $P_e \le (M-1)\, Q\left[\frac{d_{min}}{2\sigma}\right] = (M-1)\, Q\left[\sqrt{\frac{E_x}{2\sigma^2}}\right]$.    (1.187)

A plot of performance for several values of N appears in Figure 1.37. As N gets large, performance improves without increase of SNR, but at the expense of a lower $\bar{b}$.

Simplex Constellation

For block orthogonal signaling, the mean value of the signal constellation is nonzero, that is $E[x] = (\sqrt{E_x}/M)[1 \; 1 \; \ldots \; 1]$. Translation of the constellation by $-E[x]$ minimizes the energy in the signal constellation without changing the average error probability. The translated signal constellation, known as the simplex constellation, is

    $x_i^s = \left[ -\frac{\sqrt{E_x}}{M}, \ldots, -\frac{\sqrt{E_x}}{M},\; \sqrt{E_x}\left(1 - \frac{1}{M}\right),\; -\frac{\sqrt{E_x}}{M}, \ldots, -\frac{\sqrt{E_x}}{M} \right]$,    (1.188)

where the $\sqrt{E_x}(1 - \frac{1}{M})$ occurs in the i-th position.
The superscript s distinguishes the simplex constellation $\{x_i^s\}$ from the block orthogonal constellation $\{x_i\}$ from which the simplex constellation is constructed. The energy of the simplex constellation equals

    $E_x^s = \frac{M-1}{M} E_x$,    (1.189)

which provides significant energy savings for small M. The data symbols, however, are no longer orthogonal:

    $\langle x_i^s, x_j^s \rangle = (x_i - E[x])'(x_j - E[x])$    (1.190)
    $\quad = E_x \delta_{ij} - \langle E[x], (x_i + x_j) \rangle + \frac{E_x}{M}$    (1.191)
    $\quad = E_x \delta_{ij} - \frac{2E_x}{M} + \frac{E_x}{M}$    (1.192)
    $\quad = E_x \delta_{ij} - \frac{E_x}{M}$.    (1.193)

By the theorem of translational invariance, $P_e$ of the simplex signal set equals $P_e$ of the block orthogonal signal set given in (1.186) and bounded in (1.187).

[Figure 1.37: Plot of probability of symbol error for orthogonal constellations. Axes: probability of symbol error versus SNR in dB, one curve per value of N.]

Pulse Duration Modulation

Another signal set in which the signals are not orthogonal, as usually described, is pulse duration modulation (PDM). The number of signal points in the PDM constellation increases linearly with the number of dimensions, as for orthogonal constellations. PDM is commonly used, with some modifications, in read-only optical data storage (i.e. compact disks and CD-ROM). In optical data storage, data is recorded by the length of a hole or pit burned into the storage medium. The signal set can be constructed as illustrated in Figure 1.38. The minimum width of the pit ($4T$ in the figure) is much larger than the separation ($T$) between the different PDM signal waveforms. The signal set is evidently not orthogonal.

A second performance-equivalent set of waveforms, known as Pulse Position Modulation (PPM), appears in Figure 1.39. The PPM constellation is a block orthogonal constellation, which has the previously derived $P_e$. The average energy of the PDM constellation clearly exceeds that of the PPM constellation, which in turn exceeds that of a corresponding simplex constellation. Nevertheless, constellation energy minimization is usually not important when PDM is used; for example, in optical storage the optical channel physics mandate the minimum pit duration, and the resultant energy increase is not of concern.
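The block orthogonal error probability, which the simplex set shares, has no closed form but is easy to integrate numerically. The following sketch uses a plain trapezoidal rule in pure Python; the step count, the integration window of 10 standard deviations around $\sqrt{E_x}$, and the example values of $E_x$ and $\sigma$ are assumptions chosen for illustration.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def block_orthogonal_pe(n_dims: int, ex: float, sigma: float,
                        steps: int = 4000, span: float = 10.0) -> float:
    """Integrate pdf(v) * (1 - [1 - Q(v/sigma)]^(N-1)) over v by the
    trapezoidal rule, where v is the matched output y_0 ~ N(sqrt(Ex), sigma^2)."""
    mean = math.sqrt(ex)
    lo = mean - span * sigma
    hi = mean + span * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        pdf = math.exp(-((v - mean) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)
        p_wrong_given_v = 1.0 - (1.0 - q_function(v / sigma)) ** (n_dims - 1)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * pdf * p_wrong_given_v
    return total * h


def block_orthogonal_nnub(n_dims: int, ex: float, sigma: float) -> float:
    """Nearest-neighbor union bound (M - 1) Q[sqrt(Ex / (2 sigma^2))] with M = N."""
    return (n_dims - 1) * q_function(math.sqrt(ex / (2.0 * sigma ** 2)))


for n_dims in (2, 4, 8):
    pe = block_orthogonal_pe(n_dims, ex=1.0, sigma=0.3)
    bound = block_orthogonal_nnub(n_dims, ex=1.0, sigma=0.3)
    print(f"N={n_dims}: Pe={pe:.4e}  NNUB={bound:.4e}")
```

For N = 2 the integral reduces to the binary result $Q[\sqrt{E_x/(2\sigma^2)}]$, which gives a convenient sanity check on the numerical integration.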

[Figure 1.38: Pulse Duration Modulation (PDM). Waveforms $x_0(t), x_1(t), \ldots, x_{M-1}(t)$ versus $t$.]

[Figure 1.39: Pulse-Position Modulation (PPM). Waveforms $x_0(t), x_1(t), \ldots, x_{M-1}(t)$ versus $t$.]

Biorthogonal Signal Constellations

A variation on block orthogonal signaling is the biorthogonal signal set, which doubles the size of the signal set from $M = N$ to $M = 2N$ by including the negative of each of the data symbol vectors in the signal set. From this perspective, QPSK is both a biorthogonal signal set and a cubic signal set.

The probability of error analysis for biorthogonal constellations parallels that for block orthogonal signal sets. As with orthogonal signaling, because all the signals are equally likely and of equal energy, the constants $c_i$ in the signal detector can be omitted, and the MAP receiver becomes

    $\hat{m} \Rightarrow m_i$ if $\langle y, x_i \rangle \ge \langle y, x_j \rangle \;\; \forall\, j \ne i$.    (1.194)

By symmetry, $P_{e/i} = P_e$ or $P_{c/i} = P_c$ for all i. Let $i = 0$. Then

    $y_0 = \sqrt{E_x} + n_0$    (1.195)
    $y_i = n_i \quad \forall\, i \ne 0$.    (1.196)

If $x_0$ was sent, then a correct decision is made if $\langle y, x_0 \rangle \ge \langle y, x_i \rangle$, or equivalently if $y_0 \ge |y_i| \;\forall\, i \ne 0$. Thus

    $P_{c/0} = P\{ y_0 \ge |y_i|, \;\forall\, i \ne 0 \mid \text{0 was sent} \}$.    (1.197)

Suppose $y_0$ takes on a particular value $v \in [0, \infty)$; then since the noise components $n_i$ are iid,

    $P_{c/0, y_0 = v} = \prod_{i=1}^{N-1} P\{ |n_i| \le v \}$    (1.198)
    $\quad = [1 - 2Q(v/\sigma)]^{N-1}$.    (1.199)

If $y_0 < 0$, then an incorrect decision is guaranteed if symbol zero was sent. (The reader should visualize the decision regions for this constellation.) Thus

    $P_c = P_{c/0} = \int_{0}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(v - \sqrt{E_x})^2}\, [1 - 2Q(v/\sigma)]^{N-1}\, dv$,    (1.200)

yielding

    $P_e = 1 - \int_{0}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(v - \sqrt{E_x})^2}\, [1 - 2Q(v/\sigma)]^{N-1}\, dv$.    (1.201)

This function can be evaluated numerically using a computer. Using the NNUB, which is slightly tighter than the union bound because the number of nearest neighbors is $M - 2$ for biorthogonal signaling,

    $P_e \le (M-2)\, Q\left[\frac{d_{min}}{2\sigma}\right] = 2(N-1)\, Q\left[\sqrt{\frac{E_x}{2\sigma^2}}\right]$.    (1.202)

1.5.4 Circular Constellations - M-ary Phase Shift Keying

Examples of Phase Shift Keying appeared in earlier figures. In general, M-ary PSK places the data symbol vectors at equally spaced angles (or phases) around a circle of radius $\sqrt{E_x}$ in two dimensions. Only the phase of the signal changes with the transmitted message, while the amplitude of the signal envelope remains constant, thus the origin of the name.
PSK is often used on channels with nonlinear amplitude distortion, where signals that carry information in a time-varying amplitude would otherwise suffer performance degradation. The minimum distance for M-ary PSK is given by

    $d_{min} = 2\sqrt{E_x} \sin\frac{\pi}{M} = 2\sqrt{2\bar{E}_x} \sin\frac{\pi}{M}$.    (1.203)

The CFM is

    $\zeta_x = 2\sin^2\left(\frac{\pi}{M}\right)$,    (1.204)

which is inferior to block binary signaling for any constellation with $M > 4$. The NNUB on error probability is tight and equal to

    $P_e < 2Q\left[\frac{\sqrt{E_x} \sin\frac{\pi}{M}}{\sigma}\right]$    (1.205)

for all M.
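The M-ary PSK expressions above are simple enough to tabulate. This is a small sketch under the assumption that the CFM is $(d_{min}/2)^2/\bar{E}_x$ with $\bar{E}_x = E_x/2$ for a two-dimensional constellation; the values of $E_x$ and $\sigma$ are arbitrary.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def psk_d_min(m: int, ex: float) -> float:
    """Minimum distance of M-ary PSK on a circle of radius sqrt(Ex)."""
    return 2.0 * math.sqrt(ex) * math.sin(math.pi / m)


def psk_cfm(m: int) -> float:
    """Constellation figure of merit 2 sin^2(pi/M)."""
    return 2.0 * math.sin(math.pi / m) ** 2


def psk_nnub(m: int, ex: float, sigma: float) -> float:
    """Nearest-neighbor union bound 2 Q[sqrt(Ex) sin(pi/M) / sigma]."""
    return 2.0 * q_function(math.sqrt(ex) * math.sin(math.pi / m) / sigma)


ex, sigma = 1.0, 0.1  # assumed example values
for m in (4, 8, 16, 32):
    loss_db = 10.0 * math.log10(psk_cfm(m))
    print(f"M={m:2d}: d_min={psk_d_min(m, ex):.4f}  CFM={psk_cfm(m):.4f} "
          f"({loss_db:+.2f} dB vs block binary)  NNUB={psk_nnub(m, ex, sigma):.3e}")
```

The CFM equals 1 at M = 4 (QPSK) and falls quickly beyond that, which is the quantitative content of the remark that PSK is inferior to block binary signaling for M > 4.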

[Figure 1.40: PAM constellation. Points $x_0, x_1, \ldots, x_{M-1}$ on the $\varphi_1$ axis at $\pm\frac{d}{2}, \pm\frac{3d}{2}, \pm\frac{5d}{2}, \ldots, \pm\frac{(M-1)d}{2}$.]

1.6 Rectangular (and Hexagonal) Signal Constellations

This section studies several popular signal constellations for data transmission. These constellations use equally spaced points on a one- or two-dimensional lattice. This study introduces and uses some basic concepts, namely SNR, the shaping gain, the continuous approximation, and the peak-to-average power ratio. These concepts and their results measure the quality of signal constellations and are fundamental to understanding performance. Subsection 1.6.1 studies pulse amplitude modulation (PAM), while Subsection 1.6.2 studies quadrature amplitude modulation (QAM). Subsection 1.6.3 discusses several measures of constellation performance.

The Continuous Approximation

For constellations that possess a degree of geometric uniformity, the continuous approximation computes the average energy of a signal constellation by replacing the discrete sum of energies with a continuous integral. The discrete probability distribution of the signal constellation is approximated by a continuous distribution that is uniform over a region defined by the signal points. In defining the region of the constellation, each point in the constellation is associated with an identical fundamental volume, $V_x$. Each point in the constellation is centered within an N-dimensional decision region (or Voronoi Region) with volume equal to the fundamental volume. This region is the decision region for internal points in the constellation that have a maximum number of nearest neighbors. The union of these Voronoi regions for the points is called the Voronoi Region $D_x$ of the constellation and has volume $M V_x$. The discrete distribution of the constellation is approximated by a uniform distribution over the Voronoi Region with probability density $p_x(u) = \frac{1}{M V_x} \;\forall\, u \in D_x$.

Definition 1.6.1
(Continuous Approximation) The continuous approximation to the average energy equals

    $E_x \approx \tilde{E}_x = \int_{D_x} \frac{\|u\|^2}{M V_x}\, du$,    (1.206)

where the N-dimensional integral covers the Voronoi region $D_x$.

For large signal sets with regular spacing between points, the error in using the continuous approximation is small, as several examples will demonstrate in this section.

1.6.1 Pulse Amplitude Modulation (PAM)

Pulse amplitude modulation, or amplitude shift keying (ASK), is a one-dimensional modulated signal set with $M = 2^b$ signals in the constellation, for $b \in Z^+$, the set of positive integers. Figure 1.40 illustrates the PAM constellation. The basis function can be any unit-energy function, but often $\varphi_1(t)$ is

    $\varphi_1(t) = \frac{1}{\sqrt{T}}\, \mathrm{sinc}\left(\frac{t}{T}\right)$    (1.207)

or another Nyquist pulse shape (see Chapter 3). The data-symbol amplitudes are $\pm\frac{d}{2}, \pm\frac{3d}{2}, \pm\frac{5d}{2}, \ldots, \pm\frac{(M-1)d}{2}$, and all input levels are equally likely. The minimum distance between points in a PAM constellation abbreviates as

    $d_{min} = d$.    (1.208)

Both binary antipodal and 2B1Q are examples of PAM signals. The average energy of a PAM constellation is

    $E_x = \bar{E}_x = \frac{1}{M}(2) \sum_{k=1}^{M/2} \left(\frac{2k-1}{2}\right)^2 d^2$    (1.209)
    $\quad = \frac{d^2}{2M} \sum_{k=1}^{M/2} (4k^2 - 4k + 1)$    (1.210)
    $\quad = \frac{d^2}{2M} \left[ 4\left( \frac{(M/2)^3}{3} + \frac{(M/2)^2}{2} + \frac{(M/2)}{6} \right) - 4\left( \frac{(M/2)^2}{2} + \frac{(M/2)}{2} \right) + \frac{M}{2} \right]$    (1.211)
    $\quad = \frac{d^2}{2M} \left[ \frac{M^3}{6} - \frac{M}{6} \right]$.    (1.212)

This PAM average energy is expressed in terms of the minimum distance and constellation size as

    $E_x = \bar{E}_x = \frac{d^2}{12} \left[ M^2 - 1 \right]$.    (1.213)

The PAM minimum distance is a function of $E_x$ and M:

    $d = \sqrt{\frac{12 E_x}{M^2 - 1}}$.    (1.214)

Finally, given distance and average energy,

    $b = \log_2 M = \frac{1}{2} \log_2\left( \frac{12 \bar{E}_x}{d^2} + 1 \right)$.    (1.215)

Figure 1.40 shows that the decision region for an interior point of PAM extends over a length-d interval centered on that point. The Voronoi region of the constellation thus extends for an interval of $Md$ over $[-L, L]$, where $L = \frac{Md}{2}$. The continuous approximation for PAM assumes a uniform distribution on this interval $(-L, L)$, and thus approximates the average energy of the constellation as

    $E_x = \bar{E}_x \approx \int_{-L}^{L} \frac{x^2}{2L}\, dx = \frac{L^2}{3} = \frac{M^2 d^2}{12}$.    (1.216)

The approximation for the average energy does not include the constant term $-\frac{d^2}{12}$, which becomes insignificant as M becomes large. Since $M = 2^b$, then $M^2 = 4^b$, leaving alternative relations ($b = \bar{b}$ for $N = 1$) for (1.213) and (1.214):

    $E_x = \bar{E}_x = \frac{d^2}{12} \left[ 4^b - 1 \right] = \frac{d^2}{12} \left[ 4^{\bar{b}} - 1 \right]$    (1.217)

and

    $d = \sqrt{\frac{12 E_x}{4^b - 1}}$.    (1.218)

The following recursion derives from increasing the number of bits, $b = \bar{b}$, in a PAM constellation while maintaining constant minimum distance between signal points:

    $\bar{E}_x(b+1) = 4\bar{E}_x(b) + \frac{d^2}{4}$.    (1.219)
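The PAM energy relations above can be cross-checked by brute force. The sketch below enumerates the constellation points, averages their squared amplitudes, and compares the result with the closed form, the continuous approximation, and the recursion; the spacing $d = 2$ is an arbitrary assumption.

```python
import math


def pam_levels(m: int, d: float) -> list:
    """Amplitudes +-d/2, +-3d/2, ..., +-(M-1)d/2 of an M-point PAM constellation."""
    return [(2 * k - (m - 1)) * d / 2.0 for k in range(m)]


def pam_energy_exact(m: int, d: float) -> float:
    """Average energy by direct enumeration, all points equally likely."""
    levels = pam_levels(m, d)
    return sum(x * x for x in levels) / m


def pam_energy_closed_form(m: int, d: float) -> float:
    """Closed form d^2 (M^2 - 1) / 12."""
    return d * d * (m * m - 1) / 12.0


def pam_energy_continuous(m: int, d: float) -> float:
    """Continuous approximation M^2 d^2 / 12 (uniform density on [-Md/2, Md/2])."""
    return m * m * d * d / 12.0


d = 2.0  # assumed spacing
for b in range(1, 6):
    m = 2 ** b
    exact = pam_energy_exact(m, d)
    approx = pam_energy_continuous(m, d)
    print(f"b={b} M={m:2d}: exact={exact:8.2f}  continuous={approx:8.2f}  "
          f"relative error={(approx - exact) / exact:.4f}")
```

The relative error of the continuous approximation is $1/(M^2 - 1)$, so it is already below 2% at M = 8.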

Table 1.1: PAM constellation energies.

    b = b̄    M     d/(2σ) for P̄e = 10^-6    SNR = (M^2 - 1)(d/2σ)^2 / 3    SNR increase
    1         2     13.7 dB                   13.7 dB                        -
    2         4     13.7 dB                   20.7 dB                        7 dB
    3         8     13.7 dB                   27.0 dB                        6.3 dB
    4         16    13.7 dB                   33.0 dB                        6.0 dB
    5         32    13.7 dB                   39.0 dB                        6.0 dB

Thus for moderately large b, the required signal energy increases by a factor of 4 for each additional bit of information in the signal constellation. This corresponds to an increase of 6 dB per bit, a measure commonly quoted by communication engineers as the required SNR increase for a transmission scheme to support an additional bit-per-dimension of information.

The PAM probability of correct symbol detection is

    $P_c = \sum_{i=0}^{M-1} P_{c/i}\, p_x(i)$    (1.220)
    $\quad = \frac{M-2}{M} \left( 1 - 2Q\left[\frac{d_{min}}{2\sigma}\right] \right) + \frac{2}{M} \left( 1 - Q\left[\frac{d_{min}}{2\sigma}\right] \right)$    (1.221)
    $\quad = 1 - \left( \frac{2M - 4 + 2}{M} \right) Q\left[\frac{d_{min}}{2\sigma}\right]$    (1.222)
    $\quad = 1 - 2\left( 1 - \frac{1}{M} \right) Q\left[\frac{d_{min}}{2\sigma}\right]$.    (1.223)

Thus, the PAM probability of symbol error is

    $P_e = \bar{P}_e = 2\left( 1 - \frac{1}{2^b} \right) Q\left[\frac{d_{min}}{2\sigma}\right] < 2Q\left[\frac{d_{min}}{2\sigma}\right]$.    (1.224)

The average number of nearest neighbors for the constellation is $2(1 - 1/M)$; thus, the NNUB is exact for PAM. Thus

    $P_e = 2\left( 1 - \frac{1}{M} \right) Q\left( \sqrt{\frac{3}{M^2 - 1}\, SNR} \right)$.    (1.225)

For $P_e = 10^{-6}$, one determines that $\frac{d}{2\sigma} \approx 4.75$ (13.5 dB). Table 1.1 relates $b = \bar{b}$, M, $\frac{d}{2\sigma}$, the SNR, and the required increase in SNR (or equivalently in $\bar{E}_x$) to transmit an additional bit of information at a probability of error $P_e = 10^{-6}$. Table 1.1 shows that for $b = \bar{b} > 2$, the approximation of 6 dB per bit is very accurate.

Pulse amplitude constellations with $b > 2$ are typically known as 3B1O (three bits per octal signal, for 8-PAM) and 4B1H (four bits per hexadecimal signal), but are rare in use compared with the more popular quadrature amplitude modulation of Section 1.6.2.

EXAMPLE 1.6.1 (56K voiceband modem) The 56 kbps voiceband modem of Figure 1.41 can be viewed as a PAM modem with 128 levels, or $b = 7$, and a symbol rate of 8000 Hz. Thus, the data rate is $R = b/T = 56{,}000$ bits per second.
The 8000 Hz symbol rate is thus consistent with the maximum allowed bandwidth of a voice-only telephone-network connection (4000 Hz), a limit imposed by network switches that also sample at 8000 Hz; the clock is supplied from the network to the internet service provider's modem as shown. The modulator is curiously and tacitly implemented by the combination of the network connection and the eventual DAC (digital-to-analog converter) at the beginning of the customer's analog phone


line. A special receiver structure known as a DFE (see Chapter 3) converts the telephone line into an AWGN with an SNR sufficiently high to carry 128-PAM. This particular choice of modulator implementation avoids the distortion that would be introduced by an extra ADC and DAC on the Internet Service Provider's phone line (which are superfluous, as that Internet Service Provider most often has a digital high-speed connection by fiber or other means to the network). The levels are not equally spaced in most implementations, an artifact only of the fact that the network DAC is not a uniform DAC and instead chooses its levels for best voice transmission, which is not the best for data transmission. Yet higher-speed DSL versions of the same system allow a higher-speed connection through the network, and use a new DAC in what is called a DSLAM that replaces the voice connection to the subscriber.

[Figure 1.41: 56K Modem PAM example. Block diagram: 8000 Hz network clock source; digital PAM encoder (V.90 digital modem) at the Internet Service Provider; digital connection (no modulator); digital voice network connection (4000 Hz max); modulator (DAC and filter); telephone line; receiver (V.90 analog modem) at the customer computer. Figure note: the PAM levels are not equally spaced in V.90; instead they are chosen to match the levels of the so-called voice codecs (µ-law in the US, A-law internationally), which are matched to best capture analog voice signals and are unfortunately all that is available for the V.90 modem path, even if worse than equally spaced PAM for data traffic. 7 effective bits/symbol are transmitted, but an 8th redundant bit is also sent for use with coding as in Chapters 8-10.]

1.6.2 Quadrature Amplitude Modulation (QAM)

QAM is a two-dimensional generalization of PAM. The two basis functions are usually

    $\varphi_1(t) = \sqrt{\frac{2}{T}}\, \mathrm{sinc}\left(\frac{t}{T}\right) \cos \omega_c t$,    (1.226)
    $\varphi_2(t) = \sqrt{\frac{2}{T}}\, \mathrm{sinc}\left(\frac{t}{T}\right) \sin \omega_c t$.    (1.227)

The $\mathrm{sinc}(t/T)$ term may be replaced by any Nyquist pulse shape as discussed in Chapter 3.
The $\omega_c$ is a radian carrier frequency that is discussed in other chapters (Chapter 3 among them); for now, $\omega_c \ge \pi/T$.

The QAM Square Constellation

Figure 1.42 illustrates QAM Square Constellations. These constellations are the Cartesian products (see footnote 5) of 2-PAM with itself and 4-PAM with itself, respectively.

[Footnote 5: A Cartesian Product, a product of two sets, is the set of all ordered pairs of coordinates, the first coordinate taken from the first set in the Cartesian product, and the second coordinate taken from the second set in the Cartesian product.]

Generally, square M-QAM constellations derive

[Figure 1.42: QAM constellations (4QAM and 16QAM). Axes $\varphi_1$ and $\varphi_2$, with points at $\pm\frac{d}{2}, \pm\frac{3d}{2}, \pm\frac{5d}{2}, \ldots, \pm\frac{(\sqrt{M}-1)d}{2}$ in each dimension.]

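from the Cartesian product of two $\sqrt{M}$-PAM constellations. For $\bar{b}$ bits per dimension, the $M = 4^{\bar{b}}$ signal points are placed at the coordinates $\pm\frac{d}{2}, \pm\frac{3d}{2}, \pm\frac{5d}{2}, \ldots, \pm\frac{(\sqrt{M}-1)d}{2}$ in each dimension. The average energy of square QAM constellations is easily computed as

    $E_{M\text{-QAM}} = E_x = 2\bar{E}_x = \frac{1}{M} \sum_{i,j=1}^{\sqrt{M}} \left( x_i^2 + x_j^2 \right)$    (1.228)
    $\quad = \frac{1}{M} \left[ \sqrt{M} \sum_{i=1}^{\sqrt{M}} x_i^2 + \sqrt{M} \sum_{j=1}^{\sqrt{M}} x_j^2 \right]$    (1.229)
    $\quad = \frac{2}{\sqrt{M}} \sum_{i=1}^{\sqrt{M}} x_i^2$    (1.230)
    $\quad = 2 E_{\sqrt{M}\text{-PAM}}$    (1.231)
    $\quad = d^2 \left( \frac{M-1}{6} \right)$.    (1.232)

Thus, the average energy per dimension of the M-QAM constellation,

    $\bar{E}_x = d^2 \left( \frac{M-1}{12} \right)$,    (1.233)

equals the average energy of the constituent $\sqrt{M}$-PAM constellation. The minimum distance $d_{min} = d$ can be computed from $E_x$ (or $\bar{E}_x$) and M by

    $d = \sqrt{\frac{6 E_x}{M-1}} = \sqrt{\frac{12 \bar{E}_x}{M-1}}$.    (1.234)

Since $M = 4^{\bar{b}}$, alternative relations for (1.233) and (1.234) in terms of the average bit rate $\bar{b}$ are

    $\bar{E}_x = \frac{E_x}{2} = \frac{d^2}{12} \left[ 4^{\bar{b}} - 1 \right]$    (1.235)

and

    $d = \sqrt{\frac{12 \bar{E}_x}{4^{\bar{b}} - 1}}$.    (1.236)

Finally,

    $\bar{b} = \frac{1}{2} \log_2\left( \frac{6 E_x}{d^2} + 1 \right) = \frac{1}{2} \log_2\left( \frac{12 \bar{E}_x}{d^2} + 1 \right)$,    (1.237)

the same as for a PAM constellation. For large M, $\bar{E}_x \approx \frac{d^2}{12} M = \frac{d^2}{12} 4^{\bar{b}}$, which is the same as that obtained by using the continuous approximation.

The continuous approximation for two-dimensional QAM uses a uniform distribution over the square defined by $[\pm L, \pm L]$,

    $E_x \approx \int_{-L}^{L} \int_{-L}^{L} \frac{x^2 + y^2}{4L^2}\, dx\, dy = \frac{2L^2}{3}$,    (1.238)

or $L = \sqrt{1.5 E_x}$. Since the Voronoi region for each signal point in a QAM constellation has area $d^2$,

    $M \approx \frac{4L^2}{d^2} = \frac{6 E_x}{d^2} = \frac{12 \bar{E}_x}{d^2}$.    (1.239)

This result agrees with Equation (1.234) for large M.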
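As with PAM, the square QAM energy formulas can be verified by enumerating the constellation. This is a minimal sketch, with the spacing $d = 2$ as an arbitrary assumption and M restricted to perfect squares.

```python
import math


def square_qam_points(m: int, d: float) -> list:
    """Points of a square M-QAM constellation: Cartesian product of sqrt(M)-PAM with itself."""
    side = math.isqrt(m)
    if side * side != m:
        raise ValueError("M must be a perfect square for a square QAM constellation")
    levels = [(2 * k - (side - 1)) * d / 2.0 for k in range(side)]
    return [(x, y) for x in levels for y in levels]


def qam_energy_exact(m: int, d: float) -> float:
    """Average two-dimensional energy by direct enumeration."""
    points = square_qam_points(m, d)
    return sum(x * x + y * y for x, y in points) / m


def qam_energy_closed_form(m: int, d: float) -> float:
    """Closed form d^2 (M - 1) / 6."""
    return d * d * (m - 1) / 6.0


def qam_energy_continuous(m: int, d: float) -> float:
    """Continuous approximation 2 L^2 / 3 with L = sqrt(M) d / 2, i.e. M d^2 / 6."""
    half_width = math.sqrt(m) * d / 2.0
    return 2.0 * half_width ** 2 / 3.0


d = 2.0  # assumed spacing
for m in (4, 16, 64, 256):
    exact = qam_energy_exact(m, d)
    print(f"M={m:3d}: exact={exact:8.2f}  closed form={qam_energy_closed_form(m, d):8.2f}  "
          f"continuous={qam_energy_continuous(m, d):8.2f}")
```

The gap between the exact energy and the continuous approximation is the constant $d^2/6$, independent of M.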

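As the number of points increases, the energy-computation error caused by using the continuous approximation becomes negligible. Increasing the number of bits, b, in a QAM constellation while maintaining constant minimum distance gives the following recursion for the increase in average energy:

    $E_x(b+1) = 2E_x(b) + \frac{d^2}{6}$.    (1.240)

Asymptotically the average energy increases by 3 dB for each added bit per two-dimensional symbol.

The probability of error can be exactly computed for QAM by noting that the conditional probability of a correct decision falls into one of 3 categories:

1. corner points (4 points with only 2 nearest neighbors)

    $P_{c/corner} = \left( 1 - Q\left[\frac{d}{2\sigma}\right] \right)^2$    (1.241)

2. inner points ($(\sqrt{M} - 2)^2$ points with 4 nearest neighbors)

    $P_{c/inner} = \left( 1 - 2Q\left[\frac{d}{2\sigma}\right] \right)^2$    (1.242)

3. edge points ($4(\sqrt{M} - 2)$ points with 3 nearest neighbors)

    $P_{c/edge} = \left( 1 - Q\left[\frac{d}{2\sigma}\right] \right) \left( 1 - 2Q\left[\frac{d}{2\sigma}\right] \right)$.    (1.243)

The probability of being correct is then (abbreviating $Q \equiv Q\left[\frac{d}{2\sigma}\right]$)

    $P_c = \sum_{i=0}^{M-1} P_{c/i}\, p_x(i)$    (1.244)
    $\quad = \frac{4}{M}(1 - Q)^2 + \frac{(\sqrt{M} - 2)^2}{M}(1 - 2Q)^2 + \frac{4(\sqrt{M} - 2)}{M}(1 - Q)(1 - 2Q)$    (1.245)
    $\quad = \frac{1}{M} \left[ (4 - 8Q + 4Q^2) + (4\sqrt{M} - 8)(1 - 3Q + 2Q^2) \right.$    (1.246)
    $\quad\quad \left. +\, (M - 4\sqrt{M} + 4)(1 - 4Q + 4Q^2) \right]$    (1.247)
    $\quad = \frac{1}{M} \left[ M + (4\sqrt{M} - 4M)Q + (4 - 8\sqrt{M} + 4M)Q^2 \right]$    (1.248)
    $\quad = 1 + 4\left( \frac{1}{\sqrt{M}} - 1 \right) Q + 4\left( \frac{1}{\sqrt{M}} - 1 \right)^2 Q^2$.    (1.249)

Thus, the probability of symbol error is

    $P_e = 4\left( 1 - \frac{1}{\sqrt{M}} \right) Q\left[\frac{d}{2\sigma}\right] - 4\left( 1 - \frac{1}{\sqrt{M}} \right)^2 \left( Q\left[\frac{d}{2\sigma}\right] \right)^2 < 4\left( 1 - \frac{1}{\sqrt{M}} \right) Q\left[\frac{d}{2\sigma}\right]$.    (1.250)

The average number of nearest neighbors for the constellation equals $4(1 - 1/\sqrt{M})$; thus for QAM the NNUB is not exact, but usually tight. The corresponding normalized NNUB is

    $\bar{P}_e \le 2\left( 1 - \frac{1}{2^{\bar{b}}} \right) Q\left[\frac{d}{2\sigma}\right] = 2\left( 1 - \frac{1}{2^{\bar{b}}} \right) Q\left[ \sqrt{\frac{3}{M-1}\, SNR} \right]$,    (1.251)

which equals the PAM result. For $P_e = 10^{-6}$, one determines that $\frac{d}{2\sigma} \approx 4.75$ (13.5 dB). Table 1.2 relates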

$b$, M, $\frac{d}{2\sigma}$, the SNR, and the required increase in SNR (or equivalently in $\bar{E}_x$) to transmit an additional bit of information. As with PAM, for average bit rates of $\bar{b} > 2$, the approximation of 3 dB per additional bit per two-dimensional symbol for the average energy increase is accurate.

Table 1.2: QAM constellation energies.

    b = 2b̄    M         d/(2σ) for P̄e = 10^-6    SNR = (M - 1)(d/2σ)^2 / 3    SNR increase    dB/bit
    2          4         13.7 dB                   13.7 dB                      -               -
    4          16        13.7 dB                   20.7 dB                      7.0 dB          3.5 dB
    6          64        13.7 dB                   27.0 dB                      6.3 dB          3.15 dB
    8          256       13.7 dB                   33.0 dB                      6.0 dB          3.0 dB
    10         1024      13.7 dB                   39.0 dB                      6.0 dB          3.0 dB
    12         4096      13.7 dB                   45.0 dB                      6.0 dB          3.0 dB
    14         16,384    13.7 dB                   51.0 dB                      6.0 dB          3.0 dB

The constellation figure of merit for square QAM is

    $\zeta_x = \frac{3}{M-1} = \frac{3}{4^{\bar{b}} - 1} = \frac{3}{2^b - 1}$.    (1.252)

When b is odd, it is possible to define a SQ QAM constellation by taking every other point from a $b + 1$ SQ QAM constellation. (See the corresponding chapter exercise.)

A couple of examples illustrate the wide use of QAM transmission.

EXAMPLE 1.6.2 (Cable Modem) Cable modems use existing cable-TV coaxial cables for two-way transmission (presuming the cable TV provider has sent personnel to the various unidirectional blocking points in the network and replaced them with so-called diplex filters). Early cable modem conventions (i.e., DOCSIS) use 4QAM in both directions of transmission. The downstream direction from cable TV end to customer is typically at a carrier frequency well above the used TV band, somewhere between 300 MHz and 500 MHz. The upstream direction is below 50 MHz, typically between 5 and 40 MHz. The symbol rate is typically $1/T = 2$ MHz, so the data rate is 4 Mbps on any given carrier. Typically about 10 carriers can be used (so 40 Mbps maximum), and each is shared by various subgroups of customers, each customer within a subgroup typically getting 384 kbps in fair systems (although some customers can hog all the 4 Mbps, and systems for resolving such customer use are evolving).

EXAMPLE 1.6.3 (Satellite TV Broadcast) Satellite television uses 4QAM for broadcast transmission at one of a set of carrier frequencies between 12.2
GHz to 12.7 GHz from satellite to customer receiver for some suppliers and satellites. Corresponding carriers between 17.3 and 17.8 GHz are used to send the signals from the broadcaster to the satellite, again with 4QAM. At 2 bits per 4QAM symbol, the aggregate data rate on any one carrier is twice the symbol rate $1/T$. A typical digital TV signal is compressed to a few Mbps, which allows several channels per carrier/QAM signal. (Some stations watched by many, for instance sports, may get a larger allocation of bandwidth and carry a higher-quality image than others that are not heavily watched.) A high-definition TV channel (only a few exist at present) requires a considerably higher rate if sent with full fidelity. Each carrier is transmitted in a 24 MHz transponder channel on the satellite; these 24 MHz channels were originally used to broadcast a single analog TV channel, modulated via FM, unlike terrestrial analog broadcast television (which uses only 6 MHz for analog TV).
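The square QAM error probability and the SNR column of the QAM table can be reproduced with a few lines. The sketch below checks the closed form against the corner/edge/inner weighting and computes the required SNR from $SNR = \frac{M-1}{3}(d/2\sigma)^2$; the default Q-function argument of 13.7 dB is an assumption taken to match the tabulated value, and the test value of Q is arbitrary.

```python
import math


def qam_pe_closed_form(m: int, q: float) -> float:
    """Symbol error probability 4(1 - 1/sqrt(M)) Q - 4(1 - 1/sqrt(M))^2 Q^2."""
    a = 1.0 - 1.0 / math.sqrt(m)
    return 4.0 * a * q - 4.0 * a * a * q * q


def qam_pe_by_categories(m: int, q: float) -> float:
    """Same quantity from the three point categories: corner, edge, inner."""
    side = math.isqrt(m)
    n_corner = 4
    n_edge = 4 * (side - 2)
    n_inner = (side - 2) ** 2
    p_correct = (n_corner * (1.0 - q) ** 2
                 + n_edge * (1.0 - q) * (1.0 - 2.0 * q)
                 + n_inner * (1.0 - 2.0 * q) ** 2) / m
    return 1.0 - p_correct


def qam_nnub(m: int, q: float) -> float:
    """Nearest-neighbor union bound 4(1 - 1/sqrt(M)) Q."""
    return 4.0 * (1.0 - 1.0 / math.sqrt(m)) * q


def required_snr_db(m: int, q_arg_db: float = 13.7) -> float:
    """SNR in dB needed for a given d/(2 sigma) in dB (default is an assumed table value)."""
    return 10.0 * math.log10((m - 1) / 3.0) + q_arg_db


previous = None
for b in range(2, 16, 2):
    m = 2 ** b
    snr = required_snr_db(m)
    step = "-" if previous is None else f"{snr - previous:.1f} dB"
    print(f"b={b:2d} M={m:5d}: SNR={snr:.1f} dB  increase={step}")
    previous = snr
```

The printed increases settle at about 6 dB per two added bits, that is 3 dB per bit per two-dimensional symbol, matching the asymptotic statement in the text.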

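Figure 1.43: Cross constellations (an inner square of $2^{b-1}$ points with four side blocks of $2^{b-3}$ points each).

QAM Cross Constellations

The QAM cross constellation also allows for odd numbers of bits per symbol in QAM data transmission. To construct a QAM cross constellation with $b$ bits per symbol, one augments a square QAM constellation for $b-1$ bits per symbol by adding $2^{b-1}$ data symbols that extend the sides of the QAM square. The corners are excluded, as shown in Figure 1.43.

One computation of the average energy of QAM cross constellations doubles the energy of the two large rectangles ($[2^{\frac{b-3}{2}} + 2^{\frac{b-1}{2}}] \times 2^{\frac{b-1}{2}}$) and then subtracts the energy of the inner square ($2^{\frac{b-1}{2}} \times 2^{\frac{b-1}{2}}$). The energy of the inner square is

$$\mathcal{E}_x(\text{inner}) = \frac{d^2}{6}\left(2^{b-1}-1\right). \tag{1.253}$$

The total sum of energies for all the data symbols in the inner square plus two side rectangles is (looking only at one quadrant, and multiplying by 4 because of symmetry)

$$E = \frac{d^2}{4}(4)\sum_{k=1}^{2^{\frac{b-3}{2}}}\ \sum_{l=1}^{3\cdot 2^{\frac{b-5}{2}}}\left[(2k-1)^2+(2l-1)^2\right] \tag{1.254}$$

$$= \frac{d^2}{4}(4)\left[3\cdot 2^{\frac{b-5}{2}}\sum_{k=1}^{2^{\frac{b-3}{2}}}(2k-1)^2 + 2^{\frac{b-3}{2}}\sum_{l=1}^{3\cdot 2^{\frac{b-5}{2}}}(2l-1)^2\right] \tag{1.255}$$

$$= \frac{d^2}{4}(4)\left[2^{\frac{b-5}{2}}\left(2^{\frac{3b-5}{2}}-2^{\frac{b-3}{2}}\right) + 2^{\frac{b-3}{2}}\left(9\cdot 2^{\frac{3b-11}{2}}-2^{\frac{b-5}{2}}\right)\right] \tag{1.256}$$

$$= \frac{d^2}{4}\left[2^{2b-3}-2^{b-2}+9\cdot 2^{2b-5}-2^{b-2}\right] \tag{1.257}$$

$$= \frac{d^2}{4}\left[\frac{13}{32}2^{2b}-\frac{1}{2}2^{b}\right], \tag{1.258}$$

where (1.256) uses $\sum_{k=1}^{n}(2k-1)^2 = (4n^3-n)/3$. Then

$$\mathcal{E}_x = \frac{2E - 2^{b-1}\mathcal{E}_x(\text{inner})}{2^b} = d^2\left[\frac{13}{64}2^b-\frac{1}{4}-\frac{1}{12}\left(2^{b-1}-1\right)\right] \tag{1.259}$$

$$= d^2\left[\left(\frac{13}{64}-\frac{1}{24}\right)2^b-\frac{1}{6}\right] \tag{1.260}$$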

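$$= \frac{d^2}{4}\left[\frac{31}{48}2^b-\frac{2}{3}\right] = \frac{d^2}{6}\left[\frac{31}{32}M-1\right]. \tag{1.261}$$

Figure 1.44: 32CR constellation.

The minimum distance $d_{\min} = d$ can be computed from $\mathcal{E}_x$ (or $\bar{\mathcal{E}}_x$) and $M$ by

$$d^2 = \frac{6\mathcal{E}_x}{\frac{31}{32}M-1} = \frac{12\bar{\mathcal{E}}_x}{\frac{31}{32}M-1} = \frac{12\bar{\mathcal{E}}_x}{\frac{31}{32}4^{\bar{b}}-1}. \tag{1.262}$$

In (1.261), for large $M$, $\mathcal{E}_x \approx \frac{31d^2}{192}M = \frac{31d^2}{192}4^{\bar{b}}$, the same as the continuous approximation. The following recursion derives from increasing the number of bits, $b$, in a QAM cross constellation while maintaining constant minimum distance:

$$\mathcal{E}_x(b+1) = 2\mathcal{E}_x(b) + \frac{d^2}{6}. \tag{1.263}$$

As with the square QAM constellation, asymptotically the average energy increases by 3 dB for each added bit per two-dimensional symbol.

The probability of error can be bounded for QAM Cross by noting that a lower bound on the conditional probability of a correct decision falls into one of two categories:

1. inner points, numbering $2^b - 4\left(3\cdot 2^{\frac{b-3}{2}} - 2\cdot 2^{\frac{b-5}{2}}\right) = 2^b - 4\cdot 2^{\frac{b-1}{2}}$, with four nearest neighbors:

$$P_{c/\text{inner}} = \left(1 - 2Q\left[\frac{d_{\min}}{2\sigma}\right]\right)^2 \tag{1.264}$$

2. side points, numbering $4\left(3\cdot 2^{\frac{b-3}{2}} - 2\cdot 2^{\frac{b-5}{2}}\right) = 4\cdot 2^{\frac{b-1}{2}}$, with three nearest neighbors. (This calculation is only a bound because some of the side points have fewer than three neighbors at distance $d_{\min}$.)

$$P_{c/\text{outer}} = \left(1 - 2Q\left[\frac{d_{\min}}{2\sigma}\right]\right)\left(1 - Q\left[\frac{d_{\min}}{2\sigma}\right]\right). \tag{1.265}$$

The probability of a correct decision is then, abbreviating $Q = Q\left[\frac{d_{\min}}{2\sigma}\right]$,

$$P_c \ge \frac{1}{M}\left[4\cdot 2^{\frac{b-1}{2}}\right](1-Q)(1-2Q) \tag{1.266}$$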

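$$\quad + \frac{1}{M}\left[\left\{2^b - 4\cdot 2^{\frac{b-1}{2}}\right\}(1-2Q)^2\right] \tag{1.267}$$

$$= \frac{1}{M}\left[4\cdot 2^{\frac{b-1}{2}}\left(1-3Q+2Q^2\right)\right] + \frac{1}{M}\left[\left(2^b - 2^{\frac{b+3}{2}}\right)\left(1-4Q+4Q^2\right)\right] \tag{1.268}$$

$$= 1 - \left[4 - 2^{\frac{3-b}{2}}\right]Q + \left[4 - 2^{\frac{5-b}{2}}\right]Q^2. \tag{1.269}$$

Thus, the probability of symbol error is bounded by

$$P_e \le 4\left(1-\frac{1}{\sqrt{2M}}\right)Q\left[\frac{d_{\min}}{2\sigma}\right] - 4\left(1-\sqrt{\frac{2}{M}}\right)\left(Q\left[\frac{d_{\min}}{2\sigma}\right]\right)^2 \tag{1.270}$$

$$< 4\left(1-\frac{1}{\sqrt{2M}}\right)Q\left[\frac{d_{\min}}{2\sigma}\right] < 4Q\left[\frac{d_{\min}}{2\sigma}\right]. \tag{1.271}$$

The average number of nearest neighbors for the constellation is $4(1 - 1/\sqrt{2M})$; thus the NNUB is again accurate. The normalized probability of error is

$$\bar{P}_e \le 2\left(1-\frac{1}{2^{\bar{b}+.5}}\right)Q\left[\frac{d_{\min}}{2\sigma}\right], \tag{1.272}$$

which agrees with the PAM result when one includes an additional bit in the constellation, or equivalently an extra .5 bit per dimension. To evaluate (1.272), Equation (1.262) relates that

$$\left(\frac{d_{\min}}{2\sigma}\right)^2 = \frac{3\,\mathrm{SNR}}{\frac{31}{32}M-1}. \tag{1.273}$$

Table 1.3 lists the incremental energies and required SNR for QAM cross constellations in a manner similar to Table 1.2. There are also square constellations for odd numbers of bits that Problem.4 addresses.

b = 2b̄    M         d_min/2σ for P̄_e = 10^-6    SNR        SNR increase    dB/bit
5          32        13.7 dB                      23.7 dB    -               -
7          128       13.7 dB                      29.8 dB    6.1 dB          3.05 dB
9          512       13.7 dB                      35.8 dB    6.0 dB          3.0 dB
11         2048      13.7 dB                      41.8 dB    6.0 dB          3.0 dB
13         8192      13.7 dB                      47.8 dB    6.0 dB          3.0 dB
15         32,768    13.7 dB                      53.8 dB    6.0 dB          3.0 dB

Table 1.3: QAM Cross constellation energies. The third column is the argument of the Q-function for which $\bar{P}_e = 10^{-6} \approx 2\left(1-\frac{1}{2^{\bar{b}+.5}}\right)Q\left(\frac{d_{\min}}{2\sigma}\right)$, and the SNR is $\frac{[31/32]M-1}{3}\left(\frac{d}{2\sigma}\right)^2$.

Vestigial Sideband Modulation (VSB), CAP, and OQAM

From the perspective of performance and constellation design, there are many alternate basis function choices for QAM that are equivalent. These choices sometimes have value from the perspective of implementation considerations. They are all equivalent in terms of the fundamentals of this chapter when implemented for successive transmission of messages over an AWGN. In successive transmission, the basis functions must be orthogonal to one another for all integer translations of the symbol period.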
Then successive samples at the demodulator output at integer multiples of $T$ will be independent; then, the one-shot optimum receiver can be used repeatedly in succession to detect successive messages without loss of optimality on the AWGN. (See Chapter 3 for when successive transmission can degrade in the presence of intersymbol interference on non-AWGN channels with bandwidth limitations in practice.) The PAM basis function always exhibits this desirable translation property on the AWGN, and so do the QAM basis functions as long as $\omega_c \ge \pi/T$. The QAM basis functions are not unique with respect to satisfaction of the translation property, with VSB/SSB, CAP, and OQAM all being variants:

VSB Vestigial sideband modulation (VSB) is an alternative modulation method that is equivalent to QAM. In QAM, typically the same unit-energy basis function ($\frac{1}{\sqrt{T}}\,\mathrm{sinc}(t/T)$) is double-side-band modulated independently by the sine and cosine of a carrier to generate the two QAM basis functions. In VSB, a double-bandwidth sinc function and its Hilbert transform (see Chapter 2 for a discussion of Hilbert transforms) are single-side-band modulated. The two VSB basis functions are⁶

$$\phi_1(t) = \frac{2}{\sqrt{T}}\,\mathrm{sinc}\left(\frac{2t}{T}\right)\cos\omega_c t, \tag{1.274}$$

$$\phi_2(t) = \frac{2}{\sqrt{T}}\,\mathrm{sinc}\left(\frac{t}{T}\right)\sin\left(\frac{\pi t}{T}\right)\sin\omega_c t. \tag{1.275}$$

A natural symbol-rate choice for successive transmission with these two basis functions might appear to be $2/T$, twice the rate associated with QAM. However, successive translations of these basis functions by integer multiples of $T/2$ are not orthogonal, that is, $\langle\phi_i(t), \phi_j(t-T/2)\rangle \ne \delta_{ij}$; however, $\langle\phi_i(t), \phi_j(t-kT)\rangle = \delta_{ij}$ for any integer $k$. Thus, the symbol rate for successive orthogonal transmissions needs to be $1/T$. VSB designers often prefer to exploit the observation that $\langle\phi_1(t), \phi_2(t-kT/2)\rangle = 0$ for all odd integers $k$ to implement the VSB transmission system as one-dimensional and time-varying at rate $2/T$ dimensions per second. The first and second dimensions are alternately implemented at an aggregate rate of $2/T$ dimensions per second. The optimum receiver consists of two matched filters to the two basis functions, which have their outputs each sampled at rate $1/T$ (staggered by $T/2$ with respect to one another), and these samples are interleaved to form a single one-dimensional sample stream for the detector. Nonetheless, those same designers call the VSB constellations by two-dimensional names. Thus, one may hear of 16 VSB or 64 VSB, which are equivalent to 16 QAM (or 4 PAM) and 64 QAM (8 PAM) respectively.
VSB transmission may be more convenient for upgrading existing analog systems that are already VSB (i.e., commercial television) to digital systems that use the same bandwidths and carrier frequencies, that is, where the carrier frequencies are not centered within the existing band. VSB otherwise has no fundamental advantages or differences from QAM.

CAP Carrierless Amplitude/Phase (CAP) transmission systems are also very similar to QAM. The basis functions of QAM are time-varying when $\omega_c$ is arbitrary; that is, the basis functions on subsequent transmissions may differ. CAP is a method that can eliminate this time variation for any choice of carrier frequency, making the combined transmitter implementation appear carrierless and thus time-invariant. CAP has the same one-shot basis functions as QAM, but also has a time-varying encoder constellation when used for successive transmission of two-dimensional symbols. The time-varying CAP encoder implements a sequence of additional two-dimensional constellation rotations that are known and easily removed at the receiver after the demodulator and just before the detector. The time-varying encoder usually selects the sequence of rotations so that the phase (argument of sines and cosines) of the carrier is the same at the beginning of each symbol period, regardless of the actual carrier frequency. Effectively, all carrier frequencies thus appear the same, hence the term carrierless. The sequence of rotations has an angle that increases linearly with time and can often be very easily implemented (and virtually omitted when differential encoding, see Chapter 4, is implemented). See Section.4.

OQAM Offset QAM (OQAM) or staggered QAM uses the alternative basis functions

$$\phi_1(t) = \sqrt{\frac{2}{T}}\,\mathrm{sinc}\left(\frac{t}{T}\right)\cos\left(\frac{\pi t}{T}\right) \tag{1.276}$$

$$\phi_2(t) = \sqrt{\frac{2}{T}}\,\mathrm{sinc}\left(\frac{t-T/2}{T}\right)\sin\left(\frac{\pi t}{T}\right) \tag{1.277}$$

effectively offsetting the two dimensions by $T/2$.
For one-shot transmission, such offset has no effect (the receiver matched filters effectively re-align the two dimensions), and OQAM and QAM are the same. For successive transmission, the derivative (rate of change) of $x(t)$ is less for OQAM than for QAM, effectively reducing the bandwidth of transmitted signals when the sinc functions cannot be perfectly implemented. OQAM signals will never take the value $x(t) = 0$, while this value is instantaneously possible with QAM; thus nonlinear transmitter/receiver amplifiers are not as stressed by OQAM. There is otherwise no difference between OQAM and QAM.

⁶ This simple description is actually single-side-band (SSB), a special case of VSB. VSB uses practical realizable functions instead of the unrealizable sinc functions that simplify fundamental developments in Chapter 1.

The Gap

The gap, $\Gamma$, is an approximation introduced by Forney for constellations with $\bar{b} \ge 1/2$ that is empirically evident in the PAM and QAM tables. Specifically, if one knows the SNR for an AWGN channel, the number of bits per dimension that can be transmitted with PAM or QAM is

$$\bar{b} = \frac{1}{2}\log_2\left(1 + \frac{\mathrm{SNR}}{\Gamma}\right). \tag{1.278}$$

At error rate $\bar{P}_e = 10^{-6}$, the gap is 8.8 dB. For $\bar{P}_e = 10^{-7}$, the gap is 9.5 dB. If the designer knows the SNR and the desired performance level ($\bar{P}_e$), or equivalently the gap, then the number of bits per dimension (and thus the achievable data rate $R = b/T$) is immediately computed. Chapters 8-10 will introduce more sophisticated encoder designs where the gap can be reduced, ultimately to 0 dB, enabling a highest possible data rate of $.5\log_2(1 + \mathrm{SNR})$ bits per dimension, sometimes known as the channel capacity of the AWGN channel. QAM and PAM are thus about 9 dB away from ultimate limits in terms of efficient use of SNR.

1.6.3 Constellation Performance Measures

Having introduced many commonly used signal constellations for data transmission, this subsection defines several performance measures that compare coded systems based on these constellations.

Coding Gain

Of fundamental importance to the comparison of two systems that transmit the same number of bits per dimension is the coding gain, which specifies the improvement of one constellation over another when used to transmit the same information.

Definition 1.6.2 (Coding Gain) The coding gain (or loss), $\gamma$, of a particular constellation with data symbols $\{x_i\}_{i=0,\ldots,M-1}$ with respect to another constellation with data symbols $\{\tilde{x}_i\}_{i=0,\ldots,M-1}$ is defined as

$$\gamma = \frac{d_{\min}^2(x)/\bar{\mathcal{E}}_x}{d_{\min}^2(\tilde{x})/\bar{\mathcal{E}}_{\tilde{x}}} = \frac{\zeta_x}{\zeta_{\tilde{x}}}, \tag{1.279}$$

where both constellations are used to transmit $\bar{b}$ bits of information per dimension.

A coding gain of $\gamma = 1$ (0 dB) implies that the two systems perform equally. A positive gain (in dB) means that the constellation with data symbols $x$ outperforms the constellation with data symbols $\tilde{x}$. As an example, we compare the two constellations in Figures.3 and.33 and obtain

$$\gamma = \frac{\zeta_x(\text{8AMPM})}{\zeta_x(\text{8PSK})} = \frac{2/10}{\sin^2\left(\frac{\pi}{8}\right)} \approx 1.37 \ (1.4\ \text{dB}). \tag{1.280}$$

Signal constellations are often based on $N$-dimensional structures known as lattices. (A discussion of lattices appears in Chapter 8.) A lattice is a set of vectors in $N$-dimensional space that is closed under vector addition; that is, the sum of any two vectors is another vector in the set. A translation of a lattice produces a coset of the lattice. Most good signal constellations are chosen as subsets of cosets of lattices. The fundamental volume for a lattice measures the region around a point:

Definition 1.6.3 (Fundamental Volume) The fundamental volume $V(\Lambda)$ of a lattice $\Lambda$ (from which a signal constellation is constructed) is the volume of the decision region for any single point in the lattice. This decision region is also called a Voronoi Region of the lattice. The Voronoi Region of a lattice, $V(\Lambda)$, is to be distinguished from the Voronoi Region of the constellation, $V_x$, the latter being the union of $M$ of the former.

For example, an $M$-QAM constellation as $M \to \infty$ is a translated subset (coset) of the two-dimensional rectangular lattice $Z^2$, so $M$-QAM is a translation of $Z^2$ as $M \to \infty$. Similarly, as $M \to \infty$, the $M$-PAM constellation becomes a coset of the one-dimensional lattice $Z$.

The coding gain $\gamma$ of one constellation based on $x$ with lattice $\Lambda$ and volume $V(\Lambda)$ with respect to another constellation with $\tilde{x}$, $\tilde{\Lambda}$, and $V(\tilde{\Lambda})$ can be rewritten as

$$\gamma = \frac{d_{\min}^2(x)/V(\Lambda)^{2/N}}{d_{\min}^2(\tilde{x})/V(\tilde{\Lambda})^{2/N}} \cdot \frac{V(\Lambda)^{2/N}/\bar{\mathcal{E}}_x}{V(\tilde{\Lambda})^{2/N}/\bar{\mathcal{E}}_{\tilde{x}}} \tag{1.281}$$

$$= \gamma_f + \gamma_s \ \text{(dB)}. \tag{1.282}$$

The two quantities on the right in (1.282) are called the fundamental gain $\gamma_f$ and the shaping gain $\gamma_s$ respectively.

Definition 1.6.4 (Fundamental Gain) The fundamental gain $\gamma_f$ of a lattice, upon which a signal constellation is based, is

$$\gamma_f = \frac{d_{\min}^2(x)/V(\Lambda)^{2/N}}{d_{\min}^2(\tilde{x})/V(\tilde{\Lambda})^{2/N}}. \tag{1.283}$$

The fundamental gain measures the efficiency of the spacing of the points within a particular constellation per unit of fundamental volume surrounding each point.

Definition 1.6.5 (Shaping Gain) The shaping gain $\gamma_s$ of a signal constellation is defined as

$$\gamma_s = \frac{V(\Lambda)^{2/N}/\bar{\mathcal{E}}_x}{V(\tilde{\Lambda})^{2/N}/\bar{\mathcal{E}}_{\tilde{x}}}. \tag{1.284}$$

The shaping gain measures the efficiency of the shape of the boundary of a particular constellation in relation to the average energy per dimension required for the constellation. Using a continuous approximation, the designer can extend shaping gain to constellations with different numbers of points as

$$\gamma_s = \frac{V(\Lambda)^{2/N}\, 2^{2\bar{b}(x)}/\bar{\mathcal{E}}_x}{V(\tilde{\Lambda})^{2/N}\, 2^{2\bar{b}(\tilde{x})}/\bar{\mathcal{E}}_{\tilde{x}}}. \tag{1.285}$$

Peak-to-Average Power Ratio (PAR)

For practical system design, the peak power of a system may also need to be limited. This constraint can manifest itself in several different ways. For example, if the modulator uses a Digital-to-Analog Converter (or the demodulator an Analog-to-Digital Converter) with a finite number of bits (or finite dynamic range), then the signal peaks cannot be arbitrarily large. In other systems the channel or modulator/demodulator may include amplifiers or repeaters that saturate at high peak signal voltages. Yet another case arises in adjacent channels where crosstalk exists, and a high peak on one channel can couple into the other channel, causing an impulsive noise hit and an unexpected error in the adjacent system. Thus, the Peak-to-Average Power Ratio (PAR) is a measure of immunity to these important types of effects. The peak energy is:

Definition 1.6.6 (Peak Energy) The $N$-dimensional peak energy for any signal constellation is $\mathcal{E}_{\text{peak}}$:

$$\mathcal{E}_{\text{peak}} = \max_i \sum_{n=1}^{N} x_{in}^2. \tag{1.286}$$

The peak energy of a constellation should be distinguished from the peak squared energy of a signal $x(t)$, which is $\max_{i,t} |x_i(t)|^2$. This latter quantity is important in analog amplifier design, or equivalently in however the filters $\phi_n(t)$ are implemented. The peak energy of a constellation allows precise definition of the PAR:

Definition 1.6.7 (Peak-to-Average Power Ratio) The $N$-dimensional Peak-to-Average Power Ratio, $\mathrm{PAR}_x$, for an $N$-dimensional constellation is

$$\mathrm{PAR}_x = \frac{\mathcal{E}_{\text{peak}}}{\mathcal{E}_x}. \tag{1.287}$$

For example, 16SQ QAM has a PAR of 1.8 in two dimensions. For each of the one-dimensional 4-PAM constellations that constitute a 16SQ QAM constellation, the one-dimensional PAR is also 1.8. These two ratios need not be equal in general, however. For instance, for 32CR, the two-dimensional PAR is 34/20 = 1.7, while observation of a single dimension when 32CR is used gives a one-dimensional PAR of 25/(.75(5) + .25(25)) = 2.5. Typically, the peak squared signal energy is inevitably yet higher in QAM constellations and depends on the choice of $\phi(t)$.

1.6.4 Hexagonal Signal Constellations in 2 Dimensions

The most dense packing of regularly spaced points in two dimensions is the hexagonal lattice shown in Figure 1.45. The volume (area) of the decision region for each point is

$$V = 6\left(\frac{1}{2}\right)\left(\frac{d}{2}\right)\left(\frac{d}{\sqrt{3}}\right) = \frac{\sqrt{3}}{2}d^2. \tag{1.288}$$

If the minimum distance between any two points is $d$ in both constellations, then the fundamental gain of the hexagonal constellation with respect to the QAM constellation is

$$\gamma_f = \frac{d^2}{\frac{\sqrt{3}}{2}d^2} = \frac{2}{\sqrt{3}} = 0.625\ \text{dB}. \tag{1.289}$$

The encoder/detector for constellations based on the hexagonal lattice may be more complex than those for QAM.
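The PAR figures quoted above and the hexagonal fundamental gain in (1.289) can be checked numerically. The sketch below is illustrative only: it assumes the usual odd-integer grid with $d = 2$ for 16SQ QAM and builds 32CR as the $6 \times 6$ grid with its four corners removed; the helper names are not from the text.

```python
import math
from itertools import product


def peak_energy(points):
    """Largest squared norm over the constellation, as in (1.286)."""
    return max(sum(c * c for c in p) for p in points)


def average_energy(points):
    """Average squared norm, assuming equally likely symbols."""
    return sum(sum(c * c for c in p) for p in points) / len(points)


def par(points):
    """Peak-to-average power ratio, as in (1.287)."""
    return peak_energy(points) / average_energy(points)


def one_dim_par(points, axis=0):
    """PAR observed along a single coordinate of the constellation."""
    return par([(p[axis],) for p in points])


# 16SQ QAM on the grid {-3, -1, 1, 3}^2 (minimum distance d = 2).
QAM16 = list(product((-3, -1, 1, 3), repeat=2))

# 32CR: the 6 x 6 grid {-5, ..., 5}^2 with the four corner points removed.
CR32 = [p for p in product((-5, -3, -1, 1, 3, 5), repeat=2)
        if not (abs(p[0]) == 5 and abs(p[1]) == 5)]


def hexagonal_fundamental_gain_db():
    """Gain of the hexagonal lattice over the square lattice for equal d.

    With N = 2, V^(2/N) = V, and V_hex / V_square = sqrt(3) / 2.
    """
    return 10.0 * math.log10(2.0 / math.sqrt(3.0))


if __name__ == "__main__":
    print("16SQ QAM PAR (2-D, 1-D):", par(QAM16), one_dim_par(QAM16))
    print("32CR PAR (2-D, 1-D):", par(CR32), one_dim_par(CR32))
    print("hexagonal gain (dB):", hexagonal_fundamental_gain_db())
```

With $d = 2$ and $M = 32$, the average energy of this 32CR construction is 20, which also matches $\frac{d^2}{6}\left[\frac{31}{32}M - 1\right]$ for the QAM cross constellation.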

Figure 1.45: Hexagonal lattice.

Figure 1.46: Filtered AWGN channel.

1.7 Additive Self-Correlated Noise

In practice, additive noise is often Gaussian, but its power spectral density is not flat. Engineers often call this type of noise self-correlated or colored. The noise remains independent of the message signals but correlates with itself from time instant to time instant. The origins of colored noise are many. Receiver filtering effects, noise generated by other communications systems (crosstalk), and electromagnetic interference are all sources of self-correlated noise. A narrow-band transmission of a radio signal that somehow becomes noise for an unintended channel is another common example of self-correlated noise and is called RF noise (RF is an acronym for radio frequency).

Self-correlated Gaussian noise can significantly alter the performance of a receiver designed for white Gaussian noise. This section investigates the optimum detector for colored noise and also considers the performance loss when using a suboptimum detector designed for the AWGN case in the presence of Additive Correlated Gaussian Noise (ACGN).

This study is facilitated by first investigating the filtered one-shot AWGN channel in Subsection 1.7.1. Subsection 1.7.2 then finds the optimum detector for additive self-correlated Gaussian noise by adding a whitening filter that transforms the self-correlated noise channel into a filtered AWGN channel. Subsection 1.7.3 studies the vector channel, for which (for some unspecified reason) the noise has not been whitened, and describes the optimum detector given this vector channel. Finally, Subsection 1.7.4 studies the degradation that occurs when the noise correlation properties are not known for design of the receiver, and an optimum receiver for the AWGN is used instead.

1.7.1 The Filtered (One-Shot) AWGN Channel

The filtered AWGN channel is illustrated in Figure 1.46. The modulated signal $x(t)$ is filtered by $h(t)$ before the addition of the white Gaussian noise.
When $h(t) \ne \delta(t)$, the filtered signal set $\{\tilde{x}_i(t)\}$ may differ from the transmitted signal set $\{x_i(t)\}$. This transformation may change the probability of error as well as the structure of the optimal detector. This section still considers only one use of the channel with $M$ possible messages for transmission. Transmission over this type of channel can incur a significant penalty due to intersymbol interference between successively transmitted data symbols. In the one-shot case, however, analysis need not consider this intersymbol interference. Intersymbol interference is considered in Chapters 3, 4, and 5.

For any channel input signal $x_i(t)$, the corresponding filtered output equals $\tilde{x}_i(t) = h(t) * x_i(t)$. Decomposing $x_i(t)$ by an orthonormal basis set, $\tilde{x}_i(t)$ becomes

$$\tilde{x}_i(t) = h(t) * x_i(t) \tag{1.290}$$

$$= h(t) * \sum_{n=1}^{N} x_{in}\phi_n(t) \tag{1.291}$$

$$= \sum_{n=1}^{N} x_{in}\left\{h(t) * \phi_n(t)\right\} \tag{1.292}$$

$$= \sum_{n=1}^{N} x_{in}\varphi_n(t), \tag{1.293}$$

where

$$\varphi_n(t) = h(t) * \phi_n(t). \tag{1.294}$$

Note that:

(1) The set of $N$ functions $\{\varphi_n(t)\}_{n=1,\ldots,N}$ is not necessarily orthonormal.

(2) For the channel to convey any and all constellations of $M$ messages for the signal set $\{x_i(t)\}$, the basis set $\{\varphi_n(t)\}$ must be linearly independent.

The first observation can be easily proven by finding a counterexample, an exercise for the interested reader. The second observation emphasizes that if some dimensionality is lost by filtering, signals in the original signal set that differed only along the lost dimension(s) would appear identical at the channel output. For example, consider two distinct signals $x_k(t)$ and $x_j(t)$ whose filtered versions coincide:

$$\tilde{x}_k(t) - \tilde{x}_j(t) = \sum_{n=1}^{N}\left(x_{kn} - x_{jn}\right)\varphi_n(t) = 0. \tag{1.295}$$

If the set $\{\varphi_n(t)\}$ is linearly independent, then the sum in (1.295) must be nonzero, a contradiction to (1.295). If this set of vectors is linearly dependent, then (1.295) can be satisfied, resulting in the possibility of ambiguous transmitted signals. Failure to meet the linear independence condition could mandate a redesign of the modulated signal set or a rate reduction (decrease of $M$). The dimensionality loss and ensuing redesign of $\{x_i(t)\}_{i=0:M-1}$ is studied in Chapters 4 and 5. This chapter assumes such dimensionality loss does not occur.

If the set $\{\varphi_n(t)\}$ is linearly independent, then the Gram-Schmidt procedure in Appendix A generates an orthonormal set of $N$ basis functions $\{\psi_n(t)\}_{n=1,\ldots,N}$ from $\{\varphi_n(t)\}_{n=1,\ldots,N}$. A new signal constellation $\{\tilde{x}_i\}_{i=0:M-1}$ can be computed from the filtered signal set $\{\tilde{x}_i(t)\}$ using the basis set $\{\psi_n(t)\}$:

$$\tilde{x}_{in} = \int \tilde{x}_i(t)\psi_n(t)\,dt = \langle\tilde{x}_i(t), \psi_n(t)\rangle. \tag{1.296}$$

Using the previous analysis for AWGN, a tight upper bound on message error probability is still given by

$$P_e \le N_e\, Q\left[\frac{d_{\min}}{2\sigma}\right], \tag{1.297}$$

where $d_{\min}$ is the minimum Euclidean distance between any two points in the filtered signal constellation $\{\tilde{x}_i\}_{i=0:M-1}$.
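The procedure above (filter the basis, orthonormalize with Gram-Schmidt, project as in (1.296), then measure $d_{\min}$) can be sketched in discrete time. This is a minimal illustration under stated assumptions: waveforms are short sample vectors, the integral inner product is replaced by a sum, and the filtered basis `PHI_FILTERED` and the symbol set `SYMBOLS` are made-up values rather than anything from the text.

```python
import math
from itertools import combinations, product


def inner(u, v):
    """Discrete stand-in for the inner product integral in (1.296)."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize vectors; fail if they are linearly dependent."""
    basis = []
    for v in vectors:
        w = list(v)
        for psi in basis:
            c = inner(w, psi)
            w = [wi - c * pi for wi, pi in zip(w, psi)]
        norm = math.sqrt(inner(w, w))
        if norm < tol:
            raise ValueError("filtered basis is linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


def filtered_constellation(symbols, filtered_basis):
    """Coordinates of each filtered signal in the orthonormal basis."""
    psi = gram_schmidt(filtered_basis)
    length = len(filtered_basis[0])
    points = []
    for x in symbols:
        # Filtered waveform: sum over n of x_n * filtered basis function n.
        waveform = [sum(xn * fb[k] for xn, fb in zip(x, filtered_basis))
                    for k in range(length)]
        points.append([inner(waveform, p) for p in psi])
    return points


def min_distance(points):
    return min(math.dist(p, q) for p, q in combinations(points, 2))


# Hypothetical filtered basis (already h * phi_n, sampled); not orthonormal.
PHI_FILTERED = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
# Four-point constellation with d_min = 2 at the channel input.
SYMBOLS = list(product((-1.0, 1.0), repeat=2))

if __name__ == "__main__":
    print("input d_min:", min_distance(SYMBOLS))
    print("filtered d_min:",
          min_distance(filtered_constellation(SYMBOLS, PHI_FILTERED)))
```

Because the projection onto an orthonormal basis preserves distances, the $d_{\min}$ computed from the coordinates equals the distance between the filtered waveforms themselves; a linearly dependent filtered basis is rejected, mirroring the condition in observation (2).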
The matched filter implementation of the demodulator/detector does not need to compute $\{\psi_n(t)\}_{n=1,\ldots,N}$ for the signal detector, as shown in Figure 1.47. (For reference, the reader can reexamine the detector for the unfiltered constellation in Figure.4.) In the analysis of the filtered AWGN, the transmitted average energy $\mathcal{E}_x$ is still measured at the channel input. Thus, while $\mathcal{E}_{\tilde{x}}$ can be computed, its physical significance can differ from that of $\mathcal{E}_x$. If, as is often the case, the energy constraint is at the input to the channel, then the comparison of various signaling alternatives, as performed earlier in this chapter, could change depending on the specific filter $h(t)$.

Figure 1.47: Modified signal detector (each filter can become $L_x$ filters for MIMO).

1.7.2 Optimum Detection in the Presence of Self-Correlated Noise

The Additive Self-Correlated Gaussian Noise (ACGN) channel is illustrated in Figure 1.48. The only change with respect to Figure.9 is that the autocorrelation function of the additive noise, $r_n(\tau)$, need not equal $\frac{N_0}{2}\delta(\tau)$. Simplification of the ensuing development defines and uses a normalized noise autocorrelation function

$$\bar{r}_n(\tau) = \frac{r_n(\tau)}{N_0/2}. \tag{1.298}$$

The power spectral density of the unnormalized noise is then

$$S_n(f) = \frac{N_0}{2}\bar{S}_n(f), \tag{1.299}$$

where $\bar{S}_n(f)$ is the Fourier Transform of $\bar{r}_n(\tau)$.

Figure 1.48: ACGN channel.

The Whitening Filter

The whitening-filter analysis of the ACGN channel whitens the colored noise with a whitening filter $g(t)$, and then uses the results of the previous section for the filtered AWGN channel where the filter $h(t) = g(t)$. To ensure no loss of information when filtering the noisy received signal by $g(t)$, the filter $g(t)$ should be invertible. By the reversibility theorem, the receiver can use an optimal detector for this newly generated filtered AWGN without performance loss. Actually, the condition on invertibility of $g(t)$ is sufficient but not necessary. For a particular signal set, a necessary condition is that the filter be invertible over that signal set. For the filter to be invertible on any possible signal set, $g(t)$ must necessarily be invertible. This subtle point is often overlooked by most works on this subject.

For $g(t)$ to whiten the noise,

$$\left[\bar{S}_n(f)\right]^{-1} = |G(f)|^2. \tag{1.300}$$

In general, many filters $G(f)$ may satisfy Equation (1.300), but only some of the filters possess realizable inverses. To ensure the existence of a realizable inverse, $\bar{S}_n(f)$ must satisfy the Paley-Wiener Criterion.

Theorem 1.7.1 (Paley-Wiener Criterion) If

$$\int_{-\infty}^{\infty}\frac{\left|\ln \bar{S}_n(f)\right|}{1+f^2}\,df < \infty, \tag{1.301}$$

then there exists a $G(f)$ satisfying (1.300) with a realizable inverse. (Thus the filter $g(t)$ is a 1-to-1 mapping.)

If the Paley-Wiener criterion were violated by a noise signal, then it is possible to design transmission systems with infinite data rate (that is, when $\bar{S}_n(f) = 0$ over a given bandwidth) or to design transmission systems for each band over which Paley-Wiener is satisfied (that is, the bands where noise is essentially of finite energy). This subsection's analysis always assumes Equation (1.301) is satisfied.⁷
With a 1-to-1 $g(t)$ that satisfies (1.300), the ACGN channel converts into an equivalent filtered white Gaussian noise channel as shown in Figure 1.46, replacing $h(t)$ with $g(t)$. The performance analysis of ACGN is identical to that derived for the filtered AWGN channel in Subsection 1.7.1. A further refinement handles the filtered ACGN channel by whitening the noise and then analyzing the filtered AWGN with $h(t)$ replaced by $h(t) * g(t)$.

Analytic continuation of $\bar{S}_n(s)$ determines an invertible $g(t)$:

$$\bar{S}_n(s) = \bar{S}_n\left(f = \frac{s}{2\pi j}\right), \tag{1.302}$$

where $\bar{S}_n(s)$ can be canonically (and uniquely) factored into causal (and causally invertible) and anticausal (and anticausally invertible) parts as

$$\bar{S}_n(s) = \bar{S}_n^{+}(s)\,\bar{S}_n^{-}(s), \tag{1.303}$$

⁷ Chapters 4 and 5 expand to the correct form of transmission that should be used when (1.301) is not satisfied.

where

$$\bar{S}_n^{+}(s) = \bar{S}_n^{-}(-s). \tag{1.304}$$

If $\bar{S}_n(s)$ is rational, then $\bar{S}_n^{+}(s)$ is minimum phase, i.e., all poles and zeros of $\bar{S}_n^{+}(s)$ are in the left half plane. The filter $g(t)$ is then given by

$$g(t) = \mathcal{L}^{-1}\left\{\frac{1}{\bar{S}_n^{+}(s)}\right\}, \tag{1.305}$$

where $\mathcal{L}^{-1}$ is the inverse Laplace Transform. The matched filter $g(-t)$ is given by $g(-t) = \frac{1}{2\pi j}\int G(-s)e^{st}\,ds$, or equivalently by

$$g(-t) = \mathcal{L}^{-1}\left\{\frac{1}{\bar{S}_n^{-}(s)}\right\}. \tag{1.306}$$

$g(-t)$ is anticausal and cannot be realized. Practical receivers instead realize $g(T - t)$, where $T$ is sufficiently large to ensure causality.

In general $g(t)$ may be difficult to implement by this method; however, the next subsection considers a discrete equivalent of whitening that is more straightforward to implement in practice. When the noise is complex (see Chapter 2), Equation (1.304) generalizes to

$$\bar{S}_n^{+}(s) = \left[\bar{S}_n^{-}(-s^*)\right]^*. \tag{1.307}$$

Whitening of the MIMO channel will typically consist of an inverted matrix-square-root constant matrix to remove any spatial correlation between noises of different parallel channels, followed by a further scalar whitening in time-frequency as in this subsection. (Practically, the same noise is hitting the different antennas or wires, and this is spatially removed by the square root, with the remaining core noise then whitened. More generally, this whitening can be handled in the practical discrete-time cases as shown in Chapters 4 and 5.)

1.7.3 The Vector Self-Correlated Gaussian Noise Channel

This subsection considers a discrete equivalent of the ACGN,

$$y = x + n, \tag{1.308}$$

where the autocorrelation matrix of the noise vector $n$ is

$$E\left[nn^*\right] = R_n = \bar{R}_n\sigma^2. \tag{1.309}$$

Both $R_n$ and $\bar{R}_n$ are positive definite matrices. This discrete ACGN channel can often be substituted for the continuous ACGN channel. MIMO applies here directly, with the correlation of noise now introducing a dependency of sorts between the parallel channels. All analysis proceeds identically, MIMO or not. The discrete noise vector can be whitened, transforming $\bar{R}_n$ into an identity matrix.
The discrete equivalent to whitening $y(t)$ by $g(t)$ is a matrix multiplication. The $N \times N$ whitening matrix in the discrete case corresponds to the whitening filter $g(t)$ in the continuous case. Cholesky factorization determines the invertible whitening transformation according to (see Appendix A of Chapter 3):

$$\bar{R}_n = \bar{R}^{1/2}\bar{R}^{*/2}, \tag{1.310}$$

where $\bar{R}^{1/2}$ is lower triangular and $\bar{R}^{*/2}$ is upper triangular. These matrices constitute the matrix equivalent of a square root, and both matrices are invertible. Noting the definitions

$$\left[\bar{R}^{1/2}\right]^{-1} = \bar{R}^{-1/2} \tag{1.311}$$

and

$$\left[\bar{R}^{*/2}\right]^{-1} = \bar{R}^{-*/2}, \tag{1.312}$$

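Figure 1.49: Equivalent signal constellation for Example 1.7.1.

To whiten $n$, the receiver passes $y$ through the matrix multiply $\bar{R}^{-1/2}$:

$$\tilde{y} = \bar{R}^{-1/2}y = \bar{R}^{-1/2}x + \bar{R}^{-1/2}n = \tilde{x} + \tilde{n}. \tag{1.313}$$

The autocorrelation matrix for $\tilde{n}$ is

$$E\left[\tilde{n}\tilde{n}^*\right] = \bar{R}^{-1/2}E\left[nn^*\right]\bar{R}^{-*/2} = \bar{R}^{-1/2}\left(\bar{R}^{1/2}\bar{R}^{*/2}\sigma^2\right)\bar{R}^{-*/2} = \sigma^2 I. \tag{1.314}$$

Thus, the covariance matrix of the transformed noise $\tilde{n}$ is the same as the covariance matrix of the AWGN vector. By the theorem of reversibility, no information is lost in such a transformation.

EXAMPLE 1.7.1 (QPSK with correlated noise) For the example shown in Figure., suppose that the noise is colored with correlation matrix

$$R_n = \sigma^2\begin{bmatrix} 1 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 1 \end{bmatrix}. \tag{1.315}$$

Then

$$\bar{R}^{1/2} = \begin{bmatrix} 1 & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \tag{1.316}$$

and

$$\bar{R}^{*/2} = \begin{bmatrix} 1 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} \end{bmatrix}. \tag{1.317}$$

From (1.316) and (1.317),

$$\bar{R}^{-1/2} = \begin{bmatrix} 1 & 0 \\ -1 & \sqrt{2} \end{bmatrix} \tag{1.318}$$

and

$$\bar{R}^{-*/2} = \begin{bmatrix} 1 & -1 \\ 0 & \sqrt{2} \end{bmatrix}. \tag{1.319}$$

The signal constellation after the whitening filter becomes

$$\tilde{x}_0 = \bar{R}^{-1/2}x_0 = \begin{bmatrix} 1 & 0 \\ -1 & \sqrt{2} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ (\sqrt{2}-1) \end{bmatrix}, \tag{1.320}$$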

and similarly $\tilde{x}_1 = \begin{bmatrix} -1 \\ (\sqrt{2}+1) \end{bmatrix}$, $\tilde{x}_2 = \begin{bmatrix} -1 \\ -(\sqrt{2}-1) \end{bmatrix}$, and $\tilde{x}_3 = \begin{bmatrix} 1 \\ -(\sqrt{2}+1) \end{bmatrix}$. This new constellation forms a parallelogram in two dimensions, where the minimum distance is now along the shorter diagonal (between $\tilde{x}_0$ and $\tilde{x}_2$ for the points as written here), rather than along the sides, and $d_{\min} = 2.164 > 2$. This new constellation appears in Figure 1.49. Thus, the optimum detector for this channel with self-correlated Gaussian noise has larger minimum distance than for the white noise case, illustrating the important fact that having correlated noise is sometimes advantageous.

The example shows that correlated noise may lead to improved performance measured with respect to the same channel and signal constellation with white noise of the same average energy. Nevertheless, the autocorrelation matrix of the noise is often not known in implementation, or it may vary from channel use to channel use. Then, the detector is designed as if white noise were present anyway, and there is a performance loss with respect to the optimum detector. The next subsection deals with the calculation of this performance loss.

1.7.4 Performance of Suboptimal Detection with Self-Correlated Noise

A detector designed for the AWGN channel is obviously suboptimum for the ACGN channel, but is often used anyway, as the correlation properties of the noise may be hard to know in the design stage. In this case, the detector performance will be reduced with respect to optimum. Computation of the amount by which performance is reduced uses the error event vectors

$$\epsilon_{ij} = \frac{x_i - x_j}{\|x_i - x_j\|}. \tag{1.321}$$

The component of the additive noise vector along an error event vector is $\langle n, \epsilon_{ij}\rangle$. The variance of the noise along this vector is $\sigma_{ij}^2 = E\left\{\langle n, \epsilon_{ij}\rangle^2\right\}$. Then, the NNUB becomes

$$P_e \le N_e\, Q\left[\min_{i\ne j}\left\{\frac{\|x_i - x_j\|}{2\sigma_{ij}}\right\}\right]. \tag{1.322}$$

For Example 1.7.1, the worst-case argument of the Q-function in (1.322) is $1/\sigma$, which represents a factor of $(2.164/2)^2 = .7$ dB loss with respect to optimum. This loss varies with rotation of the signal set, but not translation. If the signal constellation in Example 1.7.1 were rotated by 45°, as in Figure.7, then the increase in noise variance is $(1 + 1/\sqrt{2})/1 = 2.3$ dB, but $d_{\min}$ remains at 2 for this sub-optimum detector, so performance is 3 dB inferior to the optimum detector for the unrotated constellation. However, the optimum receiver for the rotated case would also have changed to have 3 dB worse performance for this rotation, so in this case the optimum rotated and sub-optimum rotated receivers have the same performance.
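The numbers in Example 1.7.1 and in the suboptimal-detector discussion can be reproduced with a short calculation. The sketch below is a hedged illustration: it assumes the normalized noise correlation has off-diagonal entry $1/\sqrt{2}$ and that the QPSK points are $(\pm 1, \pm 1)$, both inferred from the 2.3 dB and 3 dB figures quoted in the text; all function names are illustrative.

```python
import math
from itertools import combinations

# Assumed values: normalized noise correlation and QPSK points (+-1, +-1).
RHO = 1.0 / math.sqrt(2.0)
R_BAR = [[1.0, RHO], [RHO, 1.0]]
QPSK = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]


def cholesky_2x2(r):
    """Lower-triangular square root L with r = L L^T (2 x 2 case only)."""
    l11 = math.sqrt(r[0][0])
    l21 = r[1][0] / l11
    l22 = math.sqrt(r[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def invert_lower_2x2(l):
    """Inverse of a 2 x 2 lower-triangular matrix."""
    return [[1.0 / l[0][0], 0.0],
            [-l[1][0] / (l[0][0] * l[1][1]), 1.0 / l[1][1]]]


def matvec(a, x):
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]


def min_distance(points):
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def to_db(power_ratio):
    return 10.0 * math.log10(power_ratio)


def noise_variance_along(r, direction):
    """Normalized noise variance along a (not necessarily unit) direction."""
    norm = math.hypot(direction[0], direction[1])
    e = [c / norm for c in direction]
    re = matvec(r, e)
    return e[0] * re[0] + e[1] * re[1]


# Whitening matrix and whitened constellation, as in (1.313).
WHITENER = invert_lower_2x2(cholesky_2x2(R_BAR))
WHITENED = [matvec(WHITENER, x) for x in QPSK]

D_OPT = min_distance(WHITENED)
# Loss of the white-noise detector on the unrotated constellation.
LOSS_UNROTATED_DB = to_db((D_OPT / min_distance(QPSK)) ** 2)
# After a 45-degree rotation the nearest-neighbor error directions lie
# along (1, 1) and (1, -1); the larger noise variance is along (1, 1).
VARIANCE_INCREASE_DB = to_db(noise_variance_along(R_BAR, [1.0, 1.0]))

if __name__ == "__main__":
    print("whitened points:", WHITENED)
    print("optimum d_min:", round(D_OPT, 3))
    print("unrotated loss (dB):", round(LOSS_UNROTATED_DB, 2))
    print("rotated variance increase (dB):", round(VARIANCE_INCREASE_DB, 2))
```

Under these assumptions the two effects add to $10\log_{10} 2 \approx 3$ dB, consistent with the 3 dB figure stated for the rotated constellation.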

Chapter 1 Exercises

1.1 Our First Constellation.

a. Show that the following two basis functions are orthonormal. ( pts)

$$\varphi_1(t) = \begin{cases} \sqrt{2}\cos(2\pi t) & \text{if } t \in [0,1] \\ 0 & \text{otherwise} \end{cases} \qquad \varphi_2(t) = \begin{cases} \sqrt{2}\sin(2\pi t) & \text{if } t \in [0,1] \\ 0 & \text{otherwise} \end{cases}$$

b. Consider the following modulated waveforms, each equal to the given expression if $t \in [0,1]$ and 0 otherwise:

$$x_0(t) = \sqrt{2}\left(\cos(2\pi t) + \sin(2\pi t)\right)$$
$$x_1(t) = \sqrt{2}\left(\cos(2\pi t) + 3\sin(2\pi t)\right)$$
$$x_2(t) = \sqrt{2}\left(3\cos(2\pi t) + \sin(2\pi t)\right)$$
$$x_3(t) = \sqrt{2}\left(3\cos(2\pi t) + 3\sin(2\pi t)\right)$$
$$x_4(t) = \sqrt{2}\left(\cos(2\pi t) - \sin(2\pi t)\right)$$
$$x_5(t) = \sqrt{2}\left(\cos(2\pi t) - 3\sin(2\pi t)\right)$$
$$x_6(t) = \sqrt{2}\left(3\cos(2\pi t) - \sin(2\pi t)\right)$$
$$x_7(t) = \sqrt{2}\left(3\cos(2\pi t) - 3\sin(2\pi t)\right)$$
$$x_{i+8}(t) = -x_i(t), \quad i = 0, \ldots, 7$$

Draw the constellation points for these waveforms using the basis functions of (a). ( pts)

c. Compute $\mathcal{E}_x$ and $\bar{\mathcal{E}}_x$ ($\bar{\mathcal{E}}_x = \mathcal{E}_x/N$), where $N$ is the number of dimensions,

(i) for the case where all signals are equally likely. ( pts)

(ii) for the case where ( pts)

$$p(x_0) = p(x_4) = p(x_8) = p(x_{12}) = \frac{1}{8}$$

and

$$p(x_i) = \frac{1}{24}, \quad i = 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15.$$

d. Let

$$y_i(t) = x_i(t) + 4\varphi_3(t)$$

where

$$\varphi_3(t) = \begin{cases} 1 & \text{if } t \in [0,1] \\ 0 & \text{otherwise.} \end{cases}$$

Compute $\mathcal{E}_y$ for the case where all signals are equally likely. ( pts)

84 . Inner Products. Consider the following signals: x 0 (t) = x (t) = x (t) = { cos( πt + π 6 ) if tɛ [0, ] 0 otherwise { cos( πt + 5π 6 ) if tɛ [0, ] 0 otherwise { cos( πt + 3π ) if tɛ [0, ] 0 otherwise a. Find a set of orthonormal basis functions for this signal set. Show that they are orthonormal. Hint: Use the identity for cos(a + b) = cos(a) cos(b) sin(a) sin(b). (4 pts) b. Find the data symbols corresponding to the signals above for the basis functions you found in (a). (3 pts) c. Find the following inner products: (3 pts) (i) < x 0 (t), x 0 (t) > (ii) < x 0 (t), x (t) > (iii) < x 0 (t), x (t) >.3 Multiple sets of basis functions. Consider the following two orthonormal basis functions: n c ( t) S n N 0 ( f ) = S ( f ) n x( t) + ŷ( t) Figure.50: Basis functions. a. Use the basis functions given above to find the modulated waveforms u(t) and v(t) given the data symbols u = [ ] and v = [ ]. It is sufficient to draw u(t) and v(t). ( pts) b. For the same u(t) and v(t), a different set of two orthonormal basis functions is employed for which u = [ 0] produces u(t). Draw the new basis functions and find the v which produces v(t). (3 pts).4 Minimal orthonormalization with MALAB. Each column of the matrix A given below is a data symbol that is used to construct its corresponding modulated waveform from the set of orthonormal basis functions {φ (t), φ (t),..., φ 6 (t)}. he set of modulated waveforms described by the columns of A can be represented with a smaller number of basis functions. 84

85 A = [a 0 a... a 7 ] (.33) = (.34) he transmitted signals a i (t) are represented (with a superscript of * meaning matrix or vector transpose) as φ (t) a i (t) = a φ (t) i. (.35) φ 6 (t) hus, each row of A(t) is a possible transmitted signal. A(t) = A ϕ(t). (.36) a. Use MALAB to find an orthonormal basis for the columns of A. Record the matrix of basis vectors. he MALAB commands help and orth will be useful. In particular, if one executes Q = orth(a) in matlab, a 6 3 orthogonal matrix Q is produced such that Q Q = I and A = [A Q] Q. he columns of Q can be thought of as a new basis thus try writing A(t) and interpreting to get a new set of basis functions and description of the 8 possible transmit waveforms. Note that help orth will give a summary of the orth command. o enter the matrix B in matlab (for example) shown below, simply type B=[ ; 3 4]; ( pts) [ ] B = (.37) 3 4 b. How many basis functions are actually needed to represent our signal set? What are the new basis functions in terms of {φ (t), φ (t),..., φ 6 (t)}?( pts) c. Find the new matrix Â which gives the data symbol representation for the original modulated waveforms using the smaller set of basis functions found in (b). Â will have 8 columns, one for each data symbol. he number of rows in Â will be the number of basis functions you found in (b). ( pts).5 Decision rules for binary channels. a. he Binary Symmetric Channel (BSC) has binary (0 or ) inputs and outputs. It outputs each bit correctly with probability p and incorrectly with probability p. Assume 0 and are equally likely inputs. State the MAP and ML decision rules for the BSC when p <. How are the decision rules different when p >? (5 pts) b. he Binary Erasure Channel (BEC) has binary inputs as with the BSC. However there are three possible outputs. Given an input of 0, the output is 0 with probability p and with probability p. Given an input of, the output is with probability p and with probability p. Assume 0 and are equally likely inputs. 
State the MAP and ML decision rules for the BEC when p < p <. How are the decision rules different when p < p <? (5 pts).6 Minimax [Wesel 994]. 85

86 -p 0 0 p p -p Figure.5: Binary Symmetric Channel(BSC). p 0 0 p p p Figure.5: Binary Erasure Channel (BEC). Consider a -dimensional vector channel y = x + n where x = ± and n is Gaussian noise with σ =. he Maximum-Likelihood (ML) Receiver which is minimax, has decision regions: D ML, = [0, ) and D ML, = (, 0) So if y is in D ML, we decode y as +; in D ML,, as. Consider another receiver, R, where the decision regions are: D R, = [, ) and D R, = (, ) 86

87 a. Find P e,ml and P e,r as a function of px() = p for values of p in the interval [0, ]. On the same graph, plot P e,ml vs. p and P e,r vs. p. ( pts) b. Find max p P e,ml and max p P e,r. Are your results consistent with the Minimax heorem? ( pts) c. For what value of p is D R the MAP decision rule? ( pt) Note: For this problem you will need to use the Q( ) function discussed in Appendix B. Here are some relevant values of Q( ). x Q(x)

88 .7 Irrelevancy/Decision Regions. (From Wozencraft and Jacobs) a. Consider the channel in Figure a where x, n, and n are independent binary random variables. All the additions shown below are modulo two. (Equivalently, the additions may be considered xor s.) n + y x n + y n + n + y 3 (i) Given only y, is y 3 relevant? ( pt) (ii) Given y and y, is y 3 relevant? ( pt) For the rest of the problem, consider the channel in Figure a, n + y x n + y One of the two signals x 0 = or x = is transmitted over this channel. he noise random variables n and n are statistically independent of the transmitted signal x and of each other. 88

Their density functions are

$$p_{n_1}(n) = p_{n_2}(n) = \frac{1}{2} e^{-|n|} . \qquad (1.328)$$

90 b. Given y only, is y relevant? ( pt) c. Prove that the optimum decision regions for equally likely messages are shown in Figure c, (3 pts) y Either Choice Choose x - y Choose - x 0 Either Choice d. A receiver chooses x if and only if (y +y ) > 0. Is this receiver optimum for equally likely messages? What is the probability of error? (Hint: P e = P {y + y > 0/x = }px( ) + P {y + y /x = }px() and use symmetry. Recall the probability density function of the sum of random variables is the convolution of their individual probability density functions) (4 pts) e. Prove that the optimum decision regions are modified as indicated in Figure e when P r{x = x } > /. ( pts) y - Choose x - 0 Choose x 45 o y 90

91 .8 Optimum Receiver. (From Wozencraft and Jacobs) Suppose one of M equiprobable signals x i (t), i = 0,..., M is to be transmitted during a period of time over an AWGN channel. Moreover, each signal is identical to all others in the subinterval [t, t ], where 0 < t < t <. a. Show that the optimum receiver may ignore the subinterval [t, t ]. ( pts) b. Equivalently, show that if x 0,..., x M all have the same projection in one dimension, then this dimension may be ignored.( pts) c. Does this result necessarily hold true if the noise is Gaussian but not white? Explain.( pts).9 Receiver Noise (use MALAB for all necessary calculations - courtesy S. Li, 005. Each column of A given below is a data symbol that is used to construct its corresponding modulated waveform from a set of orthonormal basis functions (assume all messages are equally likely): Φ(t) = [ φ (t) φ (t) φ 3 (t) φ 4 (t) φ 5 (t) φ 6 (t) ]. he matrix A is given by A = (.39) so that A noise vector n = n n n3 n4 n5 n6 x(t) = Φ(t)A = [x 0 (t) x (t)... x 7 (t)]. (.330) is added to the symbol vector x, such that y(t) = Φ(t) (x + n) where n...n 6 are independent, with n k = ± with equal probability. he transmitted waveform y(t) is demodulated using the basis detector of Figure??. his problem examines the signal-to-noise ratio of the demodulated vector y = x + n with σ = E(n ) a. Find Ēx, σ, and SNR, Ēx/σ if all messages are equally likely. ( pts) b. Find the minimal number of basis vectors and new matrix Â as in Problem.4, and calculate the new ε x, σ, and SNR. (4 points) c. Let the new vector be ỹ = x + ñ, and discuss if the conversion from y to ỹ is invariant (namely, if P e is affected by the conversion matrix). Compare the detectors for parts a and b. ( points) d. Compare b, ε x with the previous system. Is the new system superior? Why or why not? ( pts) e. he new system now has three unused dimensions, we would like to send 8 more messages by constructing a big matrix Ā, as follows: Ā = [ Â 0 0 Â ] 9

92 ϕ θ ϕ L Figure.53: A Signal Constellation Compare b, ε x with the original 6-dimensional system, and the 3-dimensional system in b). (4 pts).0 ilt. Consider the signal set shown in Figure.53 with an AWGN channel and let σ = 0.. a. Does P e depend on L and θ? ( pt) b. Find the nearest neighbor union bound on P e for the ML detector assuming px(i) = 9 i. ( pts) c. Find P e exactly using the assumptions of the previous part. How far off was the NNUB? (5 pts) d. Suppose we have a minimum energy constraint on the signal constellation. How would we change the constellation of this problem without changing the P e? How does θ affect the constellation energy? ( pts). Parseval. Consider binary signaling on an AWGN σ = 0.04 with ML detection for the following signal set. (Hint: consider various ways of computing d min.) x 0 (t) = sinc (t) x (t) = sinc (t) cos(4πt) Determine the exact P e assuming that the two input signals are equally likely. (5 pts). Disk storage channel. Binary data storage with a thin-film disk can be approximated by an input-dependent additive white Gaussian noise channel where the noise n has a variance dependent on the transmitted (stored) input. he noise has the following input dependent density: e n σ if x = πσ p(n) = e n σ 0 if x = 0 πσ 0 and σ = 3σ 0. he channel inputs are equally-likely. 9

93 a. For either input, the output can take on any real value. On the same graph, plot the two possible output probability density functions (pdf s). i.e. Plot the output pdf for x = 0 and the output pdf for x =. Indicate (qualitatively) the decision regions on your graph. ( pts) b. Determine the optimal receiver in terms of σ and σ 0. (3 pts) c. Find σ 0 and σ if the SNR is 5 db. SNR is defined as d. Determine P e when SNR = 5 db. (3 pts) Ex = (σ 0 +σ ) σ0 +σ. ( pt) e. What happens as σ 0 0? You may restrict your attention to the physically reasonable case σ where σ is a fixed finite value and σ 0 0. ( pt).3 Rotation with correlated noise. A two dimensional vector channel y = x + n has correlated gaussian noise (that is the noise is not white and so not independent in each dimension) such that E[n ] = E[n ] = 0, E[n ] = E[n ] = 0., and E[n n ] = n is along the horizontal axis and n is along the vertical axis. a. Suppose we use the constellation in Figure.54 with θ = 45 and d =. (i.e. x = (, ) and x = (, )) Find the mean and mean square values of the noise projected on the line connecting the two constellation points. Note this value more generally is a function of θ when noise is not white. ( pts) x x d θ Figure.54: Constellation b. Note that the noise projected on the line in the previous part is Gaussian. Find P e for the ML detector. Assume your detector was designed for uncorrelated noise. ( pts) c. Fixing d =, find θ to minimize the ML detector P e and give the corresponding P e. You may continue to assume that the receiver is designed for uncorrelated noise. ( pts) d. Could your detector in part a be improved by taking advantage of the fact that the noise is correlated? ( pt).4 Hybrid QAM. Consider the 64 QAM constellation with d= (see Figure.55.): he 3 hybrid QAM ( ) is obtained by taking one of two points of the constellation. his problem investigates the properties of such a constellation. Assume all points are equally likely and the channel is an AWGN. 
a. Compute the energy Ex of the 64 QAM and the 3 hybrid QAM constellations. Compare your results. ( pts) b. Find the NNUB for the probability of error for the 64 QAM and 3 hybrid QAM constellations. Which has lower P e? Why? (3 pts) 93

94 Figure.55: 3 SQ embedded in 64 SQ QAM Constellation c. What is d min for a 3 Cross QAM constellation having the same energy? ( pt) d. Find the NNUB for the probability of error for the 3 Cross QAM constellation. Compare with the 3 hybrid QAM constellation. Which one performs better? Why? ( pts) e. Compute the figure of merit for both 3 QAM constellations. Is your result consistent with the one of (d)? ( pts).5 ernary Amplitude Modulation. Consider the general case of the 3-D AM constellation for which the data symbols are, ( d (x l, x m, x n ) = (l M 3 ), d (m M 3 ), d ) (n M 3 ) with l =,,... M 3, m =,,... M 3, n =,,... M 3. Assume that M 3 is an even integer. a. Show that the energy of this constellation is ( pts) Ex = M 3M 3 3 M l= x l (.33) b. Now show that (3 pts) Ex = d 4 (M 3 ) c. Assuming an AWGN channel with variance σ, find the NNUB for P e and P e. (3 pts) d. Find b and b. ( pt) e. Find Ex and the energy per bit E b. ( pt) f. For an equal number of bits per dimension b = b N, find the figure of merit for PAM, QAM and AM constellations with appropriate sizes of M. Compare your results. ( pts) 94

95 .6 Equivalency of rectangular-lattice constellations. Consider an AWGN system with a SNR = Ēx σ of db, a target probability of error P e = 0 6, and a symbol rate = 8 KHz. a. Find the maximum data rate R = b that can be transmitted for (i) PAM ( pt) (ii) QAM ( pt) (iii) AM ( pt) b. What is the NNUB normalized probability of error P e for the systems used in (a).( pts) c. For the rest of the problem we will only consider QAM systems. Suppose that the desired data rate is 40 Kbps. What is the transmit power needed to maintain the same probability of error? he SNR is no longer given as db. ( pts) d. Suppose now that the SNR was increased to 8 db. What is the highest data rate that can be reliably sent at the same probability of error 0 6? ( pt).7 Frequency separation in FSK. (Adapted from Wozencraft & Jacobs.) Consider the following two signals used in a Frequency Shift Key communications system over an AWGN channel. { Ex x 0 (t) = cos(πf 0 (t)) if t [0, ] 0 otherwise { Ex x (t) = cos(π(f 0 + )t) if t [0, ] 0 otherwise a. Find P e if = 0 4. ( pts) = 00µs f 0 = 0 5 Hz σ = 0.0 Ex = 0.3 b. Find the smallest such that the same P e found in part (a) is maintained. What type of constellation is this? (3 pts).8 Pattern Recognition. In this problem a simple pattern recognition scheme, based on optimum detectors is investigated. he patterns considered consist of a square divided into four smaller squares, as shown in Figure.56, Figure.56: Sample pattern 95

96 Figure.57: Examples of patterns considered Each square may have two possible intensities, black or white. he class of patterns studied will consist of those having two black squares, and two white squares. For example, some of these patterns are as shown in Figure.57, Each pattern can be encoded into a vector x = [x x x 3 x 4 ] where each component indicates the intensity of a small square according to the following rule, Black square x i = White square x i = For a given pattern, a set of four sensors take measurements at the center of each small square and outputs y = [y y y 3 y 4 ], y = x + n (.33) Where n = [n n n 3 n 4 ] is thermal noise (White Gaussian noise) introduced by the sensors. he goal of the problem is to minimize the probability of error for this particular case of pattern recognition. a. What is the total number of possible patterns? ( pt) b. Write the optimum decision rule for deciding which pattern is being observed. Draw the corresponding signal detector. Assume each pattern is equally likely. (3 pts) c. Find the union bound for the probability of error P e. ( pts) d. Assuming that nearest neighbours are at minimum distance, find the NNUB for the probability of error P e. ( pts).9 Shaping Gain. Find the shaping gain for the following two dimensional voronoi regions (decision regions) relative to the square voronoi region. Do this using the continuous approximation for a continuous uniform distribution of energy through the region. a. equilateral triangle ( pts) b. regular hexagon ( pts) c. circle ( pts).0 ( From Wozencraft and Jacobs). On an additive white Gaussian noise channel, determine P e for the following signal set with ML detection. Leave the answer in terms of σ. 96

97 (Hint: Plot the signals and then the signal vectors.) { if tɛ [0, ] x (t) = 0 otherwise { if tɛ [, ] x (t) = 0 otherwise { if tɛ [0, ] x 3 (t) = 0 otherwise { if tɛ [, 3] x 4 (t) = 0 otherwise if tɛ [0, ] x 5 (t) = if tɛ [, 3] 0 otherwise { if tɛ [, 3] x 6 (t) = 0 otherwise { if tɛ [0, 3] x 7 (t) = 0 otherwise x 8 (t) = 0. Comparing bounds. Consider the following signal constellation in use on an AWGN channel. Leave answers for parts a and b in terms of σ. x 0 = (, ) x = (, ) x = (, ) x 3 = (, ) x 4 = (0, 3) a. Find the union bound on P e for the ML detector on this signal constellation. b. Find the Nearest Neighbor Union Bound on P e for the ML detector on this signal constellation. c. Let the SNR = 4 db and determine a numerical value for P e using the NNUB.. Basic QAM Design - Midterm 996 Either square or cross QAM can be used on an AWGN channel with SNR = 30. db and symbol rate / = 0 6. a. Select a QAM constellation and specify a corresponding integer number of bits per symbol, b, for a modem with the highest data rate such that P e < 0 6. b. Compute the data rate for part a. c. Repeat part a if P e < 0 7 is the new probability of error constraint. d. Compute the data rate for part c..3 Basic Detection - One shot or wo? - Final 996 A BQ signal with d = is sent two times in immediate succession through an AWGN channel with transmit filter p(t), which is a scaled version of the basis function. All other symbol times, a symbol value of zero is sent. he symbol period for one of the BQ transmissions is =, and the transmit filter is p(t) = for 0 < t < and p(t) = 0 elsewhere. At both symbol periods, any one of the 4 messages is equally likely, and the two successive messages are independent. he WGN has power spectral density N 0 =.5. 97

98 0 0 p p / p p / p p p Figure.58: Discrete Memoryless Channel a. Draw an optimum (ML) basis detector and enumerate a signal constellation. (Hint: use basis functions.) (3 pts) b. Find d min. ( pts) c. Compute Ñe counting only those neighbors that are d min away. ( pts) d. Approximate P e for your detector. (3 pts).4 Discrete Memoryless Channel - Midterm 994 Given a channel with py x as shown in Figure.58: (y {0,, } and x {0,, }) Let p =.05 a. For px(i) = /3, find the optimum detection rule. b. Find P e for part a. c. Find P e for the MAP detector if px(0) = px() = /6 and px() = /3..5 Detection with Uniform Noise - Midterm 995 A one-dimensional additive noise channel, y = x + n, has uniform noise distribution p n (v) = { L v L 0 v > L where L/ is the maximum noise magnitude. he input x has binary antipodal constellation with equally likely input values x = ±. he noise is independent of x. a. Design an optimum detector (showing decision regions is sufficient.) ( pts) b. For what value of L is P e < 0 6? ( pt) c. Find the SNR (function of L). ( pts) d. Find the minimum SNR that ensures error-free transmission. ( pts) e. Repeat part d if 4-level PAM is used instead. ( pts.) 98

99 .6 Can you design or just use formulae? - Midterm CR QAM modulation is used for transmission on an AWGN with N0 / = 400kHz. a. Find the data rate R. =.00. he symbol rate is b. What SNR is required for P e < 0 7? (ignore N e ). c. In actual transmitter design, the analog filter rarely is normalized and has some gain/attenuation, unlike a basis function. hus, the average power in the constellation is calibrated to the actual power measured at the analog input to the channel. Suppose Ēx = corresponds to 0 dbm ( milliwatt), then what is the power of the signals entering the transmission channel for the 3CR in this problem with P e < 0 7? d. the engineer under stress. Without increasing transmit power or changing N0 =.00, design a QAM system that achieves the same P e at 3. Mbps on this same AWGN..7 QAM Design - Midterm 997 A QAM system with symbol rate / =0 MHz operates on an AWGN channel. he SNR is 4.5 db and a P e < 0 6 is desired. a. Find the largest constellation with integer b for which P e < 0 6. ( pts) b. What is the data rate for your design in part a? ( pts) c. How much more transmit power is required (with fixed symbol rate at 0 MHz) in db for the data rate to be increased to 60 Mbps? (P e < 0 6 ) ( pts) d. With SNR = 4 db, an reduced-rate alternative mode is enabled to accommodate up to 9 db margin or temporary increases in the white noise amplitude. What is the data rate in this alternative 9dBmargin mode at the same P e < 0 6? ( pts) e. What is the largest QAM (with integer b) data rate that can be achieved with the same power, E x /, as in part d, but with / possibly altered? ( pts).8 Basic Detection. A vector equivalent to a channel leads to the one-dimensional real system with y = x + n where n is exponentially distributed with probability density function p n (u) = σ u e σ for all u (.333) with zero mean and variance σ. his system uses binary antipodal signaling (with equally likely inputs) with distance d between the points. 
We define a function { x Q(x) = x e u du = e x for x 0 e u du = e x for x 0 (.334) a. Find the values Q( ), Q(0), Q( ), Q( 0). ( pts) b. For what x is Q(x) = 0 6? ( pt) c. Find an expression for the probability of symbol error P e in terms of d, σ, and the function Q. ( pts) d. Defining the SNR as SNR = Ēx σ, find a new expression for P e in terms of Q and this SNR. ( pts) e. Find a general expression relating P e to SNR, M, and Q for PAM transmission. ( pts) f. What SNR is required for transmission at b =,, and 3 when P e = 0 6? ( pts) 99

100 g. Would you prefer Gaussian or exponential noise if you had a choice? ( pt).9 QAM Design - Midterm 998 QAM transmission is to be used on an AWGN channel with SNR=7.5 db at a symbol rate of / = 5 MHz used throughout this problem. You ve been hired to design the transmission system. he desired probability of symbol error is P e 0 6. a. ( pts) List two basis functions that you would use for modulation. b. ( pts) Estimate the highest bit rate, b, and data rate, R, that can be achieved with QAM with your design. c. ( pt) What signal constellation are you using? d. (3 pts) By about how much (in db) would Ēx need to be increased to have 5 Mbps more data rate at the same probability of error? Does your answer change for Ex or for Px?.30 Basic Detection - Midterm pts QAM transmission is used on an AWGN channel with N0 =.0. he transmitted signal constellation [ ] [ ] [ ] ± 3 points for the QAM signal are given by 0 0 ±,, and, with each constellation point 0 ± equally likely. a. ( pt) Find M (message-set size) and Ēx (energy per dimension) for this constellation. b. ( pts) Draw the constellation with decision regions indicated for an ML detector. c. ( pts) Find N e and d min for this constellation. d. ( pts) Compute a NNUB value for P e for the ML detector of part b. e. ( pt) Determine b for this constellation (value may be non-integer). f. ( pts) For the same b as part e, how much better in decibels is the constellation of this problem than SQ QAM?.3 Basic Detection - Midterm 00-7 pts he QAM radial constellation in Figure.59 is used for transmission on an AWGN with σ =.05. All constellation points are equally likely. a. ( pts) Find E x and Ēx for this constellation. b. (3 pts) Find b, d min, and N e for this constellation. c. 
( pts) Find P e and P e with the NNUB for an ML detector with this constellation..3 A concatenated QAM Constellation - 00 Miterm - 5 pts A set of 4 orthogonal basis functions {ϕ (t), ϕ (t), ϕ 3 (t), ϕ 4 (t)} uses the following constellation in both the first dimensions and again in the nd two dimensions: he constellation points are restricted in that an E ( even ) point may only follow an E point, and an O ( odd ) point can only follow an O point. For instance, the 4-dimensional point [+ + ] is permitted to occur, but the point [+ + + ] cannot occur. a. ( pts) Enumerate all M points as ordered-4-tuples. b. (3 pts) Find b, b, and the number of bits/hz or bps/hz. c. ( pt) Find Ex and Ēx (energy per dimension) for this constellation. d. ( pts) Find d min for this constellation. 00

101 Figure.59: Constellation for 8 points. O E E - O Figure.60: Constellation for Problem.3. e. ( pts) Find N e and N e for this constellation (you may elect to include only points at minimum distance in computing nearest neighbors). f. ( pts) Find P e and P e for this constellation using the NNUB if used on an AWGN with σ = 0.. g. (3 pts) Compare this 4-dimensional constellation fairly (which requires increasing the number of points in the constellation to 6 to get the same data rate). 4QAM..33 Detection Fundamentals Miterm - 5 pts A random variable x takes the values ± with equal probability independently of a second random variable x that takes the values ± also with equal probability. he two random variables are sumed to x = x + x, and x can only be observed after zero-mean Gaussian noise of variance σ =. is added, that is y = x + n is observed where n is the noise. a. ( pt) What are the values that the discrete random variable x takes, and what are their probabilities? ( pt) b. ( pt) What are the means and variances of x and y? c. ( pts) What is the lowest probability of error in detecting x given only an observation of y? Draw corresponding decision regions. 0

102 d. ( pt) Relate the value of x with a table to the values of x and x. Explain why this is called a noisy DAC channel. e. ( pt) What is the (approximate) lowest probability of error in detecting x given only an observation of y? f. ( pt) What is the (approximate) lowest probability of error in detecting x given only an observation of y? g. Suppose additional binary independent random variables are added so that the two bipolar values for x u are ± u, u =,..., U. Which x u has lowest probability of error for any AWG noise, and what is that P e? ( pt) h. For U =, what is the lowest probability of error in detecting x given an observation of y and a correct observation of x? ( pt) i. For U =, what is the lowest probability of error in detecting x given an observation of y and a correct observation of x? ( pt) j. What is the lowest probability of error in any of parts e through i if σ = 0? What does this mean in terms of the DAC? ( pt) k. Derive a general expression for the probability of error for all bits u =,..., U where x = x + x x U in AWGN with variance σ for part g? ( pts).34 Honey Comb QAM Miterm - 5 pts he QAM constellation in Figure.6 is used for transmission on an AWGN with symbol rate 0MHz and a carrier frequency of 00 MHz Figure.6: Constellation for Problem.34. Each of the solid constellation symbol possibilities is at the center of a perfect hexagon (all sides are equal) and the distance to any of the closest sides of the hexagon is d. he 6 empty points represent a possible message also, but each is used only every 6 symbol instants, so that for instance, the point labelled 0 is a potential message only on symbol instants that are integer multiples of 6. he point can only be transmitted on symbol instants that are integer multiples of 6 plus one, the point only on symbol instants that are integer multiples of 6 plus two, and so on. At any symbol instant, any of the points possible on that symbol are equally likely. 0

a. What is the number of messages that can possibly be transmitted on any single symbol? What are b and b̄? (3 pts)
b. What is the data rate? (1 pt)
c. Draw the decision boundaries for time 0 of an ML receiver. (2 pts)
d. What is d_min? (1 pt)
e. What are Ex and Ēx for this constellation in terms of d? (3 pts)
f. What is the average number of nearest neighbors? (1 pt)
g. Determine the NNUB expression that tightly upper bounds P_e for this constellation in terms of SNR. (2 pts)
h. Compare this constellation fairly to Cross QAM transmission. (1 pt)
i. Describe an equivalent ML receiver that uses time-invariant decision boundaries and a constant decision device with a simple preprocessor to the decision device. (1 pt)

Appendix A

Gram-Schmidt Orthonormalization Procedure

This appendix illustrates the construction of a set of orthonormal basis functions $\varphi_n(t)$ from a set of modulated waveforms $\{x_i(t),\ i = 0, \ldots, M-1\}$. The process for doing so, and achieving minimal dimensionality, is called Gram-Schmidt orthonormalization.

Step 1: Find a signal in the set of modulated waveforms with nonzero energy and call it $x_0(t)$. Let

$$\varphi_1(t) = \frac{x_0(t)}{\sqrt{E_{x_0}}} , \qquad (A.1)$$

where $E_x = \int_{-\infty}^{\infty} [x(t)]^2 \, dt$. Then $x_0 = [\sqrt{E_{x_0}}]$.

Step i for i = 2, ..., M: Compute $x_{i-1,n}$ for $n = 1, \ldots, i-1$, where $x_{i-1,n} = \int_{-\infty}^{\infty} x_{i-1}(t) \varphi_n(t) \, dt$. Compute

$$\theta_i(t) = x_{i-1}(t) - \sum_{n=1}^{i-1} x_{i-1,n} \, \varphi_n(t) . \qquad (A.2)$$

If $\theta_i(t) = 0$, then $\varphi_i(t) = 0$; skip to step $i+1$. If $\theta_i(t) \ne 0$, compute

$$\varphi_i(t) = \frac{\theta_i(t)}{\sqrt{E_{\theta_i}}} , \qquad (A.3)$$

where $E_{\theta_i} = \int_{-\infty}^{\infty} [\theta_i(t)]^2 \, dt$. Then $x_{i-1} = [x_{i-1,1} \ \cdots \ x_{i-1,i-1} \ \sqrt{E_{\theta_i}}]$.

Final Step: Delete all components $n$ for which $\varphi_n(t) = 0$ to achieve a minimum-dimensional basis function set, and reorder indices appropriately.
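The procedure above can be sketched for sampled waveforms. This is a minimal illustration under stated assumptions: each waveform is a list of uniformly spaced samples, integrals are approximated by Riemann sums with step `dt`, a residual is treated as zero when its energy falls below a small tolerance `tol`, and zero basis functions are simply never stored (which merges the final deletion step into the loop). The function names and the three test waveforms are hypothetical.

```python
import math


def inner(a, b, dt):
    """Riemann-sum approximation of the integral of a(t) * b(t)."""
    return sum(p * q for p, q in zip(a, b)) * dt


def gram_schmidt(waveforms, dt, tol=1e-9):
    """Return (basis, symbols) for a list of sampled waveforms.

    basis   : orthonormal sampled basis functions, in order of discovery
    symbols : one coefficient vector per waveform, zero-padded to len(basis)
    """
    basis = []
    coeffs = []
    for x in waveforms:
        # Project the waveform onto every basis function found so far.
        c = [inner(x, phi, dt) for phi in basis]
        # Residual theta(t) = x(t) - sum_n c_n * phi_n(t).
        theta = list(x)
        for c_n, phi in zip(c, basis):
            theta = [t - c_n * p for t, p in zip(theta, phi)]
        energy = inner(theta, theta, dt)
        if energy > tol:
            # A new dimension is needed: normalize the residual.
            norm = math.sqrt(energy)
            basis.append([t / norm for t in theta])
            c.append(norm)
        coeffs.append(c)
    symbols = [c + [0.0] * (len(basis) - len(c)) for c in coeffs]
    return basis, symbols


# Hypothetical example on [0, 1): a full-width pulse, a half-width pulse,
# and their difference, which is linearly dependent on the first two.
n = 1000
dt = 1.0 / n
x0 = [1.0] * n
x1 = [1.0] * (n // 2) + [0.0] * (n // 2)
x2 = [a - b for a, b in zip(x0, x1)]

basis, symbols = gram_schmidt([x0, x1, x2], dt)
print(len(basis))
for row in symbols:
    print(row)
```

Zero-padding the symbol vectors to a common length is a convenience of this sketch; the third waveform adds no new basis function because its residual has zero energy.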

Appendix B

The Q Function

The Q function is used to evaluate probability of error in digital communication. It is the integral of the probability density of a zero-mean unit-variance Gaussian random variable from some specified argument to $\infty$:

Definition B.0.1 (Q Function)

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-u^2/2} \, du \qquad (B.1)$$

The integral cannot be evaluated in closed form for arbitrary x. Instead, see Figures B.1 and B.2 for a graph of the function that can be used to get numerical values. Note the argument is in dB ($20 \log_{10}(x)$). Note $Q(-x) = 1 - Q(x)$, so we need only plot Q(x) for positive arguments. We state without proof the following bounds:

$$\left( 1 - \frac{1}{x^2} \right) \frac{e^{-x^2/2}}{\sqrt{2\pi x^2}} \le Q(x) \le \frac{e^{-x^2/2}}{\sqrt{2\pi x^2}} \qquad (B.2)$$

The upper bound in (B.2) is easily seen to be a very close approximation for $x \ge 3$. Computation of the probability that a Gaussian random variable u with mean m and variance $\sigma^2$ exceeds some value d then uses the Q-function as follows:

$$P\{u \ge d\} = Q\left( \frac{d - m}{\sigma} \right) \qquad (B.3)$$

The Q-function appears in Figures B.3, B.1, and B.2 for very low SNR (-10 to 0 dB), low SNR (0 to 10 dB), and high SNR (10 to 16 dB) using a very accurate approximation (less than 1% error) formula from the recent book by Leon-Garcia:

$$Q(x) \approx \left[ \frac{1}{\left( 1 - \frac{1}{\pi} \right) x + \frac{1}{\pi} \sqrt{x^2 + 2\pi}} \right] \frac{e^{-x^2/2}}{\sqrt{2\pi}} . \qquad (B.4)$$

For the mathematician at heart, $Q(x) = 0.5 \cdot \mathrm{erfc}(x/\sqrt{2})$, where erfc is known as the complementary error function by mathematicians.
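As a quick numerical companion to the definition, bounds, and approximation above, the sketch below evaluates Q(x) through the erfc identity using Python's standard `math.erfc`, and compares it with the two bounds and with the Leon-Garcia approximation. The function names and the sample arguments are assumptions made for illustration only.

```python
import math


def q_exact(x):
    """Q(x) computed from the identity Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_upper(x):
    """Upper bound exp(-x^2/2) / sqrt(2*pi*x^2), valid for x > 0."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi * x * x)


def q_lower(x):
    """Lower bound (1 - 1/x^2) times the upper bound, for x > 0."""
    return (1.0 - 1.0 / (x * x)) * q_upper(x)


def q_approx(x):
    """Leon-Garcia style approximation, intended for x >= 0."""
    a = 1.0 / math.pi
    b = 2.0 * math.pi
    denom = (1.0 - a) * x + a * math.sqrt(x * x + b)
    return math.exp(-x * x / 2.0) / (math.sqrt(2.0 * math.pi) * denom)


def tail_probability(d, mean, sigma):
    """P{u >= d} for a Gaussian u with the given mean and standard deviation."""
    return q_exact((d - mean) / sigma)


# Sample arguments (hypothetical) to compare the four quantities side by side.
for x in (1.0, 2.0, 3.0, 4.0, 5.0):
    print(x, q_lower(x), q_exact(x), q_approx(x), q_upper(x))
```

The approximation reproduces Q(0) = 0.5, and the gap between the two bounds narrows as x grows, which is why the upper bound alone is a good estimate at larger arguments.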

Figure B.1: Low SNR Q-Function Values

Figure B.2: High SNR Q-Function Values
