A Dynamic Relay Selection Scheme for Mobile Users in Wireless Relay Networks

Yifan Li, Ping Wang, Dusit Niyato
School of Computer Engineering, Nanyang Technological University, Singapore 639798
Email: {LIYI15, wangping, dniyato}@ntu.edu.sg

Weihua Zhuang
Department of Electrical and Computer Engineering, University of Waterloo, Canada
Email: wzhuang@bbcr.uwaterloo.ca

Abstract

Cooperative communication has attracted considerable attention in the last few years due to its advantage in mitigating channel fading. Although much effort has been devoted to theoretical analysis of the performance gain, cooperative relay selection, one of the fundamental issues in cooperative communications, remains an open problem. In this paper, we address the tradeoff between the performance improvement and the corresponding cost of cooperative communication, focusing on relay selection. We consider a challenging scenario that takes user mobility into consideration. Based on the user mobility pattern, we propose a dynamic relay selection scheme that aims at minimizing the long-term average cost while satisfying the QoS requirement. To obtain the optimal relay selection policy, an optimization model based on the constrained Markov decision process (CMDP) is formulated and solved by applying the linear programming (LP) technique. Analysis and comparison with several other relay selection schemes are presented. Simulation results show that our scheme is effective and flexible in balancing the cost and QoS performance.

I. INTRODUCTION

It has been widely acknowledged that spatial diversity is of primary importance in combating channel fading in wireless communications. As one of the best-known spatial diversity schemes, multiple-input multiple-output (MIMO) has been included in recent wireless standards [1].
However, it may not be practical to apply MIMO in some wireless scenarios, especially for a wireless device that cannot deploy multiple antennas due to size, cost, or other hardware limitations. To overcome this restriction, cooperative communication has been proposed as a potential solution. It takes advantage of the broadcast nature of wireless channels and makes virtual MIMO possible for users with a single antenna.

A challenging issue in cooperative communications is how to strategically select relays to achieve cooperative diversity. In mobile communications, due to user mobility, relays selected at one location may no longer help at another location. As a result, a dynamic relay selection scheme is necessary. However, to the best of our knowledge, none of the existing works takes user mobility into account in the design of relay selection schemes. In this paper, we address this problem and propose a dynamic relay selection scheme for mobile users (MUs). Specifically, we consider a scenario in which MUs travel among different locations in a wireless relay network (WRN). Since the relaying process consumes a certain amount of energy at each selected relay, we treat the energy consumption as a cost to be paid by the MU for selecting a relay. Because the benefits and the corresponding cost of relay cooperation are closely associated with user locations, we propose a dynamic relay selection scheme that aims at minimizing the long-term average cost while satisfying the QoS requirement (e.g., throughput or packet loss probability).

II. RELATED WORK

There exists a rich body of literature on relay selection problems. Generally, the existing schemes can be divided into single-relay selection and multiple-relay selection. Single-relay selection has been extensively studied, and some of the well-known mechanisms are briefly described in the following.
In [2], the neighbor node with the maximum SINR is selected as the best relay, while in [3] the node nearest to the base station is used for cooperation. Although single-relay selection is attractive due to its simplicity, it may fail to meet the QoS performance required by users because of its limited diversity gain. To enhance the service quality by increasing the cooperative diversity order, more than one relay should be involved, which is the impetus for multiple-relay selection. In [4], the idea of single-relay selection is extended to multiple-relay selection, and two SINR-suboptimal schemes are proposed based on the concepts of relay ordering and recursion, respectively. In [5], a threshold-based relay selection protocol is proposed, in which all qualified relays (those with received SINRs higher than a pre-determined threshold) are allowed to cooperate.

However, none of the aforementioned relay selection schemes takes user mobility into account, which limits their applicability in mobile communications. In this paper, we propose a dynamic relay selection scheme to tackle the relay selection problem for MUs in a WRN, which distinguishes our work from the existing ones.

III. SYSTEM MODEL

A. Network Scenario

As shown in Fig. 1, we consider a WRN that consists of a base station (BS) and a number of fixed relay stations (RSs) geographically located in the coverage area of the BS. Each RS adopts the decode-and-forward (DF) strategy and is assumed to be equipped with multiple channels. There are multiple MUs moving in the WRN, acting as the transmission sources. Considering uplink transmission (i.e., from MU to BS), when an RS is selected by an MU, the RS uses one channel to help the MU relay packets as long as at least one channel is available. Without loss of generality, it is assumed that the MUs admitted to the system are treated equally, and each of them can independently adopt its optimal policy. For simplicity, we discuss the relay selection scheme for a single MU, as it can be generalized to multiple MUs directly.

Consider an MU having a data buffer of size $Q_{Buf}$. For convenience of representation, we discretize the MU locations in a service coverage area, and let $\mathcal{L} = \{L_1, L_2, \ldots, L_M\}$ be the set of all possible locations of the MU, where $M = |\mathcal{L}|$. Let $\mathcal{R}_m = \{RS_m^1, RS_m^2, \ldots, RS_m^{K_m}\}$ denote the set of available RSs for location $L_m$, where $K_m$ is the number of such RSs. Here, the available RSs are those that have reliable source-relay (SR) links and thus can correctly decode the signals received from the MU. In our model, each RS is associated with a cost (due to the energy consumption of the RS for relaying), which has to be paid by the MU in return for the cooperation when the RS is selected. The set of available RSs (e.g., $\mathcal{R}_m$) may vary with the MU location, and so do the benefits and the corresponding cost of cooperation.

Fig. 1. The wireless relay network.

B. Mobility Model

Consider a long observation time for performance metrics in terms of average cost and average packet loss probability. The whole observation time is partitioned into decision periods of constant duration. Over each decision period, the MU location remains unchanged. Given its current location $L_{m_1}$, the MU will either stay at the same location or travel to another location $L_{m_2} \in \mathcal{N}(L_{m_1})$ in the next decision period, where $\mathcal{N}(L_{m_1})$ denotes the set of neighboring locations of $L_{m_1}$.
Assume that the transfer time from one location to another is negligible compared with the length of one decision period. The mobility of the MU can be modeled using a transition matrix $P_m$. A similar mobility model can be found in [6]. In $P_m$, the element in the $m_1$-th row and $m_2$-th column, $p_{m_1,m_2}$, denotes the probability that an MU at location $L_{m_1}$ in the current decision period is at location $L_{m_2}$ in the next decision period, and $\sum_{m_2: L_{m_2} \in \mathcal{N}(L_{m_1})} p_{m_1,m_2} = 1$. When $L_{m_2} \notin \mathcal{N}(L_{m_1})$, the transition probability $p_{m_1,m_2}$ is zero.

C. Relay Transmission

For relay transmission, protocol II in [7] is applied due to its advantage in battery life efficiency. In this protocol, the transmission is divided into two phases, based on the practical consideration that wireless terminals usually cannot transmit and receive simultaneously [7], [8]. In the first phase, the MU communicates with the selected RSs and the BS. In the second phase, all the selected RSs forward the decoded information to the BS simultaneously, while the MU remains silent. Assuming that maximal ratio combining (MRC) is applied at the BS, the post-processing SINR at the destination can be expressed as

\[ \gamma_p^{DF} = \gamma_{SD} + \sum_{k=1}^{K} \gamma_{R_k D} \tag{1} \]

where $K$ is the number of relays, and $\gamma_{SD}$ and $\gamma_{R_k D}$ represent the instantaneous SINR of the source-destination (SD) link and that of the link between relay $RS_k$ and the BS, respectively. Note that the single-relay case is a special case of Eq. (1), i.e., when $K = 1$. For a Rayleigh fading channel, the cumulative distribution function (CDF) of the post-processing SINR for the no-relay (i.e., direct transmission) case is given by

\[ F_d(\gamma) = 1 - e^{-\gamma/\alpha}, \quad \gamma \ge 0 \tag{2} \]

where $\alpha = \bar{\gamma}_{SD}$ is the average SINR of the SD link. The CDF of the post-processing SINR for the single-relay case is

\[ F_s(\gamma) = \frac{\alpha}{\alpha - \beta}\left(1 - e^{-\gamma/\alpha}\right) + \frac{\beta}{\beta - \alpha}\left(1 - e^{-\gamma/\beta}\right), \quad \gamma \ge 0 \tag{3} \]

where $\beta = \bar{\gamma}_{RD}$ is the average SINR of the relay-destination (RD) link. In the context of multiple relays, Eq. (3) can be extended as follows:

\[ F_m(\gamma) = \prod_{j=1}^{K} \frac{\alpha}{\alpha - \beta_j}\left(1 - e^{-\gamma/\alpha}\right) + \sum_{k=1}^{K} \frac{\beta_k}{\beta_k - \alpha} \prod_{j=1, j \ne k}^{K} \frac{\beta_k}{\beta_k - \beta_j}\left(1 - e^{-\gamma/\beta_k}\right) \tag{4} \]

where $\gamma \ge 0$, $K \ge 2$, and $\beta_k = \bar{\gamma}_{R_k D}$ $(k = 1, \ldots, K)$ are the corresponding average SINRs.

For convenience of analysis, we can regard the data transmission from source to destination through both the direct link and the cooperative links as taking place on an equivalent virtual link. For DF-based relay transmission, the choice of adaptive modulation and coding (AMC) mode depends only on the post-processing SINR at the destination [8], which is calculated by Eq. (1). Given $\gamma_p^{DF}$, we can obtain the maximum achievable rate (MAR) of the virtual link, denoted by $r$. With this $r$, the AMC mode can be chosen accordingly. For instance, if $r = 2$, we can use QPSK with a code rate of 1 or 16QAM with a code rate of 1/2. The destination feeds back its SINR measurement to the source, based on which the source selects a proper AMC mode for transmission and informs the selected relays to use the same mode to transmit.^1 Note that in the two-phase cooperative transmission, the source transmits only in the first phase, and hence the effective end-to-end rate (EER), i.e., the actual throughput at the destination, is $r/2$.

Let $\mathcal{R} = \{r_0, r_1, \ldots, r_R\}$ denote the set of EERs corresponding to the different AMC modes. Without loss of generality, we assume $r_0 < r_1 < \cdots < r_R$. For any EER $r_i \in \mathcal{R}$, given the corresponding AMC mode and the required SINR interval $[\Gamma_{r_i}, \Gamma_{r_{i+1}})$, the probability $P_{r_i,K}$ of achieving $r_i$ when selecting $K$ RSs can be calculated as follows:

\[ P_{r_i,K} = \begin{cases} F_d(\Gamma_{r_{i+1}}) - F_d(\Gamma_{r_i}), & K = 0 \\ F_s(\Gamma_{r_{i+1}}) - F_s(\Gamma_{r_i}), & K = 1 \\ F_m(\Gamma_{r_{i+1}}) - F_m(\Gamma_{r_i}), & K \ge 2 \end{cases} \tag{5} \]

where $\Gamma_{r_0} = 0$ and $\Gamma_{r_{R+1}} = \infty$.

Note that the aforementioned system model is also applicable to other cooperation protocols. The only differences lie in the post-processing SINR calculation and the EER calculation.

IV. DYNAMIC RELAY SELECTION

In this section, we propose a dynamic relay selection scheme by formulating an optimization model based on the CMDP. An optimal relay selection policy is obtained to minimize the long-term average cost while satisfying the QoS requirement in terms of packet loss probability. In the following, the state and action spaces, as well as the transition probability matrix of the CMDP model, are defined. Then, the method to obtain the optimal policy of the CMDP problem is presented.

A. State and Action Space

The state space of the MU is defined as $\Omega = \{(L, Q)\}$, where $L \in \{L_1, L_2, \ldots, L_M\}$ and $Q \in \{0, 1, \ldots, Q_{Buf}\}$ represent the location of the MU and the number of packets in the MU's buffer (i.e., the queue length), respectively. The action space of the MU represents the available choices of relay selection and is denoted as $A = \{A_1, \ldots, A_m, \ldots, A_M\}$, where $A_m$ is the available action set at location $L_m$.
As mentioned in Section III-A, when the MU moves to location $L_m$ and looks for helpers, only the RSs in $\mathcal{R}_m$ can be selected for cooperation due to the requirement for successful decoding. Otherwise, it may use direct transmission. The total number of actions at location $L_m$ is $|A_m| = \sum_{j=0}^{K_m} \binom{K_m}{j} = 2^{K_m}$. Note that the available action set at one location consists of all combinations of the available RSs for that location.

Footnote 1: We assume that a mechanism for information exchange among the MU, RSs, and BS is in place to facilitate the cooperative communication. The design of such a mechanism is beyond the scope of this paper.

B. Transition Probability Matrix

As the state of the MU consists of both the MU location and its buffer occupancy, the state transition depends on both the location transition and the queue length transition. The location transition is described by the transition probability matrix $P_m$ defined in Section III-B. For the queue length transition, we assume that packets arrive in batches and that the probability of $n_a$ packets arriving in a decision period is $\lambda_{n_a}$. Moreover, at most $N_a$ packets can arrive in one decision period. For packet departure, let $\mathcal{N}_d = \{n_{d_0}, n_{d_1}, \ldots, n_{d_R}\}$ denote the set of all possible numbers of departing packets in a decision period, where $n_{d_i}$ corresponds to the EER $r_i \in \mathcal{R}$ (i.e., when the cooperative transmission achieves EER $r_i$, $n_{d_i}$ packets depart from the queue of the MU in a decision period). Let $\mu_{n_d}(a)$ denote the probability of $n_d$ packets departing from the data queue of the MU when action $a$ is taken. $\mu_{n_d}(a)$ can be calculated as follows:

1) When no relay is selected (i.e., $K = 0$),

\[ \mu_{n_d}(a) = \begin{cases} P_{r_i,K}(\bar{\gamma}_{SD}), & n_d = n_{d_i} \in \mathcal{N}_d \\ 0, & \text{otherwise} \end{cases} \]

2) When one or more relays are selected (i.e., $K \ge 1$),

\[ \mu_{n_d}(a) = \begin{cases} P_{r_i,K}(\bar{\gamma}_{SD}, \bar{\gamma}_{R_1 D}, \ldots, \bar{\gamma}_{R_K D}), & n_d = n_{d_i} \in \mathcal{N}_d \\ 0, & \text{otherwise} \end{cases} \]

where the probabilities $P_{r_i,K}$ $(K = 0, 1, \ldots)$ can be calculated using Eq. (5).
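The rate probabilities of Eq. (5) and the action set at one location can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the average SINRs are assumed to be pairwise distinct (as Eqs. (3) and (4) require), and the values for location L_1 and the AMC thresholds are taken from Table I and Section V.

```python
import math
from itertools import combinations

def db_to_linear(db):
    """Convert a value in dB to linear scale."""
    return 10.0 ** (db / 10.0)

def sinr_cdf(gamma, means):
    """CDF of a sum of independent exponential SINRs with distinct means.

    One mean reproduces Eq. (2), two means Eq. (3), three or more Eq. (4).
    """
    total = 0.0
    for i, mu_i in enumerate(means):
        weight = 1.0
        for j, mu_j in enumerate(means):
            if j != i:
                weight *= mu_i / (mu_i - mu_j)
        total += weight * (1.0 - math.exp(-gamma / mu_i))
    return total

def rate_probabilities(thresholds, means):
    """Eq. (5): P_{r_i,K} = F(Gamma_{r_{i+1}}) - F(Gamma_{r_i}) for i = 0..R.

    F(Gamma_{r_0}) = F(0) = 0 and F(Gamma_{r_{R+1}}) = F(inf) = 1.
    """
    edges = [0.0] + [sinr_cdf(g, means) for g in thresholds] + [1.0]
    return [edges[i + 1] - edges[i] for i in range(len(edges) - 1)]

# Location L_1: average SD SINR 3 dB, relays at 5, 6 and 7 dB.
sd_mean = db_to_linear(3.0)
relay_means = {"RS1": db_to_linear(5.0),
               "RS2": db_to_linear(6.0),
               "RS3": db_to_linear(7.0)}
thresholds = [db_to_linear(g) for g in (6.4, 9.4, 11.2, 16.4)]

# Every subset of the available RSs is an action; () is direct transmission.
actions = [subset
           for k in range(len(relay_means) + 1)
           for subset in combinations(sorted(relay_means), k)]

# Distribution over the EERs r_0..r_R for each action.
probs = {a: rate_probabilities(thresholds,
                               [sd_mean] + [relay_means[r] for r in a])
         for a in actions}
```

Each entry of `probs` sums to one, and the probability of the lowest rate r_0 shrinks as more relays are added, which is the diversity benefit that the MU trades against the relaying cost.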
Denote the transition probability matrix for the queue length as $P_{Q_b}(a)$, whose element $P_{Q_b,Q_b'}(a)$ represents the probability that the data queue moves from length $Q_b$ to $Q_b'$ given that action $a$ is taken. It can be obtained from

\[ P_{Q_b,Q_b'}(a) = \sum_{\{(n_a,n_d):\, Q_b + n_a - n_d = Q_b'\}} \lambda_{n_a}\, \mu_{n_d}(a). \tag{6} \]

Based on the transition probability matrices $P_m$ and $P_{Q_b}(a)$, the state transition probability matrix $P_{s,s'}(a)$, consisting of the probabilities for the MU to move from state $s \in \Omega$ to $s' \in \Omega$ when action $a$ is taken, can be obtained from

\[ P_{s,s'}(a) = P_m \otimes P_{Q_b}(a) \tag{7} \]

where $\otimes$ denotes the Kronecker product.

C. CMDP Problem Formulation

In our problem formulation, the objective is to minimize the long-term average cost of selecting the cooperative RSs while satisfying the QoS requirement in terms of the long-term average packet loss probability. A similar QoS constraint can be found in [9], in which the packet loss due to lack of buffer space is considered. Note that the average queue throughput $\bar{R}$ can be easily derived from the packet loss probability $\bar{P}_L$ by

\[ \bar{R} = \lambda (1 - \bar{P}_L) \tag{8} \]

where $\lambda$ is the average packet arrival rate. In our case, it can be calculated as $\lambda = \sum_{n_a=0}^{N_a} n_a \lambda_{n_a}$. Let $W$ denote the number of decision periods in the observation time, and $P_L^{th}$ the maximum tolerable packet loss probability. The long-term objective and constraint of the MU can be formulated as a CMDP problem given by

\[ \min_{\pi} \; \bar{C}(\pi) = \limsup_{W \to \infty} \frac{1}{W} \sum_{i=1}^{W} E\big(C(s_i, a_i)\big) \tag{9} \]
\[ \text{s.t.} \quad \bar{P}_L(\pi) = \limsup_{W \to \infty} \frac{1}{W} \sum_{i=1}^{W} E\big(P_L(s_i, a_i)\big) \le P_L^{th} \]

where $\pi$ denotes the relay selection policy, which is a mapping from state $s \in \Omega$ to action $a \in A$. $\bar{C}(\pi)$ and $\bar{P}_L(\pi)$ represent the long-term average cost and the long-term average probability of packet loss due to lack of buffer space, respectively. $E(\cdot)$ denotes the expectation, $C(s_i, a_i)$ is the cost paid to the selected relay(s) for cooperation, and $P_L(s_i, a_i)$ is the packet loss probability in the $i$-th decision period. $E(P_L(s_i, a_i))$ can be calculated as follows:

\[ E\big(P_L(s_i, a_i)\big) = \frac{\sum_{n_{d,i} \in \mathcal{N}_d} \sum_{n_{a,i}=0}^{N_a} P_{L_i}\, \lambda_{n_{a,i}}\, \mu_{n_{d,i}}(a_i)}{\lambda} \tag{10} \]

where $P_{L_i} = \max(q_{i-1} + n_{a,i} - n_{d,i} - Q_{Buf}, 0)$, $q_{i-1}$ is the queue length at the end of the $(i-1)$-th decision period, $n_{a,i}$ and $n_{d,i}$ are the numbers of packet arrivals and departures in the $i$-th decision period, respectively, and $\lambda_{n_{a,i}}$ and $\mu_{n_{d,i}}(a_i)$ denote the corresponding probabilities.

The optimal policy $\pi^*$ can be obtained by transforming the CMDP formulation into an equivalent LP problem [10]. A randomized policy is applied here, and $\Psi_{\pi^*}(s,a)$ denotes the optimal probability of taking action $a$ when the MU is in state $s$, which can be obtained from the optimal solution of the corresponding LP problem. Let $\Phi(s,a)$ denote the probability of the MU being in state $s$ and taking action $a$. The LP problem is given in the following:

\[ \min_{\Phi(s,a)} \; \sum_{s \in \Omega} \sum_{a \in A} C(s,a)\, \Phi(s,a) \tag{11} \]
\[ \text{s.t.} \quad \sum_{s \in \Omega} \sum_{a \in A} P_L(s,a)\, \Phi(s,a) \le P_L^{th} \]
\[ \sum_{a \in A} \Phi(s',a) = \sum_{s \in \Omega} \sum_{a \in A} P(s'|s,a)\, \Phi(s,a), \quad \forall s' \in \Omega \]
\[ \sum_{s \in \Omega} \sum_{a \in A} \Phi(s,a) = 1 \]
\[ \Phi(s,a) \ge 0, \quad \forall s \in \Omega,\; a \in A \]

where $P(s'|s,a)$ is an element of the matrix $P_{s,s'}(a)$ obtained in Eq. (7), indicating the probability of the MU moving to state $s' \in \Omega$ in the next decision period given the current state $s$ when action $a$ is taken. Denoting the optimal solution of the LP problem as $\Phi^*(s,a)$, the optimal policy of the CMDP problem can be calculated as follows [6]:

\[ \Psi_{\pi^*}(s,a) = \frac{\Phi^*(s,a)}{\Phi^*(s)} = \frac{\Phi^*(s,a)}{\sum_{a' \in A} \Phi^*(s,a')}. \tag{12} \]

V. PERFORMANCE EVALUATION

As illustrated in Fig. 1, we consider 4 locations as an example, and the scenario settings are given in Table I. Unless otherwise mentioned, all RSs have the same cost, which is set to 1.

TABLE I
PARAMETER SETTINGS

Location   Average SD SINR   RSs                       Average RD SINR
L_1        3 dB              (RS_1, RS_2, RS_3)        (5 dB, 6 dB, 7 dB)
L_2        4 dB              (RS_4, RS_5, RS_6)        (6 dB, 7 dB, 8 dB)
L_3        5 dB              (RS_7, RS_8, RS_9)        (7 dB, 8 dB, 9 dB)
L_4        6 dB              (RS_10, RS_11, RS_12)     (8 dB, 9 dB, 10 dB)

The MU travels among the 4 locations, and the buffer size of the associated data queue is 10 packets. Four AMC modes with SINR thresholds $\Gamma_1 = 6.4$ dB, $\Gamma_2 = 9.4$ dB, $\Gamma_3 = 11.2$ dB, and $\Gamma_4 = 16.4$ dB are used. The packet arrival is assumed to be a Poisson process with average arrival rate $\lambda = 2$ packets/decision period. The packet loss probability threshold is set to $P_L^{th} = 0.05$. We implement the proposed CMDP-based optimal relay selection policy in a simulation consisting of 2 decision periods and obtain all the numerical results.

A. Performance Evaluation

1) The effect of cost variation: We define the demand of $RS_k$ as the ratio between the number of times that the MU requests $RS_k$ and the total number of decision periods. We vary the cost of $RS_1$ from 0 to 20 and illustrate the variation of the demands of all RSs at location $L_1$ (i.e., $RS_1$, $RS_2$, $RS_3$) in Fig. 2(a). From Fig. 2(a), we can observe that $RS_3$ is always preferred by the MU since it has the best SINR at location $L_1$. The demand of $RS_1$ remains high when its cost is low. However, when its cost increases to a certain value (around 8), its demand drops rapidly to zero while the demand of $RS_2$ increases. This accords with our expectation that no MU would select an RS with low SINR but high cost. The result demonstrates that the proposed CMDP scheme can effectively adapt the relay selection to the cost variation.
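The LP of Eq. (11) and the policy extraction of Eq. (12) can be sketched with an off-the-shelf solver. This is a minimal sketch under stated assumptions, not the simulation code behind the reported results: it uses `scipy.optimize.linprog`, the function name `solve_cmdp_lp` is hypothetical, and the two-state, two-action data (transition probabilities, costs, losses, and the 0.1 loss threshold) are made-up toy values chosen so that the optimum can be checked by hand.

```python
import numpy as np
from scipy.optimize import linprog

def solve_cmdp_lp(P, C, PL, loss_max):
    """Solve the LP of Eq. (11) and apply Eq. (12).

    P[a, s, s2] = P(s2 | s, a), C[s, a] = cost, PL[s, a] = packet loss.
    Returns (Phi, policy, optimal average cost).
    """
    n_a, n_s, _ = P.shape
    n_var = n_s * n_a                      # Phi(s, a) sits at index s * n_a + a
    A_eq = np.zeros((n_s + 1, n_var))
    b_eq = np.zeros(n_s + 1)
    for s2 in range(n_s):                  # balance equation for next state s2
        for s in range(n_s):
            for a in range(n_a):
                A_eq[s2, s * n_a + a] = float(s == s2) - P[a, s, s2]
    A_eq[n_s, :] = 1.0                     # all Phi(s, a) sum to one
    b_eq[n_s] = 1.0
    res = linprog(C.reshape(n_var),
                  A_ub=PL.reshape(1, n_var), b_ub=[loss_max],
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    if not res.success:
        raise ValueError("LP infeasible or not solved: " + res.message)
    Phi = res.x.reshape(n_s, n_a)
    mass = Phi.sum(axis=1, keepdims=True)  # Phi*(s) in Eq. (12)
    # States with zero mass are never visited; fall back to a uniform choice.
    policy = np.where(mass > 1e-12, Phi / np.maximum(mass, 1e-12), 1.0 / n_a)
    return Phi, policy, res.fun

# Toy data (assumed): action 0 = no relay (free), action 1 = one relay (cost 1).
P = np.array([[[0.5, 0.5], [0.5, 0.5]],
              [[0.5, 0.5], [0.5, 0.5]]])
C = np.array([[0.0, 1.0], [0.0, 1.0]])
PL = np.array([[0.2, 0.0], [0.4, 0.1]])
Phi, policy, cost = solve_cmdp_lp(P, C, PL, loss_max=0.1)
```

In this toy instance the loss constraint is tight at the optimum, and the resulting policy is randomized in the first state (relay used half of the time) and deterministic in the second (relay always used), which illustrates why a randomized policy is needed in the constrained setting.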
2) The effect of the maximum queue size: Two different user mobility patterns are considered to study the effect of the maximum queue size on our relay selection result:

1) Scenario 1 (uniform mobility): When the MU is at location $L_m$ $(m = 1, 2, 3, 4)$ in one decision period, it has equal probability to stay at location $L_m$ or move to each of the neighboring locations in the next decision period.

2) Scenario 2 (non-uniform mobility): When the MU is at location $L_m$ $(m = 1, 2, 4)$ in one decision period, it has a larger probability (e.g., 2/3) to be at location $L_1$ in the next decision period, and a smaller probability (e.g., 1/6) to move to any other possible location. Since location $L_3$ is not in the neighborhood of $L_1$, we simply assume an equal probability for the MU to move to each possible location when it is at location $L_3$.
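The two mobility scenarios, together with Eqs. (6) and (7), can be sketched as follows. Several things here are assumptions rather than details given in the paper: the ring topology L_1-L_2-L_3-L_4-L_1 (chosen because it is consistent with the example probabilities 2/3 and 1/6 and with L_3 not neighboring L_1), the clipping of the queue length at 0 and at the buffer size, the toy arrival and departure distributions, and all function names.

```python
import numpy as np

# Assumed ring topology (0-based): L_1-L_2-L_3-L_4-L_1.
NEIGHBORS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

def uniform_mobility(neighbors):
    """Scenario 1: stay or move to each neighbor with equal probability."""
    n = len(neighbors)
    P = np.zeros((n, n))
    for m, nbrs in neighbors.items():
        reachable = [m] + nbrs
        P[m, reachable] = 1.0 / len(reachable)
    return P

def biased_mobility(neighbors, favorite=0, p_fav=2.0 / 3.0):
    """Scenario 2: prefer `favorite` wherever it is reachable."""
    n = len(neighbors)
    P = np.zeros((n, n))
    for m, nbrs in neighbors.items():
        reachable = [m] + nbrs
        if favorite in reachable:
            others = [x for x in reachable if x != favorite]
            P[m, favorite] = p_fav
            P[m, others] = (1.0 - p_fav) / len(others)
        else:                               # e.g. L_3: uniform over options
            P[m, reachable] = 1.0 / len(reachable)
    return P

def queue_matrix(arrival_pmf, departure_pmf, q_max):
    """Eq. (6), with the queue length clipped to [0, q_max] (assumption).

    arrival_pmf[n_a] = probability of n_a arrivals;
    departure_pmf = {n_d: probability} for the chosen action.
    """
    P = np.zeros((q_max + 1, q_max + 1))
    for q in range(q_max + 1):
        for n_a, p_a in enumerate(arrival_pmf):
            for n_d, p_d in departure_pmf.items():
                q_next = min(max(q + n_a - n_d, 0), q_max)
                P[q, q_next] += p_a * p_d
    return P

P_uniform = uniform_mobility(NEIGHBORS)
P_biased = biased_mobility(NEIGHBORS)

# Toy queue dynamics for one action (made-up numbers).
P_queue = queue_matrix([0.2, 0.5, 0.3], {0: 0.4, 2: 0.6}, q_max=3)

# Eq. (7): joint (location, queue) transition matrix via the Kronecker product.
P_state = np.kron(P_biased, P_queue)
```

With this ordering, the joint state index is `location * (q_max + 1) + queue_length`, so row `i` of `P_state` gives the distribution of the next (location, queue) pair for state `i`.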

Fig. 2. (a) Demand of RSs 1-3 vs. the cost of RS 1; (b) action distribution vs. maximum queue size in scenario 1 and scenario 2; (c) comparison of the average packet loss probability; (d) comparison of the average throughput-cost ratio (TCR).

We study the effect of queue size variation on relay selection with $\lambda = 2$ packets/decision period. In the simulation, we increase the buffer size of the data queue from 2 to 12 packets and observe the variation in the action distribution in both scenarios 1 and 2. As shown in Fig. 2(b), the probability of selecting three RSs decreases to 0 when the maximum queue size increases to 5 packets and 9 packets in scenario 1 and scenario 2, respectively. As the queue size increases, there is more space available to buffer the arriving packets, and hence the MU does not need to transmit them immediately to avoid violating the packet loss requirement. Consequently, selecting three relays becomes less likely due to the corresponding high cost.

3) Performance Comparisons: We compare the performance of our CMDP scheme with that of other relay selection methods. We introduce a performance metric, the throughput-cost ratio (TCR), which is defined as the ratio between the average throughput and the average cost. Note that the average throughput can be derived from the average packet loss in the same way as in Eq. (8).
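Combining the TCR definition with Eq. (8), the metric can be written out explicitly. The numbers in the example are illustrative assumptions only, not values taken from the simulations:

\[ \mathrm{TCR} = \frac{\bar{R}}{\bar{C}} = \frac{\lambda (1 - \bar{P}_L)}{\bar{C}} \]

For instance, with $\lambda = 2$ packets/decision period, $\bar{P}_L = 0.05$, and an average cost $\bar{C} = 1.5$, the average throughput is $\bar{R} = 2 \times 0.95 = 1.9$ and $\mathrm{TCR} = 1.9 / 1.5 \approx 1.27$. A scheme that lowers the packet loss only slightly while paying for an additional relay therefore ends up with a lower TCR.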
Here, we choose the following relay selection schemes for comparison: (1) direct transmission, (2) single-relay selection with the best SINR, (3) two-relay selection with the best and second-best SINRs, and (4) three-relay selection, which always uses three relays. We first vary the packet loss threshold $P_L^{th}$ from 0.01 to 0.1 when $\lambda = 2$ packets/decision period (Case I), and then change $\lambda$ from 8 packets/decision period to 26 packets/decision period when $P_L^{th} = 0.05$ (Case II). The performance metrics in terms of average packet loss probability and average TCR are evaluated and compared in Fig. 2(c) and Fig. 2(d), respectively. Two sample comparison results ($P_L^{th} = 0.05$ and $0.1$) are presented. Note that the packet loss performance of direct transmission and three-relay selection is not presented, since the former is over 80%, which is unacceptable, while the latter has hardly any measurable packet loss. In Fig. 2(d), we compare only the three schemes that satisfy the packet loss requirement.

From Fig. 2(c), it can be observed that single-relay selection violates the packet loss requirement. Although two-relay selection achieves a lower packet loss than our scheme while meeting the QoS constraint, it has a lower average TCR, as shown in Fig. 2(d). Moreover, three-relay selection is disadvantageous because it has the highest cost and the lowest TCR. The results show the effectiveness of our scheme in balancing the cost and QoS performance. In other words, our scheme does not spend more energy (measured by the cost) to achieve an over-satisfied QoS performance.

VI. CONCLUSION

A dynamic relay selection scheme has been proposed in this paper, taking into consideration user mobility, which is ignored in most of the existing works. In our model, the energy consumption in relaying is considered as a cost associated with each relay, and the user has to pay the selected relays for the cooperative transmission.
A CMDP-based optimization model has been formulated and solved by the LP technique to obtain the optimal policy for relay selection, aiming at minimizing the average cost while satisfying the long-term QoS requirement. The simulation results have shown the effectiveness and flexibility of our scheme in balancing the cost and the required QoS performance, as compared with several other relay selection schemes.

REFERENCES

[1] Y. Wei, F. Yu, and M. Song, "Distributed optimal relay selection in wireless cooperative networks with finite-state Markov channels," IEEE Trans. Vehicular Technology, vol. 59, no. 5, pp. 2149-2158, Jun. 2010.
[2] Y. Zhao, R. Adve, and T. Lim, "Improving amplify-and-forward relay networks: optimal power allocation versus selection," IEEE Trans. Wireless Communications, vol. 6, no. 8, pp. 3114-3123, Aug. 2007.
[3] A. Sadek, Z. Han, and K. Liu, "A distributed relay-assignment algorithm for cooperative communications in wireless networks," in Proc. of IEEE ICC, 2006.
[4] Y. Jing and H. Jafarkhani, "Single and multiple relay selection schemes and their achievable diversity orders," IEEE Trans. Wireless Communications, vol. 8, no. 3, pp. 1414-1423, Mar. 2009.
[5] F. Onat, Y. Fan, H. Yanikomeroglu, and H. Poor, "Threshold based relay selection in cooperative wireless networks," in Proc. of IEEE GLOBECOM 2008, Nov. 2008, pp. 1-5.
[6] D. Niyato and P. Wang, "Optimization of the mobile router and traffic sources in vehicular delay-tolerant network," IEEE Trans. Vehicular Technology, vol. 58, no. 9, pp. 5095-5104, Nov. 2009.
[7] R. Nabar, H. Bolcskei, and F. Kneubuhler, "Fading relay channels: performance limits and space-time signal design," IEEE J. Select. Areas Communications, vol. 22, no. 6, pp. 1099-1109, Aug. 2004.
[8] B. Can, H. Yomo, and E. Carvalho, "Link adaptation and selection method for OFDM based wireless relay networks," Journal of Communications and Networks, 2007.
[9] D. Niyato and P. Wang, "Credit-based spectrum sharing for cognitive mobile multihop relay networks," in Proc. of IEEE WCNC 2010, 18-21 2010, pp. 1-6.
[10] M. Puterman, "Markov decision processes: Discrete stochastic dynamic programming," IMA Journal of Management Mathematics, 1994.