Dynamic Pricing Approach for Spectrum Allocation in Wireless Networks with Selfish Users

Zhu Ji and K. J. Ray Liu
Electrical and Computer Engineering Department and Institute for Systems Research
University of Maryland, College Park, MD 20742
email: zhuji, kjrliu@umd.edu

Abstract

Dynamic spectrum allocation has become a promising approach to increase the spectrum efficiency of wireless networks. In this paper, we consider the spectrum allocation in wireless networks with multiple selfish legacy spectrum holders and unlicensed users as multi-stage dynamic games. A dynamic pricing approach is proposed to optimize overall spectrum efficiency while keeping the participating incentives of the users based on double-auction rules. Moreover, a belief system is developed to assist selfish users in dynamically updating their strategies in response to the network dynamics and to substantially decrease the pricing overhead. The simulation results show that our proposed scheme not only approaches optimal outcomes but also has low overhead.

I. INTRODUCTION

Current static spectrum allocation can be very inefficient, considering that bandwidth demands may vary highly along the time dimension or the space dimension. With the development of cognitive radio technologies, dynamic spectrum access has become a promising approach to increase the efficiency of spectrum usage, as it allows unlicensed wireless users to dynamically access the licensed bands of legacy spectrum holders based on leasing agreements. The FCC began to consider more flexible and comprehensive use of available spectrum in [1], [2]. Since then, great attention has been drawn to exploring open spectrum systems [3], [4] for dynamic spectrum sharing. Traditionally, network-wide spectrum assignment is carried out by a central server, namely, the spectrum broker [5], [6]. Recently, distributed spectrum allocation approaches [7], [8] have been well studied to enable efficient spectrum sharing based only on local observations.
Although the existing dynamic spectrum access schemes have achieved some success in enhancing spectrum efficiency and distributive design, most of them focus on efficient spectrum allocation given fixed topologies and cannot adapt to the dynamics of wireless networks due to node mobility, channel variations or varying wireless traffic. Furthermore, existing cognitive spectrum sharing approaches generally assume that the network users will act cooperatively to maximize the overall system performance. However, with the emerging applications of mobile ad hoc networks envisioned in civilian usage, the users may be selfish and aim to maximize their own interests. Therefore, novel spectrum allocation approaches need to be developed that consider the dynamic nature of wireless networks and the users' selfish behaviors.

Consider a general network scenario in which multiple primary users (legacy spectrum holders) and secondary users (unlicensed users) coexist. Primary users attempt to sell unused spectrum resources to secondary users for monetary gains, while secondary users try to acquire spectrum usage permissions from primary users to achieve certain communication goals, which generally introduces reward payoffs for them. In order to solve the above issues, we consider the spectrum sharing as multi-stage dynamic games and propose a belief-assisted dynamic pricing approach to optimize the overall spectrum efficiency while keeping the participating incentives of the users based on double-auction rules. The simulation results show that our proposed scheme not only approaches optimal spectrum efficiency but also has low pricing overhead compared to general continuous double auction mechanisms.

The remainder of this paper is organized as follows. The system model of dynamic spectrum allocation is described in Section II. In Section III, we formulate the spectrum allocation as pricing games based on the system model. In Section IV, the belief-based dynamic pricing approach is proposed for the optimal spectrum allocation. The simulation studies are provided in Section V. Finally, Section VI concludes this paper.

II. SYSTEM MODEL

We consider wireless networks where multiple primary users and secondary users operate simultaneously, which may represent various network scenarios. For instance, the primary users can be the spectrum broker connected to the core network and the secondary users the base stations equipped with cognitive radio technologies; or the primary users can be the access points of a mesh network and the secondary users the mobile devices. On one hand, considering that the authorized spectrum of primary users may not be fully utilized over time, they prefer to lease the unused channels to the secondary users for monetary gains. On the other hand, since the unlicensed spectrum becomes more and more crowded, the secondary users may try to lease some unused channels from primary users for more communication gains by providing leasing payments.

In our system model, we assume all users are selfish and rational, that is, their objectives are to maximize their own payoffs, not to cause damage to other users. However, users are allowed to cheat whenever they believe cheating behaviors can help them to increase their payoffs. Generally speaking, in order to acquire the spectrum licenses from regulatory bodies such as the FCC, the primary users have certain operating costs. In order to obtain the reward payoffs, secondary users want to utilize more spectrum resources. The selfishness of both primary and secondary users will prevent them from revealing their private information such as acquisition costs or reward payoffs, which makes traditional spectrum allocation approaches not applicable.

Specifically, we consider the collection of the available spectrum from all primary users as a spectrum pool, which consists of $N$ non-overlapping channels in total. Assume there are $J$ primary users and $K$ secondary users, indicated by the sets $P = \{p_1, p_2, \ldots, p_J\}$ and $S = \{s_1, s_2, \ldots, s_K\}$, respectively. We represent the channels authorized to primary user $p_i$ using a vector $A_i = \{a_i^j\}_{j \in \{1,2,\ldots,n_i\}}$, where $a_i^j$ represents the channel index in the spectrum pool and $n_i$ is the total number of channels which belong to user $p_i$. Define $A$ as the set of all the channels in the spectrum pool. Moreover, denote the acquisition costs of user $p_i$'s channels as the vector $C_i = \{c_i^{a_i^j}\}_{j \in \{1,2,\ldots,n_i\}}$, where the $j$th element represents the acquisition cost of the $j$th channel in $A_i$. For simplicity, we write $c_i^{a_i^j}$ as $c_i^j$. As for secondary user $s_i$, we define her/his payoff vector as $V_i = \{v_i^j\}_{j \in \{1,2,\ldots,N\}}$, where the $j$th element is the reward payoff if this user successfully leases the $j$th channel in the spectrum pool.

III. PRICING GAME MODEL

In this paper, we model the dynamic spectrum allocation problem as a pricing game to study the interactions among the players, i.e., the primary and secondary users. Based on the discussion in the previous section, we can write the payoff functions of the players in our dynamic game. Specifically, if primary user $p_i$ reaches agreements to lease all or part of her/his channels to secondary users, the payoff function of this primary user can be written as follows.

$$U_{p_i}(\phi_{A_i}, \alpha_i^{A_i}) = \sum_{j=1}^{n_i} (\phi_{a_i^j} - c_i^j)\,\alpha_i^{a_i^j}, \tag{1}$$

where $\phi_{A_i} = \{\phi_{a_i^j}\}_{j \in \{1,2,\ldots,n_i\}}$ and $\phi_{a_i^j}$ is the payment that user $p_i$ obtains from the secondary user by leasing the channel $a_i^j$ in the spectrum pool. Note that $\alpha_i^{A_i} = \{\alpha_i^{a_i^j}\}_{j \in \{1,2,\ldots,n_i\}}$ and $\alpha_i^{a_i^j} \in \{0, 1\}$, which indicates whether the $j$th channel of user $p_i$ has been allocated to a secondary user or not. For simplicity, we denote $\alpha_i^{a_i^j}$ as $\alpha_i^j$. Similarly, the payoff function of secondary user $s_i$ can be modeled as follows.
$$U_{s_i}(\phi_A, \beta_i^A) = \sum_{j=1}^{N} (v_i^j - \phi_j)\,\beta_i^j, \tag{2}$$

where $\phi_A = \{\phi_j\}_{j \in \{1,2,\ldots,N\}}$ and $\beta_i^A = \{\beta_i^j\}_{j \in \{1,2,\ldots,N\}}$. Note that $\beta_i^j \in \{0, 1\}$ indicates whether secondary user $s_i$ successfully leases the $j$th channel in the spectrum pool or not. Hence, the strategies of the primary users and secondary users are defined by $\alpha_i^{A_i}$ and $\beta_i^A$, respectively.

From the above discussion, we can see that the players may have conflicting interests. Specifically, the primary users want to earn as much payment as possible by leasing the unused channels, and the secondary users aim to accomplish their communication goals by providing the least possible payments for leasing the channels. Moreover, the spectrum allocation involves multiple channels over time. Therefore, the spectrum users involved in the spectrum allocation process form a multi-stage non-cooperative pricing game [9], [10]. Also, the selfish users will not reveal their private information to others unless some mechanisms have been applied to guarantee that it is not harmful to disclose the private information. Generally, such a non-cooperative game with incomplete information is difficult to study, as the players do not know the perfect strategy profile of others. However, based on our game setting, the well-developed auction theory [11] can be applied to formulate and analyze our pricing game.

In auction games [11], according to an explicit set of rules, the principals (auctioneers) determine resource allocation and prices on the basis of bids from the agents (bidders). In our spectrum allocation pricing game, the primary users (principals) attempt to sell the unused channels to the secondary users, and the secondary users (bidders) compete with each other to buy the permission to use the primary users' channels. Moreover, multiple primary and secondary users coexist, which indicates the double auction scenario [11], [12]. This means that not only the secondary users but also the primary users need to compete with each other to make the beneficial transactions possible by eliciting their willingness to pay or accept payments in the form of bids or asks.
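To make the payoff functions (1) and (2) and the conflict of interests concrete, the following minimal Python sketch evaluates both payoffs for one leasing agreement. The function names and all numbers are hypothetical and chosen only for illustration; for simplicity the sketch assumes a pool of two channels that are both owned by the same primary user, so the same payment vector can be passed to both functions.

```python
from typing import Sequence


def primary_payoff(payments: Sequence[float], costs: Sequence[float],
                   allocated: Sequence[int]) -> float:
    """Payoff (1): sum over the user's channels of (payment - cost) * alpha."""
    return sum((phi - c) * a for phi, c, a in zip(payments, costs, allocated))


def secondary_payoff(values: Sequence[float], payments: Sequence[float],
                     leased: Sequence[int]) -> float:
    """Payoff (2): sum over the pool of (reward payoff - payment) * beta."""
    return sum((v - phi) * b for v, phi, b in zip(values, payments, leased))


# Hypothetical numbers: the secondary user leases channel 0 for a payment of 18.
payments = [18.0, 25.0]  # phi_j for each channel in the pool
u_p = primary_payoff(payments, costs=[12.0, 20.0], allocated=[1, 0])
u_s = secondary_payoff(values=[30.0, 22.0], payments=payments, leased=[1, 0])
print(u_p, u_s, u_p + u_s)
```

In this example the payment cancels in the sum of the two payoffs, which equals the reward payoff minus the acquisition cost of the leased channel (30 - 12 = 18). The payment only decides how that surplus is split, which is why the two sides prefer opposite prices.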
Generally, the double auction mechanism is highly efficient, as in the New York Stock Exchange (NYSE) or the Chicago Mercantile Exchange (CME), and can respond dynamically to changing conditions of auction participants. However, in our spectrum allocation games, either powerful centralized authorities cannot be pre-assumed or the bandwidth of control channels is very limited. Therefore, we aim to develop an efficient pricing approach for spectrum allocation, which adapts to spectrum dynamics through simple message exchanges.

IV. DYNAMIC PRICING FOR EFFICIENT SPECTRUM ALLOCATION

A. Static Pricing Game and Competitive Equilibrium

Assume that the available channels from the primary users are leased for usage over a certain time period $T$. Also, we assume that the costs of the primary users and the reward payoffs of the secondary users remain unchanged over this period. Before this spectrum sharing period, we define a trading period $\tau$, within which the users exchange their bids and asks to reach agreements on spectrum usage. The time period $T + \tau$ is considered as one stage in our pricing game. We first study the interactions of the players in static pricing games. Note that the users' goals are to maximize their own payoff functions. As for the primary users, the optimization problem can be written as follows.

$$O(p_i) = \max_{\phi_{A_i},\,\alpha_i^{A_i}} U_{p_i}(\phi_{A_i}, \alpha_i^{A_i}), \quad \forall i \in \{1, 2, \ldots, J\} \tag{3}$$

$$\text{s.t.} \quad U_{\hat{s}_{a_i^j}}(\{\phi_{-a_i^j}, \phi_{a_i^j}\}, \beta^{A}_{\hat{s}_{a_i^j}}) \ge U_{\hat{s}_{a_i^j}}(\{\phi_{-a_i^j}, \tilde{\phi}_{a_i^j}\}, \beta^{A}_{\hat{s}_{a_i^j}}), \quad \hat{s}_{a_i^j} \ne 0,\ a_i^j \in A_i, \tag{4}$$

where $\tilde{\phi}_{a_i^j}$ is any feasible payment and $\phi_{-a_i^j}$ is the payment vector excluding the element of the payment for the channel $a_i^j$. Note that $\hat{s}_{a_i^j}$ is defined as follows.

$$\hat{s}_{a_i^j} = \begin{cases} s_k & \text{if } \beta_k^{a_i^j} = 1, \\ 0 & \text{if } \beta_k^{a_i^j} = 0,\ \forall k \in \{1, 2, \ldots, K\}. \end{cases} \tag{5}$$

Thus, (4) is the incentive compatible constraint [11]. It means that the secondary users have incentives to provide the optimal payment because they cannot obtain extra gains by cheating the primary users. Similarly, the optimization problem can be written for the secondary users as follows.

$$O(s_i) = \max_{\phi_A,\,\beta_i^A} U_{s_i}(\phi_A, \beta_i^A), \quad \forall i \in \{1, 2, \ldots, K\} \tag{6}$$

$$\text{s.t.} \quad U_{\hat{p}_i^j}(\{\phi_{-j}, \phi_j\}, \beta_i^A) \ge U_{\hat{p}_i^j}(\{\phi_{-j}, \tilde{\phi}_j\}, \beta_i^A), \quad \hat{p}_i^j \ne 0,\ \beta_i^j = 1, \tag{7}$$

where $\hat{p}_i^j$ is defined as

$$\hat{p}_i^j = \begin{cases} p_k & \text{if } \beta_i^j = 1,\ j \in A_k,\ \alpha_k^j = 1, \\ 0 & \text{otherwise}, \end{cases} \quad \forall k \in \{1, 2, \ldots, J\}. \tag{8}$$

Similarly, (7) is the incentive compatible constraint for the primary users, which guarantees that the primary users will give the usage permission of their channels to the secondary users so that they can receive the optimal payments. From (3) and (6), we can see that in order to obtain the optimal allocation and payments, a multi-objective optimization problem needs to be solved, which becomes extremely complicated because our game setting only involves incomplete information. Thus, in order to make this problem tractable, we analyze it from the game theory point of view.

Considering the double auction scenarios of our pricing game, the Competitive Equilibrium (CE) [12] is a well-known theoretical prediction of the outcomes. It is the price at which the number of buyers willing to buy is equal to the number of sellers willing to sell. Alternatively, the CE can also be interpreted as the point where supply and demand match [11]. We describe the supply and demand functions of spectrum resources in Figure 1. Note that the CE has also been proved to be Pareto optimal in stationary double auction scenarios [13].

B. Belief-Assisted Dynamic Pricing

Considering spectrum dynamics due to mobility, channel variations or wireless traffic variations, the secondary users' reward payoffs and the primary users' costs may change over time or spectrum. Thus, $c_i^j$ and $v_i^j$ need to be considered as random variables in dynamic scenarios.
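As a rough illustration of the competitive equilibrium described in Section IV-A, and of what has to be recomputed whenever the costs and reward payoffs are redrawn, the sketch below finds the CE of a simplified market. It assumes, purely for illustration, that channels are interchangeable and that each seller offers and each buyer wants exactly one channel; the paper's model allows channel-specific payoffs. The function name and the sample numbers are hypothetical.

```python
from typing import List, Optional, Tuple


def competitive_equilibrium(
    costs: List[float], values: List[float]
) -> Tuple[int, Optional[Tuple[float, float]]]:
    """Return (number of trades, CE price interval) for a single-unit market.

    Supply is the costs sorted in ascending order, demand is the values sorted
    in descending order; trades continue while demand is at least supply.
    """
    supply = sorted(costs)
    demand = sorted(values, reverse=True)
    k = 0
    while k < min(len(supply), len(demand)) and demand[k] >= supply[k]:
        k += 1
    if k == 0:
        return 0, None  # no mutually beneficial transaction exists
    low, high = supply[k - 1], demand[k - 1]
    if k < len(demand):
        low = max(low, demand[k])   # the first excluded buyer must not want to buy
    if k < len(supply):
        high = min(high, supply[k])  # the first excluded seller must not want to sell
    return k, (low, high)


trades, price_range = competitive_equilibrium(
    costs=[10.0, 15.0, 25.0, 30.0], values=[40.0, 28.0, 22.0, 12.0]
)
print(trades, price_range)
```

For these numbers, any price between 22 and 25 leaves exactly two sellers willing to sell and two buyers willing to buy, which is the supply-demand crossing. In a dynamic scenario the same computation would have to be repeated with the new costs and payoffs of each stage.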
Without loss of generality, we assume homogeneous game settings for the statistics of $c_i^j$ and $v_i^j$, which follow the probability density functions (PDFs) $f_c(c)$ and $f_v(v)$, respectively. Therefore, considering dynamic network conditions, we further model the spectrum sharing as a multi-stage dynamic pricing game. Let $\gamma$ be the discount factor of our multi-stage pricing game. Based on (3) and (6), the objective functions for the primary users and secondary users can be rewritten as follows.

Fig. 1: Illustration of supply and demand functions.

$$\tilde{O}(p_i) = \max_{\phi_{A_i,t},\,\alpha_{i,t}^{A_i}} E_{c_i^j, v_i^j}\Big[\sum_{t=1}^{\infty} \gamma^t\, U_{p_i,t}(\phi_{A_i,t}, \alpha_{i,t}^{A_i})\Big], \tag{9}$$

$$\tilde{O}(s_i) = \max_{\phi_{A,t},\,\beta_{i,t}^{A}} E_{c_i^j, v_i^j}\Big[\sum_{t=1}^{\infty} \gamma^t\, U_{s_i,t}(\phi_{A,t}, \beta_{i,t}^{A})\Big], \tag{10}$$

where the subscript $t$ indicates the $t$th stage of the multi-stage game. Generally speaking, there may exist some overall constraints on spectrum sharing, such as each secondary user's total budget for leasing spectrum resources or each primary user's total available spectrum supply. Under these constraints, the above problem needs to be further modeled as a dynamic programming process [14], [15] to obtain optimal sequential strategies. However, the major difficulty of dynamic spectrum sharing lies in how to efficiently and dynamically update the spectrum sharing strategies according to the changing network conditions based only on local information. Therefore, in this paper we do not assume the overall constraints and focus on developing a belief-assisted dynamic pricing approach, which can not only approach CE outcomes but also respond dynamically to networking dynamics while introducing only limited overhead.

Since our pricing game belongs to the non-cooperative games with incomplete information [9], the players need to build up certain beliefs about other players' possible future strategies to assist their decision making.
Considering that there are multiple players with private information in the pricing game, and that what directly affects the outcome of the game is the bid/ask prices, it is more efficient to define one common belief function based on the publicly observed bid/ask prices than to generate a specific belief about every other player's private information. Hence, enlightened by [12], we consider the primary/secondary users' beliefs as the ratio of their asks/bids being accepted at different price levels. At each time during the dynamic spectrum sharing, the ratio of asks from primary users at $x$ that have been accepted can be written as follows.

$$r_p(x) = \frac{\mu_A(x)}{\mu(x)}, \tag{11}$$

where $\mu(x)$ and $\mu_A(x)$ are the number of asks at $x$ and the number of accepted asks at $x$, respectively. Similarly, at each time during the dynamic spectrum sharing, the ratio of bids from secondary users at $y$ that have been accepted is

$$r_s(y) = \frac{\eta_A(y)}{\eta(y)}, \tag{12}$$

where $\eta(y)$ and $\eta_A(y)$ are the number of bids at $y$ and the number of accepted bids at $y$, respectively. Usually, $r_p(x)$ and $r_s(y)$ can be accurately estimated if a great number of buyers and sellers are participating in the pricing at the same time. However, in our pricing game, only a relatively small number of players are involved in the spectrum sharing at a specific time. The beliefs $r_p(x)$ and $r_s(y)$ therefore cannot be obtained in practice, so we need to further consider using the historical bid/ask information to build up empirical belief values.

Considering the characteristics of double auction, we have the following observations: if an ask $\tilde{x} < x$ is rejected, the ask at $x$ will also be rejected; if an ask $\tilde{x} > x$ is accepted, the ask at $x$ will also be accepted; if a bid $\tilde{y} > x$ is made, the ask at $x$ will also be accepted. Based on the above observations, the players' beliefs can be further defined as follows using the past bid/ask information.

Definition 1: Primary users' beliefs: for each potential ask at $x$, define

$$\hat{r}_p(x) = \begin{cases} 1 & x = 0, \\ \dfrac{\sum_{w \ge x} \mu_A(w) + \sum_{w \ge x} \eta(w)}{\sum_{w \ge x} \mu_A(w) + \sum_{w \ge x} \eta(w) + \sum_{w \le x} \mu_R(w)} & x \in (0, M), \\ 0 & x \ge M, \end{cases} \tag{13}$$

where $\mu_R(w)$ is the number of asks at $w$ that have been rejected, and $M$ is a large enough value so that asks greater than $M$ will not be accepted. Also, it is intuitive that the ask at 0 will definitely be accepted, as no cost is introduced.

Definition 2: Secondary users' beliefs: for each potential bid at $y$, define

$$\hat{r}_s(y) = \begin{cases} 0 & y = 0, \\ \dfrac{\sum_{w \le y} \eta_A(w) + \sum_{w \le y} \mu(w)}{\sum_{w \le y} \eta_A(w) + \sum_{w \le y} \mu(w) + \sum_{w \ge y} \eta_R(w)} & y \in (0, M), \\ 1 & y \ge M, \end{cases} \tag{14}$$

where $\eta_R(w)$ is the number of bids at $w$ that have been rejected. It is likewise intuitive that the bid at 0 will not be accepted by any primary user.
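The empirical beliefs of Definitions 1 and 2 can be sketched in Python as follows. The function names, the representation of the history as plain lists of prices, and the `default` value returned when no relevant history exists are assumptions made for this illustration only; the definitions themselves do not say how an empty history is handled.

```python
from typing import Sequence


def ask_belief(x: float, accepted_asks: Sequence[float],
               rejected_asks: Sequence[float], bids: Sequence[float],
               M: float, default: float = 0.5) -> float:
    """Empirical belief (13) that an ask at price x will be accepted."""
    if x <= 0:
        return 1.0
    if x >= M:
        return 0.0
    # Evidence for acceptance: accepted asks and bids at prices >= x.
    favorable = (sum(1 for w in accepted_asks if w >= x)
                 + sum(1 for w in bids if w >= x))
    # Evidence against acceptance: rejected asks at prices <= x.
    unfavorable = sum(1 for w in rejected_asks if w <= x)
    total = favorable + unfavorable
    return favorable / total if total else default  # default is an assumption


def bid_belief(y: float, accepted_bids: Sequence[float],
               rejected_bids: Sequence[float], asks: Sequence[float],
               M: float, default: float = 0.5) -> float:
    """Empirical belief (14) that a bid at price y will be accepted."""
    if y <= 0:
        return 0.0
    if y >= M:
        return 1.0
    # Evidence for acceptance: accepted bids and asks at prices <= y.
    favorable = (sum(1 for w in accepted_bids if w <= y)
                 + sum(1 for w in asks if w <= y))
    # Evidence against acceptance: rejected bids at prices >= y.
    unfavorable = sum(1 for w in rejected_bids if w >= y)
    total = favorable + unfavorable
    return favorable / total if total else default  # default is an assumption


# Hypothetical history of observed prices.
r_ask = ask_belief(25.0, accepted_asks=[20.0, 26.0], rejected_asks=[24.0, 30.0],
                   bids=[18.0, 25.0], M=100.0)
r_bid = bid_belief(22.0, accepted_bids=[25.0], rejected_bids=[18.0, 23.0],
                   asks=[20.0, 26.0, 24.0, 30.0], M=100.0)
print(r_ask, r_bid)
```

With this history, an ask at 25 is supported by one accepted ask (26) and one bid (25) and contradicted by one rejected ask (24), giving a belief of 2/3. A bid at 22 is supported by one ask (20) and contradicted by one rejected bid (23), giving 1/2.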
Noting that it is too costly to build up beliefs on every possible bid or ask price, we can update the beliefs only at some fixed prices and use interpolation to obtain the belief function over the price space. Moreover, only local information is needed for the users to update their beliefs, though public information may accelerate the belief-updating process.

Before using our defined belief functions to assist the strategy decisions, we first look at the Spread Reduction Rule (SRR) of double auction mechanisms. Generally, before the double auction pricing game converges to the CE, there may exist a gap between the highest bid and the lowest ask, which is called the spread of the double auction. The SRR states that any ask that is permissible must be lower than the current lowest ask, i.e., the outstanding ask [12], and then each new ask either results in an agreed transaction or becomes the new outstanding ask. A similar argument can be applied to bids.

TABLE I: Belief-assisted dynamic spectrum allocation
1. Initialize the users' beliefs and bids/asks: The primary users initialize their asks as large values close to $M$ and their beliefs as small positive values less than 1; the secondary users initialize their bids as small values close to 0 and their beliefs as small positive values less than 1.
2. Belief update based on local information: Update the primary and secondary users' beliefs using (13) and (14), respectively.
3. Optimal bid/ask update: Obtain the optimal ask for each primary user by solving (16); obtain the optimal bid for each secondary user by solving (17).
4. Update leasing agreement and spectrum pool: If the outstanding bid is greater than or equal to the outstanding ask, the leasing agreement will be signed between the corresponding users; update the spectrum pool by removing the assigned channel.
5. Iteration: If the spectrum pool is not empty, go back to Step 2.

By defining the current outstanding ask and bid as $ox$ and $oy$, respectively, we let $\tilde{r}_p(x) = \hat{r}_p(x) \cdot I_{[0,ox)}(x)$ for each $x$ and $\tilde{r}_s(y) = \hat{r}_s(y) \cdot I_{(oy,M]}(y)$ for each $y$, which are the modified belief functions considering the SRR.
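The two ideas above, keeping beliefs only at a few fixed prices and masking them with the SRR, can be sketched as follows. The text only says "interpolation"; piecewise-linear interpolation is an assumption of this sketch, as are the function names, the price grid and the belief values.

```python
import bisect
from typing import Sequence


def interpolate_belief(prices: Sequence[float], beliefs: Sequence[float],
                       x: float) -> float:
    """Piecewise-linear interpolation of beliefs stored on an ascending grid."""
    if x <= prices[0]:
        return beliefs[0]
    if x >= prices[-1]:
        return beliefs[-1]
    k = bisect.bisect_right(prices, x)  # prices[k - 1] <= x < prices[k]
    x0, x1 = prices[k - 1], prices[k]
    y0, y1 = beliefs[k - 1], beliefs[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def srr_ask_belief(prices: Sequence[float], beliefs: Sequence[float],
                   x: float, outstanding_ask: float) -> float:
    """Ask belief masked by the SRR: zero unless 0 <= x < outstanding ask."""
    if 0 <= x < outstanding_ask:
        return interpolate_belief(prices, beliefs, x)
    return 0.0


def srr_bid_belief(prices: Sequence[float], beliefs: Sequence[float],
                   y: float, outstanding_bid: float, M: float) -> float:
    """Bid belief masked by the SRR: zero unless outstanding bid < y <= M."""
    if outstanding_bid < y <= M:
        return interpolate_belief(prices, beliefs, y)
    return 0.0


# Hypothetical grid of fixed prices with ask beliefs stored at each one.
grid = [0.0, 10.0, 20.0, 30.0, 40.0]
ask_beliefs = [1.0, 0.9, 0.6, 0.2, 0.0]
print(srr_ask_belief(grid, ask_beliefs, 25.0, outstanding_ask=28.0))
print(srr_ask_belief(grid, ask_beliefs, 29.0, outstanding_ask=28.0))
```

An ask at 25 lies halfway between the grid prices 20 and 30, so its belief is the midpoint of 0.6 and 0.2, while an ask at 29 is not permissible under the SRR because it is not lower than the outstanding ask of 28, and its belief is therefore zero.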
Note that $I_{(a,b)}(x)$ is defined as

$$I_{(a,b)}(x) = \begin{cases} 1 & \text{if } x \in (a, b); \\ 0 & \text{otherwise}. \end{cases} \tag{15}$$

By using the belief function $\tilde{r}_p(x)$, the payoff maximization of selling the $i$th primary user's $j$th channel can be written as

$$\max_{x \in (oy, ox)} E[U_{p_i}(x, j)], \tag{16}$$

where $U_{p_i}(x, j)$ represents the payoff introduced by allocating the $j$th channel when the ask is $x$, and then $E[U_{p_i}(x, j)] = (x - c_i^j) \cdot \tilde{r}_p(x)$. Similarly, as for the secondary user $s_i$, the payoff maximization of leasing the $j$th channel in the spectrum pool can be written as

$$\max_{y \in (oy, ox)} E[U_{s_i}(y, j)], \tag{17}$$

where $U_{s_i}(y, j)$ represents the payoff introduced by leasing the $j$th channel in the spectrum pool when the bid is $y$, and then $E[U_{s_i}(y, j)] = (v_i^j - y) \cdot \tilde{r}_s(y)$. Therefore, by solving the optimization problems (16) and (17), primary and secondary users, respectively, can make the optimal decision of spectrum allocation at every stage conditional on the dynamic spectrum demand and supply. Based on the above discussions, we illustrate our belief-assisted dynamic pricing algorithm for spectrum allocation in Table I.

V. SIMULATION RESULTS

In this section, we evaluate the performance of the proposed belief-assisted dynamic spectrum sharing approach in wireless networks. Considering a wireless network covering a 100 × 100 area, we simulate $J$ primary users by randomly placing them in the network. These primary users can be the base stations serving different wireless network operators or different access points in a mesh network. Here we assume the primary users' locations are fixed and their unused channels are available to the secondary users within a distance of 50. Then, we randomly deploy $K$ secondary users in the network, which are assumed to be mobile devices. The mobility of the secondary users is modeled using a simplified random waypoint model [16], where we assume the thinking time at each waypoint is close to the effective duration of one channel-leasing agreement, the waypoints are uniformly distributed within a distance of 10, and the traveling time is much smaller than the thinking time. Let the cost of an available channel in the spectrum pool be uniformly distributed in [10, 30] and the reward payoff of leasing one channel be uniformly distributed in [20, 40]. If a channel is not available to some secondary users, let the corresponding reward payoffs of this channel be 0. Note that $J = 5$ and $10^3$ pricing stages have been simulated. Let $n_i = 4$, $\forall i \in \{1, 2, \ldots, J\}$, and $\gamma = 0.99$. In our simulation, the local bid/ask information within the transmission range of each node is used for belief construction and update.

Fig. 2: Comparison of the total payoff for the proposed scheme and theoretical Competitive Equilibrium.

In Figure 2, we compare the total payoff of all users under our proposed approach with that of the theoretical CE outcomes for different numbers of secondary users. It can be seen from this figure that the performance loss of our approach is very limited compared to the theoretically optimal solutions. Moreover, when the number of secondary users increases, our approach is able to approach the optimal CE. This is because the belief function reflects the spectrum demand and supply more accurately when more users are involved in spectrum sharing.

Now we study the overhead of our pricing approach. Here we measure the pricing overhead by the average number of bids and asks for each stage. In Figure 3, the overhead of our pricing approach is compared to that of the traditional continuous double auction when the same total payoff is achieved. Assume the minimal bid/ask step $\delta$ of the continuous double auction to be 0.01.
It can be seen from the figure that our approach substantially decreases the pricing communication overhead. Note that while decreasing the overhead, our proposed approach may introduce extra complexity to update the beliefs.

Fig. 3: Comparison of the overhead between the proposed scheme and continuous double auction scheme.

VI. CONCLUSIONS

In this paper, we have studied dynamic pricing for efficient spectrum allocation in wireless networks with selfish users. We model the dynamic spectrum allocation as a multi-stage game and propose a belief-assisted dynamic pricing approach to maximize the users' payoffs while providing them with participating incentives via double auction rules. Simulation results show that the proposed scheme can approach the optimal spectrum efficiency with only limited pricing overhead.

REFERENCES

[1] FCC, "Spectrum policy task force report," FCC Document ET Docket No. 02-135, November 2002.
[2] FCC, "Facilitating opportunities for flexible, efficient, and reliable spectrum use employing cognitive radio technologies: notice of proposed rule making and order," FCC Document ET Docket No. 03-108, December 2003.
[3] R. J. Berger, "Open spectrum: a path to ubiquitous connectivity," FCC ACM Queue 1, 3, May 2003.
[4] J. M. Peha, "Approaches to spectrum sharing," IEEE Communications Magazine, vol. 43, pp. 10-12, February 2005.
[5] M. M. Buddhikot, "DIMSUMnet: new directions in wireless networking using coordinated dynamic spectrum access," in Proc. of IEEE WoWMoM 05, 2005.
[6] C. Peng, H. Zheng, and B. Y. Zhao, "Utilization and fairness in spectrum assignment for opportunistic spectrum access," to appear in Mobile Networks and Applications (MONET), 2006.
[7] L. Cao and H. Zheng, "Distributed spectrum allocation via local bargaining," in Proc. of IEEE DySPAN, 2005.
[8] R. Etkin, A. Parekh, and D. Tse, "Spectrum sharing for unlicensed bands," in Proc. of IEEE DySPAN, 2005.
[9] M. J. Osborne and A. Rubinstein, A Course in Game Theory, The MIT Press, Cambridge, Massachusetts, 1994.
[10] D. Fudenberg and J. Tirole, Game Theory, The MIT Press, Cambridge, Massachusetts, 1991.
[11] V. Krishna, Auction Theory, Academic Press, 2002.
[12] S. Gjerstad and J. Dickhaut, "Price formation in double auctions," Games and Economic Behavior, vol. 22, pp. 1-29, 1998.
[13] L. Hurwicz, R. Radner, and S. Reiter, "A stochastic decentralized resource allocation process: Part," Econometrica, vol. 43, pp. 363-393, 1975.
[14] D. Bertsekas, Dynamic Programming and Optimal Control, vol. 1, 2, Athena Scientific, Belmont, MA, second edition, 2001.
[15] Z. Ji, W. Yu, and K. J. R. Liu, "An optimal dynamic pricing framework for autonomous mobile ad hoc networks," in Proc. of IEEE INFOCOM 06, 2006.
[16] D. B. Johnson and D. A. Maltz, "Dynamic source routing in ad hoc wireless networks, mobile computing," IEEE Transactions on Mobile Computing, pp. 153-181, 2000.