Online Learning in Autonomic Multi-Hop Wireless Networks for Transmitting Mission-Critical Applications


IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 28, NO. 5, JUNE 2010

Hsien-Po Shiang and Mihaela van der Schaar, Fellow, IEEE

Abstract: In this paper, we study how to optimize the transmission decisions of nodes aimed at supporting mission-critical applications, such as surveillance, security monitoring, and military operations. We focus on a network scenario where multiple source nodes simultaneously transmit mission-critical data through relay nodes to one or multiple destinations in multi-hop wireless Mission-Critical Networks (MCN). In such a network, the wireless nodes can be modeled as agents that can acquire local information from their neighbors and, based on this available information, can make timely transmission decisions to minimize the end-to-end delays of the mission-critical applications. Importantly, the MCN needs to cope in practice with time-varying network dynamics. Hence, the agents need to make transmission decisions by considering not only the current network status, but also how the network status evolves over time, and how this is influenced by the actions taken by the nodes. We formulate the agents' autonomic decision making problem as a Markov decision process (MDP) and construct a distributed MDP framework, which takes into consideration the informationally-decentralized nature of the multi-hop MCN. We further propose an online model-based reinforcement learning approach for agents to solve the distributed MDP at runtime, by modeling the network dynamics using priority queuing. We compare the proposed model-based reinforcement learning approach with other model-free reinforcement learning approaches in the MCN. The results show that the proposed model-based reinforcement learning approach for mission-critical applications not only outperforms myopic approaches without learning capability, but also outperforms conventional model-free reinforcement learning approaches.
Index Terms: Multi-user mission-critical transmission, autonomic multi-hop wireless networks, distributed Markov decision process, online reinforcement learning.

Manuscript received 4 April 2009; revised 22 December 2009. This work was funded by an ONR grant and by NSF CAREER CCF grant. Hsien-Po Shiang and Mihaela van der Schaar are with the Department of Electrical Engineering, UCLA (hpshiang@ucla.edu, mihaela@ee.ucla.edu). Digital Object Identifier 10.1109/JSAC.2010.1006xx

I. INTRODUCTION

A PLETHORA of mission-critical applications such as battlefield videoconferencing, surveillance and security monitoring are emerging, e.g. in SOSANETs [], where real-time response and actions to the acquired critical data become vital. This critical data needs to be reliably and timely relayed to one or multiple decision makers, possibly located at different destinations. To connect the various sources to the destinations, a rapidly deployable solution can be provided using multi-hop autonomic wireless networks. A key advantage of such flexible infrastructures is that the same network can be re-used and reconfigured to relay critical data to multiple destinations. The mission-critical applications require the network to support various transmission priorities, security, robustness requirements, and stringent transmission delay deadlines [6][8]. In this paper, we focus on minimizing the network delays of the mission-critical applications, and rely on related work (such as [8][9]) for the security and reliability requirements of the mission-critical applications.

Autonomic wireless networks are composed of autonomic wireless nodes (also interchangeably referred to as agents in this paper) endowed with the capability of individually sensing the network environment, learning the dynamic network changes based on their local information, and promptly adapting their transmission actions in an autonomous manner to optimize the utility of the applications which they are serving [].
The dynamic network changes include variations in network topology, wireless channel conditions, application requirements, etc. When these network dynamics occur, the autonomic nodes can self-configure themselves and immediately react to these changes, without the need of propagating messages back and forth to a centralized coordinator. Autonomic wireless networks are especially suitable for mission-critical applications, since the autonomic behavior allows the wireless nodes to promptly discover local network changes and instantaneously react to these changes, such that the important data packets they are relaying will arrive at their destinations within their delay deadlines. Moreover, autonomic wireless nodes endowed with online learning capabilities can successfully model the network dynamics and foresightedly adapt their packet transmission to maximize the utility of the mission-critical applications.

In the MCN, the autonomic nodes need to coordinate their transmission decisions [7]. For example, in [25], it is shown that performance degradation is unavoidable if the agents do not optimize their routing decisions in a cooperative manner. In [26][27], the Network Utility Maximization (NUM) framework is introduced and it is shown that by allowing agents to cooperatively exchange information, they can optimize their transmission actions in a distributed manner, such that a Pareto-efficient solution can be reached. However, such solutions assume a static network setting and they cannot address the dynamic nature of the MCN. Dynamic transmission policies based on local information feedback are proposed (for example, based on QoS state information [2] and queuing backpressure [4][5]), which ensure that the delays of the mission-critical applications are bounded as long as the rate allocations are inside the capacity region of the network. However, computing the capacity region requires a high computational complexity [32] and, moreover, does not guarantee that the required delay constraints of the mission-critical applications are met. In [3], a QoS-aware protocol with a priority-based queuing model was proposed to support real-time traffic in wireless sensor networks. The protocol allocates energy-efficient paths to the applications that meet their end-to-end delay requirements. Other alternative QoS-aware solutions can be found in [] for supporting various applications in wireless sensor networks. However, most of these solutions are mainly concerned with minimizing the energy consumption.

Importantly, in the distributed setting, an agent's decision impacts and is impacted by the decisions of the neighboring agents. We refer to this coupling effect as the spatial dependency among the agents. Although the above-mentioned solutions consider the spatial dependency, they only react to the network changes in a myopic way. They merely optimize the transmission decisions based only on the information about the current network status and application requirements. In the dynamic MCN, however, the agents need to adopt foresighted adaptation by considering not only the immediate network status, but also how the network status evolves over time (referred to as the network dynamics in this paper), in order to make optimal transmission decisions. Hence, in addition to the spatial dependency, agents also need to consider the temporal dependency among their sequential decisions (performed over time). Moreover, in practice, the network dynamics may not be known.
Reinforcement learning solutions have been proposed for the nodes to learn the network dynamics and optimize the performance in routing [3] and admission control [3] solutions at runtime. However, these solutions do not minimize the delays of the mission-critical applications. Moreover, the majority of these solutions focus on model-free reinforcement learning approaches, which are not suitable for the mission-critical applications due to their slow convergence rates [5]. In summary, there is no integrated framework that considers the spatio-temporal dependencies among the agents in the MCN to minimize the end-to-end delays of the mission-critical applications, based on application priorities, packet-based delay deadlines, and the network dynamics.

In this paper, we provide a systematic framework based on which agents (the nodes in the MCN) can optimize their cross-layer transmission actions and minimize the delays of the mission-critical applications, while considering the spatio-temporal dependencies among their actions. We assume that all the source and relay nodes are able to make their own cross-layer transmission decisions, which are the packet-based scheduling decisions in the application layer and the routing decisions in the network layer. In [28], it has been shown that Markovian models (e.g. finite-state Markov model [29]) can be applied for both traffic state transition and channel state transition. Also in [3], it was shown that routing protocols in mobile ad hoc networks can be further improved by allowing the agents to make their decisions using a Markov Decision Process (MDP) [8]. Based on the MDP, the agents are able to forecast the future network status and optimize their cross-layer transmission actions that consider the MCN dynamics. However, unlike in [3], which focuses on optimizing the overall throughput of the network, in this paper, the agents minimize the expected end-to-end delays of the mission-critical applications. The expected end-to-end delay is referred to in this paper as the MDP delay value.
Overall, the paper makes the following contributions:

1) Distributed MDP framework that considers the spatio-temporal dependencies in MCN. To account for the dynamic nature of the MCN, we construct an MDP framework which minimizes the MDP delay values of the mission-critical applications. To address the informationally-decentralized nature of the multi-hop MCN, the MDP needs to be formulated in a distributed manner, such that each agent in the MCN can deploy its own cross-layer transmission policy based on only local information exchanges with its neighboring agents. The proposed distributed MDP minimizes the delays of the mission-critical applications while capturing the spatio-temporal dependency in the MCN.

2) Model-based online learning approach to solve the distributed MDP in MCN. We propose an online model-based learning approach for the agents in MCN to solve the distributed MDP at runtime, when the network dynamics are unknown. Unlike the conventional model-free reinforcement learning approaches for solving MDPs (as in [6][7]), the proposed model-based learning algorithm adopts a preemptive-repeat priority M/G/1 queuing model [2], which enables a faster convergence rate and shorter delays for the mission-critical applications. The upper and lower bounds of the resulting MDP delay value are provided to verify the accuracy of the proposed model-based online learning approach at different network locations. Moreover, we compare the proposed model-based reinforcement learning approach with the model-free reinforcement learning approaches in terms of delay performance, computational complexity, and the required information exchange overheads.

This paper is organized as follows. In Section II, we discuss the network settings and the cross-layer transmission actions of the autonomic wireless nodes, and formulate the autonomic decision making problem in the MCN. In Section III, we discuss the distributed MDP framework that addresses both the dynamic and informationally-decentralized nature of the MCN.
In Section IV, we propose a model-based online learning approach for the autonomic wireless nodes to solve the distributed MDP at runtime, which is suitable for the mission-critical applications. Section V provides simulation results and Section VI concludes the paper.

II. AUTONOMIC DECISION MAKING PROBLEM FORMULATION IN MCN

A. Mission-critical application characteristics

Unlike most cross-layer design papers that consider only a single application, we assume that there are multiple sources simultaneously transmitting delay-critical information over the MCN. Let $\mathcal{V} = \{V_i\}$ represent the set of the mission-critical applications. We assume that the packets of an application $V_i$ are prioritized into $K_i$ priority classes. The total number of priority classes in the network is $K = \sum_{i=1}^{V} K_i$. Let $\{C_k, k = 1, \ldots, K\}$ represent all the priority classes in the network. In the subsequent part of the paper, we label the $K$ classes (across all applications) in descending order of their priorities, i.e. $C_1$ is the highest priority class. A priority class $C_k$ is characterized by the parameters $\{D_k, R_k, L_k\}$:

- $D_k$ represents the delay deadline of the packets in class $C_k$. A packet of a mission-critical application is useful only if it is received at the destination before its delay deadline.
- $R_k$ is the average source rate of the packets in class $C_k$. Based on the source rate, the source node generates a certain number of packets per unit time, which impacts the traffic load of the MCN.
- $L_k$ is the average packet length of the packets in class $C_k$, which directly impacts the packet error rate and the transmission rate of sending a class $C_k$ packet.

Let $Delay_k$ represent the end-to-end delay that is required for the transmission of the traffic in class $C_k$. These required delays are mandated by the mission and the deployed applications, and the MCN agents need to prioritize the traffic and minimize their end-to-end delays according to the assigned priorities [8]. For example, in a battlefield mission-critical network, instructions from a command center are mission-critical and should have higher priority than any other traffic, e.g. response notification, surveillance results, etc.

B. Multi-hop MCN settings

The MCN is represented by a network graph $G(\mathcal{V}, \mathcal{M}, \mathcal{E})$, where $\mathcal{M} = \{m_1, \ldots, m_M\}$ represents the set of agents and $\mathcal{E} = \{e_1, \ldots, e_E\}$ represents a set of edges (transmission links) that connect the various agents. There are two types of agents defined in this paper:

1) Autonomic Source Agents (ASs).
Each AS generates a mission-critical application and would like to transmit the application to a predetermined destination node.

2) Autonomic Relay Agents (ARs). ARs relay the packets from the AS to the corresponding destination node. Unlike the ASs, the ARs do not generate their own traffic. They make their cross-layer transmission decisions and forward the packets for the ASs.

To enable us to better discuss the various networking solutions, we label the agents using a directed acyclic graph [3] as shown in Figure 1, which consists of $H$ hops from the ASs to the destination nodes (see footnote 2). We assume that $M_h$ is the number of agents at the $h$-th hop ($1 \le h \le H$), and $M_1 = M_H = V$. Each agent at the $h$-th hop will be tagged with a distinct number $m_h$ ($1 \le m_h \le M_h$). Let $\mathcal{M}_h \subseteq \mathcal{M}$ represent the set of agents at the $h$-th hop. The agent $m_h$ processes a priority queue and it can only transmit the packets in the queue to a subset of ARs in $\mathcal{M}_{h+1}$. Through periodic information exchange (e.g. hello message exchange in [24]), we assume that each agent $m_h$ knows the existence of its neighboring nodes (i.e. the other agents $m'_h \in \mathcal{M}_h$ in the same hop and the agents $m_{h+1} \in \mathcal{M}_{h+1}$ in the next hop), as well as the interference matrix [2] of the current hop that defines whether or not two different links of neighboring nodes can transmit simultaneously.

Fig. 1. Considered multi-hop wireless network []. (Figure labels: mission-critical priority classes $C_1, \ldots, C_K$; ASs; ARs at hop $h$ and hop $h+1$; destinations $m_H$.)

Footnote 1 (on the priority class parameters of Section II.A): We refer the interested readers to our previous work [2] for more details on these parameters.

Footnote 2: Note that such a directed acyclic network can be deployed over any physical network topology as an overlay network (see [3] for more details about how to deploy the directed acyclic graph over a multi-hop wireless network).

C. Effective transmission rate over the multi-hop MCN

We denote the maximum transmission rate over the link $(m_h, m_{h+1})$ as $T_{k,m_h,m_{h+1}}$ for traffic class $C_k$.
Assuming a memoryless packet erasure channel as in [2][2], and given the Signal-to-Interference-Noise-Ratio (SINR) $x_{m_h,m_{h+1}}$, we can compute the packet error rate $p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}})$ over the link. If the agent $m_h$ selects $m_{h+1}$ as its next relay, the effective transmission rate (goodput) can be approximated using the sigmoid function [2]:

$$T^{goodput}_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}}) = T_{k,m_h,m_{h+1}} \left(1 - p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}})\right), \qquad p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}}) = \frac{1}{1 + e^{\zeta (x_{m_h,m_{h+1}} - \delta)}}, \qquad (1)$$

where $\zeta$ and $\delta$ are constants corresponding to the modulation and coding schemes for a given packet length $L_k$. This goodput is determined by the actions of the agent $m_h$, which influences the delay of the applications (see Section III.A for more details).

D. Actions of the autonomic wireless nodes

An agent's cross-layer transmission action varies when transmitting different priority class traffic. Denote $A_{m_h} = \{A_{k,m_h}, \forall C_k\}$ as the cross-layer transmission action of agent $m_h$, where $A_{k,m_h} = \{\pi_{k,m_h}, \beta_{k,m_h,m_{h+1}}, m_{h+1} \in \mathcal{M}_{h+1}\} \in \mathcal{A}_{m_h}$ represents the action of agent $m_h$ when sending packets in class $C_k$, and $\mathcal{A}_{m_h}$ represents the set of feasible actions for the agent $m_h$. In this paper, we assume that the cross-layer transmission action includes the application layer packet scheduling $\pi_{k,m_h}$ of transmitting packets in class $C_k$, and the network layer relay selection parameter $\beta_{k,m_h,m_{h+1}}$, which determines the probability of selecting a node $m_{h+1} \in \mathcal{M}_{h+1}$ in the next hop as the next relay. Denote $A = \{A_{m_h}, \forall m_h \in \mathcal{M}\}$ as the actions of all the agents in the MCN. Note that the delay $Delay_k(A)$ of packets in class $C_k$ is a function of all agents' actions.
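As a rough illustration of the sigmoid goodput approximation in Section II.C, the following Python sketch evaluates the packet error rate and the resulting goodput for a single link. The function names and all numeric values ($\zeta$, $\delta$, the link rate, the SINR points) are hypothetical placeholders chosen for the example; the paper does not give them in this section.

```python
import math


def packet_error_rate(sinr, zeta, delta):
    """Sigmoid approximation p(x) = 1 / (1 + exp(zeta * (x - delta)))."""
    return 1.0 / (1.0 + math.exp(zeta * (sinr - delta)))


def goodput(max_rate, sinr, zeta, delta):
    """Effective rate T * (1 - p(x)) over one link for one traffic class."""
    return max_rate * (1.0 - packet_error_rate(sinr, zeta, delta))


# Hypothetical constants; in practice zeta and delta depend on the
# modulation and coding scheme and on the packet length L_k.
ZETA, DELTA = 0.5, 15.0
MAX_RATE = 6.0e6  # assumed maximum link rate in bits per second

for sinr in (10.0, 15.0, 25.0):
    print(sinr, goodput(MAX_RATE, sinr, ZETA, DELTA))
```

At `sinr == DELTA` the error rate is exactly 0.5, so the goodput is half the maximum rate; above that point the goodput rises towards the maximum rate and below it falls towards zero.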

E. Problem formulation

In this subsection, we discuss several ways to determine the cross-layer transmission decisions for transmitting the mission-critical applications over the MCN.

- Centralized decision making

The majority of the cross-layer design papers assume a centralized optimization, in which a central controller collects global network information $\mathcal{G}$ and makes transmission decisions for all the agents in the MCN. Since minimizing end-to-end delay is the key objective in the MCN, the centralized optimization needs to minimize the end-to-end delays for the various applications [2][3]. An advantage of such a delay-driven approach is that the optimization only needs to be done for the higher priority classes, and the packets of the lower priority classes can be simply dropped if their delay constraints cannot be met (see footnote 3). Let $a_k = [A_{k,m_h}, \forall m_h \in \mathcal{M}]$ represent the actions of all the agents sending traffic class $C_k$. The actions for transmitting the priority class $C_k$ can be computed after the actions for the higher priority classes $\{a_1, \ldots, a_{k-1}\}$ are determined, and the action $a_k$ will not affect any of the actions for $\{a_1, \ldots, a_{k-1}\}$. Specifically, the following delay-constrained optimization is considered for the priority class $C_k$:

$$a^{opt}_k = \arg\min_{a_k} Delay_k(a_k, \{a_1, \ldots, a_{k-1}\}, \mathcal{G}) \quad \text{s.t.} \quad Delay_k(a_k, \{a_1, \ldots, a_{k-1}\}, \mathcal{G}) \le D_k \qquad (2)$$

However, in mission-critical applications, which have stringent delay deadlines, it is impractical to assume that the global information $\mathcal{G}$ can be gathered in time at a central controller. Hence, it is important to decompose the optimization in equation (2) in such a way that each agent $m_h$ can make timely decisions based on local information $\mathcal{L}_{m_h}$.
- Distributed decision making for the agent $m_h$

Let $E[Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h})]$ represent the expected delay from $m_h$ to the destination node of the traffic class $C_k$, which is a function of the transmission action $A_{k,m_h}$ and its local information $\mathcal{L}_{m_h}$. Let $Delay^{PASS}_{k,m_h}$ represent the delay that has already passed when the class $C_k$ packet arrives at the agent $m_h$. This can be computed based on the information that is encapsulated in the packet header. Since agent $m_h$ cannot influence $Delay^{PASS}_{k,m_h}$, it can only minimize the delay for the highest priority class $C_k$ in its queue using the following optimization [2]:

$$A^{opt}_{k,m_h}(\mathcal{L}_{m_h}) = \arg\min_{A_{k,m_h}} E[Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h})] \quad \text{s.t.} \quad Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h}) \le D_k - Delay^{PASS}_{k,m_h} \qquad (3)$$

Figure 2(a) illustrates this conventional distributed decision making. First, the agent evaluates the utility (i.e. the expected delay $E[Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h})]$), which it can obtain from taking various actions based on the local information $\mathcal{L}_{m_h}$. Then, the agent determines its transmission action by solving the optimization in equation (3). The required local information $\mathcal{L}_{m_h}$ for computing $E[Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h})]$ will be discussed later in Section III.B. However, due to the dynamic nature of the MCN, the gathered local information is changing over time.

Fig. 2. (a) Conventional distributed decision making of an agent. (b) Proposed foresighted decision making of an agent. (Block labels in both panels: input rate and SINR from the wireless network (other agents); gather local information; determine transmission action. Panel (a) uses a utility evaluation; panel (b) adds the state and a future utility evaluation.)

Footnote 3: The action $A_{k,m_h} = \{\beta_{k,m_h,m_{h+1}}, m_{h+1} \in \mathcal{M}_{h+1}\}$ hereafter does not include the application layer scheduling, since the highest priority packet is selected to be transmitted. To simplify the notation, we use the same notation for the cross-layer transmission actions and assume that the class $C_k$ is the highest priority class existing in the queue of the agent $m_h$ when taking the action $A_{k,m_h}$.
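The myopic decision in equation (3) can be sketched in a few lines of Python. This is a simplified illustration under two assumptions that are not in the paper: the action is restricted to a deterministic choice of one next relay (rather than selection probabilities), and the deadline constraint is checked against the expected delay. The function name and the relay identifiers are hypothetical.

```python
def myopic_relay(expected_delay, deadline, delay_passed):
    """Pick the next relay with the smallest expected delay to the destination.

    expected_delay: dict mapping a relay id to the expected delay (seconds)
                    from this agent to the destination via that relay.
    deadline:       delay deadline D_k of the packet's class.
    delay_passed:   delay already accumulated, read from the packet header.
    Returns the chosen relay id, or None if no relay can meet the remaining
    delay budget (the packet would then be dropped).
    """
    budget = deadline - delay_passed
    feasible = {r: d for r, d in expected_delay.items() if d <= budget}
    if not feasible:
        return None
    return min(feasible, key=feasible.get)


# Hypothetical example: two candidate relays in the next hop.
choice = myopic_relay({"relay_a": 0.03, "relay_b": 0.02},
                      deadline=0.10, delay_passed=0.05)
print(choice)
```

Because the rule only looks at the current delay estimates, it ignores how the chosen relay changes future queue states; the foresighted formulation addresses exactly that.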
Hence, it is important for the agents to consider not only the current expected delay, but also the future expected delay as the network dynamics evolve. Figure 2(b) illustrates how an agent anticipates the evolution of the network dynamics by considering the impact of its current transmission action on the future network state (which will be defined in Section III.A), and based on it, makes foresighted transmission decisions to transmit mission-critical applications. Next, we formulate this foresighted decision making of an agent in the MCN.

- Proposed foresighted decision making for the agent $m_h$

Assume $E[Delay^t_{k,m_h}]$ is the expected delay of agent $m_h$ at the current service interval $t$. Given the current local information $\mathcal{L}^t_{m_h}$, agent $m_h$ makes foresighted decisions by taking into account the impact of its actions not only on the current expected delay, but also on the discounted expected delays in the future service intervals, i.e.

$$\mu_{k,m_h}(\mathcal{L}^t_{m_h}) = \arg\min_{A_{k,m_h}} \left\{ \sum_{t'=t}^{\infty} \gamma^{t'-t} E[Delay^{t'}_{k,m_h}(A_{k,m_h}, \mathcal{L}^{t'}_{m_h})] \right\} \qquad (4)$$

where $0 \le \gamma < 1$ (see footnote 4) represents the discount factor that decreases the utility impact of the later transmitted packets. If the discount factor $\gamma = 0$, the optimization in equation (4) becomes a myopic decision making, similar to the one in [2]. We refer to the function $\mu_{k,m_h}(\mathcal{L}_{m_h})$ as the cross-layer transmission policy given the local information $\mathcal{L}_{m_h}$. In the next section, we will discuss how to compute this cross-layer transmission policy.

Footnote 4: $1 - \gamma$ can be regarded as the probability that the priority class ends in a certain service interval. Note that different discount factors $\gamma_k$ can be considered for different priority classes. However, to simplify the exposition, we consider here the same $\gamma$ for all priority classes.

III. DISTRIBUTED MARKOV DECISION PROCESS FRAMEWORK

In this section, we discuss how to systematically compute the cross-layer transmission policy $\mu_{k,m_h}(\mathcal{L}_{m_h})$ for the agents in the MCN. First, we define the state of the agents in Section III.A. Then, in Section III.B, we propose the distributed MDP which allows all the agents to make their own decisions.

A. States of the autonomic wireless nodes

We define the network state at agent $m_h$ as $s_{m_h} = \{[\eta_{k,m_h}, \forall C_k], [x_{m_h,m_{h+1}}, \forall (m_h, m_{h+1})]\} \in \mathcal{X}_{m_h}$, where $x_{m_h,m_{h+1}}$ represents the channel condition (see Section II.C) and $\eta_{k,m_h}$ represents the arrival rate of the class $C_k$ packets at agent $m_h$. To evaluate the expected delay $E[Delay_{k,m_h}]$, agent $m_h$ needs to first compute the expected queuing delay $E[W_{k,m_h}]$ for which the packets in class $C_k$ will be queued at $m_h$. The state includes sufficient statistics for computing the expected queuing delay $E[W_{k,m_h}]$ when an action $A_{k,m_h}$ is taken. Note that the first two moments of the service time $X_{k,m_h}$ can be obtained as:

$$E[X_{k,m_h}] = \frac{L_k}{T_{k,m_h,m_{h+1}} \left(1 - p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}})\right)}, \qquad E[X^2_{k,m_h}] = \frac{L_k^2 \left(1 + p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}})\right)}{T^2_{k,m_h,m_{h+1}} \left(1 - p_{k,m_h,m_{h+1}}(x_{m_h,m_{h+1}})\right)^2} \qquad (5)$$

Together with the arrival rate $\eta_{k,m_h}$, the expected queuing delay $E[W_{k,m_h}]$ can be computed using a priority M/G/1 queuing model [2]. We assume that each agent will feed back its expected delays to all the agents in the previous hop (similar to DSDV protocols [24]). Hence, the agent $m_h$ is able to select the next relay that minimizes the sum of the current queuing delay and the expected delay from the next hop to the destination node of class $C_k$, i.e.

$$E[Delay_{k,m_h}(A_{k,m_h}, s_{m_h})] = \sum_{h'=h}^{H} E[W_{k,m_{h'}}(A_{k,m_{h'}}, s_{m_{h'}})] = E[W_{k,m_h}(A_{k,m_h}, s_{m_h})] + E[Delay_{k,m_{h+1}}(A_{k,m_{h+1}})] \qquad (6)$$

Importantly, the agent $m_h$'s transmission action will impact the information feedback $E[Delay_{k,m_{h+1}}]$, since it will select the next relay $m_{h+1} \in \mathcal{M}_{h+1}$ that feeds back different expected delay values.
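To make the role of the service-time moments in equation (5) concrete, the sketch below computes them and plugs them into a priority M/G/1 delay formula. Note the substitution: the paper cites a preemptive-repeat priority model, whose expressions are not given in this section, so the code uses the simpler textbook preemptive-resume priority M/G/1 result as a stand-in, and it returns the mean time in system rather than the waiting time alone. All function names and numbers are hypothetical.

```python
def service_moments(packet_len, rate, per):
    """First two moments of the service time, as in equation (5).

    A packet of packet_len bits is retransmitted until it succeeds; each
    attempt takes packet_len / rate seconds and fails with probability per.
    """
    mean = packet_len / (rate * (1.0 - per))
    second = packet_len ** 2 * (1.0 + per) / (rate ** 2 * (1.0 - per) ** 2)
    return mean, second


def priority_delays(arrival_rates, moments):
    """Mean time in system per class for a preemptive-resume priority M/G/1
    queue (index 0 is the highest priority). Stand-in model, see lead-in.
    """
    delays = []
    load_above = 0.0  # total load of strictly higher priority classes
    residual = 0.0    # sum of lambda_i * E[X_i^2] / 2 up to this class
    for lam, (m1, m2) in zip(arrival_rates, moments):
        load = load_above + lam * m1
        residual += lam * m2 / 2.0
        if load >= 1.0:
            delays.append(float("inf"))  # queue is unstable for this class
        else:
            delays.append(m1 / (1.0 - load_above)
                          + residual / ((1.0 - load_above) * (1.0 - load)))
        load_above = load
    return delays


# Hypothetical two-class example on one link (1000-bit packets, 1 Mbit/s).
moments = [service_moments(1000.0, 1.0e6, 0.1),
           service_moments(1000.0, 1.0e6, 0.1)]
print(priority_delays([200.0, 300.0], moments))
```

With a single class the formula reduces to the Pollaczek-Khinchine mean delay, which is a quick sanity check on the implementation. An agent following equation (6) would add the delay fed back by each candidate relay to its own local delay and compare the sums.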
Moreover, the expected delay $E[Delay_{k,m_h}]$ will be fed back to the agents in the previous hop and hence impact their transmission actions. Hence, the agent $m_h$'s action $A_{k,m_h}$ will affect its own future state $s_{m_h}$ and will also influence the future expected delay as the network dynamics evolve. As in [3], we denote the probability that the agent $m_h$ has a state $s^{t+1}_{m_h}$ in service interval $t+1$ as $p(s^{t+1}_{m_h})$, which is modeled as a function of agent $m_h$'s current state $s^t_{m_h}$ and current action $A^t_{k,m_h}$, i.e.

$$p(s^{t+1}_{m_h}) = \hat{F}_{s^{t+1}_{m_h}}(s^t_{m_h}, A^t_{k,m_h}) \qquad (7)$$

Note that the real $p(s^{t+1}_{m_h})$ can be very complicated in a real network, since it is impacted by the decisions of all the agents in the previous hop as well as the interference among the agents in the current hop. In our solution, the agents do not need to know the exact form of $p(s^{t+1}_{m_h})$. Online learning approaches will be discussed in Section IV for the agents to learn the state transition function in equation (7). Next, we formulate the cross-layer optimization of the agent as an MDP for each class.

B. Distributed MDP for class $C_k$

For class $C_k$, the MDP at the agent $m_h$ is defined by a tuple $\langle \mathcal{X}_{m_h}, \mathcal{A}_{m_h}, I_{m_h}, \mathcal{T}_{m_h}, \mathcal{U}_{m_h}, \gamma \rangle$:

- States: Recall that the state is defined in Section III.A as $s_{m_h} = \{[\eta_{k,m_h}, \forall C_k], [x_{m_h,m_{h+1}}, \forall (m_h, m_{h+1})]\} \in \mathcal{X}_{m_h}$.

- Actions: Recall that the action is defined as $A_{k,m_h} = \{\beta_{k,m_h,m_{h+1}}, m_{h+1} \in \mathcal{M}_{h+1}\} \in \mathcal{A}_{m_h}$ in Section II.D. To simplify the notation, we will afterward use $A_{m_h}$ instead of $A_{k,m_h}$.

- Information exchange: Let $I_{m_h} = \{F^b_h, F^f_h\}$ (see footnote 5) represent the information exchange of the agents in the $h$-th hop to the previous hop and to the next hop. Denote $F^{b,t}_h(m_h) = E[Delay^t_{k,m_h}]$ as the feedback information from agent $m_h$ to the agents in the previous hop (see equation (6)) and let $F^{b,t}_h = [F^{b,t}_h(m_h), m_h \in \mathcal{M}_h]$ represent the feedback information in the $h$-th hop in the service interval $t$. Denote $F^{f,t}_h(m_h) = \{Delay^{PASS}_{k,m_h}, \eta_{k,m_h}\}$ as the feedforward information from node $m_h$ to the selected relay in the next hop and let $F^{f,t}_h = [F^{f,t}_h(m_h), m_h \in \mathcal{M}_h]$ represent the feedforward information in the $h$-th hop.
Given the feedforward information $F^{f,t}_{h-1}$, the agent $m_h$ computes the average delay $Delay^{PASS}_{k,h}$ of passing through the previous hops as:

$$Delay^{PASS}_{k,h} = \sum_{m_{h-1}=1}^{M_{h-1}} \frac{\eta_{k,m_{h-1}} \, Delay^{PASS}_{k,m_{h-1}}}{R_k} \qquad (8)$$

If $Delay^{PASS}_{k,h}$ exceeds the delay deadline $D_k$, the packet in class $C_k$ should be dropped and no MDP is needed for traffic class $C_k$ at the agent $m_h$.

- State transition probabilities: Let $T_{s_{m_h} s'_{m_h}}(A_{m_h}) \in \mathcal{T}_{m_h} : \mathcal{X}_{m_h} \times \mathcal{X}_{m_h} \times \mathcal{A}_{m_h} \to [0, 1]$ represent the stationary state transition probabilities from state $s_{m_h}$ to state $s'_{m_h}$ when action $A_{m_h}$ is taken. Based on the state transition models in equation (7), we compute the state transition probabilities as $T_{s_{m_h} s'_{m_h}}(A_{m_h}) = \hat{F}_{s'_{m_h}}(s_{m_h}, A_{m_h})$.

- Cost: The expected delay $E[Delay_{k,m_h}(s_{m_h}, A_{m_h})] \in \mathcal{U}_{m_h}$ represents the cost function. As mentioned in Section III.A, we rely on a priority-based queuing model to compute the cost function (see equation (6)). Note that the expected delay of a higher priority class will not be influenced by the other lower priority classes. However, if the class is one of the lower priority classes, the influence of the higher priority classes is taken into account based on the priority-based queuing model [2] (given the actions and states associated with the higher priority classes).

- Discount factor: Recall that $\gamma$ is the same discount factor as in equation (4).

Footnote 5: The superscript $b$ and the superscript $f$ represent backward and forward information, respectively.

Based on the information feedback $F^b_{h+1}$, we modify the Bellman equation [9] of the MDP as:

$$V_{k,m_h}(s_{m_h}, F^b_{h+1}) = \min_{A_{m_h} \in \mathcal{A}_{m_h}} \left\{ \sum_{t=0}^{\infty} \gamma^t E[Delay^t_{k,m_h}(s_{m_h}, A_{m_h})] \right\} = \min_{A_{m_h} \in \mathcal{A}_{m_h}} \left\{ E[W_{k,m_h}(s_{m_h}, A_{m_h})] + F^b_{h+1}(A_{m_h}) + \gamma \sum_{s'_{m_h}} T_{s_{m_h} s'_{m_h}}(A_{m_h}) \, V_{k,m_h}(s'_{m_h}, F^b_{h+1}) \right\} \qquad (9)$$

where $V_{k,m_h}$ is referred to as the MDP delay value, which is a discounted version of the long-term expected delay. To solve this feedback-modified Bellman equation, the agent $m_h$ adopts value iteration [9] by updating the MDP delay value:

$$V^{t+1}_{k,m_h}(s_{m_h}, F^{b,t}_{h+1}) = \min_{A_{m_h} \in \mathcal{A}_{m_h}} Q^t_{k,m_h}(s_{m_h}, A_{m_h}, F^{b,t}_{h+1}), \qquad (10)$$

where

$$Q^t_{k,m_h}(s_{m_h}, A_{m_h}, F^{b,t}_{h+1}) = E[W^t_{k,m_h}(s_{m_h}, A_{m_h})] + F^{b,t}_{h+1}(A_{m_h}) + \gamma \sum_{s'_{m_h}} T_{s_{m_h} s'_{m_h}}(A_{m_h}) \, V^t_{k,m_h}(s'_{m_h}, F^{b,t}_{h+1})$$

is the Q-value at the agent $m_h$ when a cross-layer transmission action $A_{m_h}$ is taken in state $s_{m_h}$. The stationary policy can be written as:

$$\mu^t_{k,m_h}(\mathcal{L}^t_{m_h}) = \arg\min_{A_{m_h} \in \mathcal{A}_{m_h}} Q^t_{k,m_h}(s_{m_h}, A_{m_h}, F^{b,t}_{h+1}).$$

The feedback-modified Bellman equation in equation (9) can be solved using value iteration, if the agent $m_h$ has complete knowledge about $E[W_{k,m_h}(s_{m_h}, A_{m_h})]$ and $T_{s_{m_h} s'_{m_h}}(A_{m_h})$. Table I presents the detailed implementation of the distributed MDP and Figure 3 shows the considered system diagram of the distributed MDP that allows the agents to exchange information with the nodes in the neighboring hops.

Fig. 3. Proposed decentralized MDP framework and the necessary information exchange among the agents.

IV. ONLINE MODEL-BASED LEARNING FOR SOLVING THE DISTRIBUTED MDP

In order to solve the Bellman equations, the agents need to know the state transition probabilities $T_{s_{m_h} s'_{m_h}}(A_{m_h})$ in the updating equation (10). However, the state transition probabilities may not be known to the agents a priori.
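When the queuing delays and transition probabilities are known, the value iteration update of Section III.B can be sketched as below. This is a minimal single-agent, single-class illustration under stated assumptions: states and actions are small finite sets, each action corresponds to one next relay, and the feedback from the next hop is held fixed during the iteration. The function name, the dictionary layout and the toy numbers are hypothetical.

```python
def value_iteration(states, actions, queuing_delay, feedback, transition,
                    gamma, tol=1e-9, max_iter=10000):
    """Iterate V(s) = min_a [W(s,a) + F(a) + gamma * sum_s' T(s,a,s') V(s')].

    queuing_delay[s][a]: expected local queuing delay W(s, a).
    feedback[a]:         expected delay fed back by the relay chosen by a.
    transition[s][a]:    dict mapping next state s' to its probability.
    Returns (V, policy), with policy[s] the delay-minimizing action.
    """
    V = {s: 0.0 for s in states}
    Q = {}
    for _ in range(max_iter):
        Q = {s: {a: queuing_delay[s][a] + feedback[a]
                 + gamma * sum(p * V[s2]
                               for s2, p in transition[s][a].items())
                 for a in actions}
             for s in states}
        V_new = {s: min(Q[s].values()) for s in states}
        converged = max(abs(V_new[s] - V[s]) for s in states) < tol
        V = V_new
        if converged:
            break
    policy = {s: min(Q[s], key=Q[s].get) for s in states}
    return V, policy


# Toy example: one state, two candidate relays, self-loop transitions.
states, actions = ["s"], ["relay_a", "relay_b"]
W = {"s": {"relay_a": 1.0, "relay_b": 2.0}}
F = {"relay_a": 5.0, "relay_b": 1.0}
T = {"s": {"relay_a": {"s": 1.0}, "relay_b": {"s": 1.0}}}
V, policy = value_iteration(states, actions, W, F, T, gamma=0.5)
print(V, policy)
```

In the toy example relay_b has the larger local queuing delay but the smaller fed-back delay, so it wins; its fixed point is V = 3 / (1 - 0.5) = 6.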
In this section, we discuss online learning approaches for solving the distributed MDP introduced in the previous section at runtime. We propose a novel model-based reinforcement learning approach that is suitable for the agents to transmit mission-critical applications over the MCN. The proposed model-based reinforcement learning approach adopts the priority queuing model $E[W_{k,m_h}(s_{m_h}, A_{m_h})]$ for the cost and directly estimates the state transition probabilities $T_{s_{m_h} s'_{m_h}}(A_{m_h})$ to solve the distributed MDP. In Section IV.B, we show that the proposed model-based learning methods converge faster than the model-free learning approaches, since it takes less time for the autonomic node to explore different states and correctly evaluate the Q-values.

A. Conventional model-free reinforcement learning

The model-free learning methods, e.g. Q-learning [6][7], can be applied at an agent $m_h$ to learn the next Q-values $[Q^{t+1}_{k,m_h}(s_{m_h}, A_{m_h}), \forall s_{m_h} \in \mathcal{X}_{m_h}]$ without characterizing the state transition probabilities $T_{s_{m_h} s'_{m_h}}(A_{m_h})$. Taking Q-learning as an example, given the feedback value $F^{b,t}_{h+1}$, the autonomic node $m_h$ updates the Q-value using the following updating equation:

$$Q^{t+1}_{k,m_h}(s^t_{m_h}, A^t_{m_h}) = (1 - \rho^t) \, Q^t_{k,m_h}(s^t_{m_h}, A^t_{m_h}) + \rho^t \left\{ Cost^t_{k,m_h} + F^{b,t}_{h+1}(A^t_{m_h}) + \gamma \min_{A_{m_h} \in \mathcal{A}_{m_h}} Q^t_{k,m_h}(s^{t+1}_{m_h}, A_{m_h}) \right\} \qquad (11)$$

where $0 < \rho^t < 1$ represents the learning rate, and $\sum_t \rho^t = \infty$ and $\sum_t (\rho^t)^2 < \infty$ are ensured for the convergence of the Q-value [6]. $Cost^t_{k,m_h}$ represents the delay measurement (e.g. by measuring the queue size) of sending packets in class $C_k$ and $s^{t+1}_{m_h}$ represents the next state after the agent $m_h$ takes the cross-layer transmission action $A^t_{m_h}$. For exploration purposes, instead of following the optimal stationary policy $\mu^t_{k,m_h}(s_{m_h}) = \arg\min_{A_{m_h} \in \mathcal{A}_{m_h}} Q^t_{k,m_h}(s_{m_h}, A_{m_h})$, the next action is selected according to a soft-min policy. Assume $\pi^t_{k,m_h}(s_{m_h}, A_{m_h})$ denotes the probability for agent $m_h$ to take the action $A_{m_h}$ given the state $s_{m_h}$.
The soft-min policy $\mu^{t}_{k,m}(s_m) = [\pi^{t}_{k,m}(s_m, A_m),\ \forall A_m \in \mathcal{A}_m]$ is defined using the Boltzmann distribution [14][15][16]:

$$\pi^{t}_{k,m}(s_m, A_m) = \frac{\exp\big(-Q^{t}_{k,m}(s_m, A_m)/\tau\big)}{\sum_{A'_m \in \mathcal{A}_m} \exp\big(-Q^{t}_{k,m}(s_m, A'_m)/\tau\big)} \tag{12}$$

where $\tau$ is the temperature parameter. A small $\tau$ provides a greater probability difference in selecting different actions. If $\tau \to 0$, the approach reduces back to $\mu^{t}_{k,m}(s_m) = \arg\min_{A_m \in \mathcal{A}_m} Q^{t}_{k,m}(s_m, A_m)$. On the other hand, a larger $\tau$ allows the agents to explore various actions with higher probabilities (see footnote 6). We provide the detailed steps of the model-free reinforcement learning in Algorithm 1 in Table VI.

Table II summarizes the required local information, memory complexity, and computational complexity of the model-free reinforcement learning approaches. In each service interval, the model-free reinforcement learning approaches need to update the Q-values for all $s_m \in \mathcal{X}_m$ and all classes $C_k$, and for each state, the minimum of $Q^{t}_{k,m}(s^{t+1}_m, A_m)$ over $A_m \in \mathcal{A}_m$ is calculated. Hence, the computational complexity is $O(|\mathcal{X}_m|\,|\mathcal{A}_m|\,K)$.

Footnote 6: $\tau$ provides a tradeoff between exploring different actions and exploiting the Q-values of taking an action. Such a tradeoff is important in the MCN, since it significantly impacts the convergence rate and the performance of the learning approach.
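The Q-learning update of equation (11) and the soft-min action selection of equation (12) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' Algorithm 1: the function names `softmin_probs`, `softmin_action`, and `q_learning_update`, and the nested-dictionary Q-table, are hypothetical choices for the example.

```python
import math
import random


def softmin_probs(q_row, tau):
    """Boltzmann (soft-min) probabilities for one state.

    q_row maps each action to its Q-value (an expected delay, lower is better).
    Subtracting the minimum before exponentiating only improves numerical
    stability; it cancels in the ratio and leaves the probabilities unchanged.
    """
    q_min = min(q_row.values())
    weights = {a: math.exp(-(q - q_min) / tau) for a, q in q_row.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def softmin_action(q_row, tau, rng=random):
    """Draw one action according to the soft-min probabilities."""
    probs = softmin_probs(q_row, tau)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


def q_learning_update(Q, s, a, cost, feedback, s_next, rho, gamma):
    """One model-free update of Q[s][a] after observing the measured delay
    `cost`, the next-hop feedback `feedback`, and the next state `s_next`."""
    target = cost + feedback + gamma * min(Q[s_next].values())
    Q[s][a] = (1.0 - rho) * Q[s][a] + rho * target
    return Q[s][a]
```

A learning rate such as $\rho^{t} = 1/t$ satisfies the two summation conditions stated with equation (11). Only the visited state-action pair is updated per service interval, which is the reason the text gives for the slower convergence of model-free learning.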

Note that the dynamics in the MCN may change before the updated policy converges when using a model-free learning approach. Hence, we consider an alternative model-based reinforcement learning approach in the next subsection, which is more suitable for the agents in the MCN due to its faster convergence rate.

TABLE I. IMPLEMENTATION OF THE DISTRIBUTED MDP

Step 1. Gather local information. From the information feedforward $F^{f,t}$ from the previous hop, the agent $m$ computes $\mathrm{Delay}^{PASS}_{k,h}$ and determines whether the distributed MDP should be performed for traffic class $C_k$. Then, it gathers the local information $L^{t}_m = \{ s^{t}_m, F^{b,t}_{h+1} \}$.

Step 2. Evaluate queuing delay and state transition probabilities. Based on state $s_m$ and action $A_m$, the agent $m$ evaluates $E[W^{t}_{k,m}]$. The state transition probabilities $T_{s_m s'_m}(A_m)$ are modeled using equation (7).

Step 3. Update the transmission policy. The agent $m$ updates the MDP delay value $V^{t+1}_{k,m}(s_m, F^{b,t}_{h+1})$ using equation (10). The stationary policy of the agent $m$ is $\mu^{t}_{k,m}(L^{t}_m) = \arg\min_{A_m \in \mathcal{A}_m} Q^{t}_{k,m}(s_m, A_m, F^{b,t}_{h+1})$.

Step 4. Update the information exchange. After the policy $\mu^{t}_{k,m}(L^{t}_m)$ is determined, the next relay $m_{h+1}$ is selected and $m$ can then update the feedback information $F^{b,t+1}_{h}$ from $\beta_{k,m}$, the received feedback $F^{b,t}_{h+1}$, and the expected queuing delay $E[W^{t}_{k,m}(s_m, \mu^{t}_{k,m}(L^{t}_m))]$. The wireless node $m$ also needs to update its feedforward information $F^{f,t+1}_{h+1}(m) = \mathrm{Delay}^{PASS}_{k,h} + E[W^{t}_{k,m}(s_m, \mu^{t}_{k,m}(L^{t}_m))]$.

B. Proposed model-based reinforcement learning

In this section, we propose our model-based learning approach, which enables the agent $m$ to directly model the expected queuing delay $E[W_{k,m}(s_m, A_m)]$ and estimate the state transition probabilities $\hat{T}_{s_m s'_m}(A_m)$ in order to solve the Bellman equation through value iteration [19]. Figure 4 provides a system block diagram of the proposed online learning approach at the agent $m$.
Our approach is similar to the Adaptive-RTDP in [14], where the state transition probabilities are determined using maximum-likelihood estimation. Specifically, let $\hat{T}^{t}_{s_m s'_m}(A_m)$ denote the estimated state transition probability, which is updated at each service interval. The Q-value is also updated as:

$$Q^{t+1}_{k,m}(s_m, A_m) = (1-\rho^{t})\, Q^{t}_{k,m}(s_m, A_m) + \rho^{t} \Big\{ E[W^{t}_{k,m}(s_m, A_m)] + F^{b,t}_{h+1}(A_m) + \gamma \sum_{s'_m} \hat{T}^{t}_{s_m s'_m}(A_m) \min_{A'_m \in \mathcal{A}_m} Q^{t}_{k,m}(s'_m, A'_m) \Big\} \tag{13}$$

where $s'_m$ represents the next state to which agent $m$ transits after it takes the cross-layer transmission action $A_m$. We provide the detailed steps of the proposed model-based reinforcement learning in Algorithm 2 in Table VII. The main differences between the model-based online learning approach and the model-free learning approaches are the following:

1) We model the expected queuing delay $E[W_{k,m}(s_m, A_m)]$, with an action realized from the policy $\mu^{t}_{k,m}$, using the preemptive-repeat priority M/G/1 queuing model as in [21]:

$$E[W_{k,m}(s_m, A_m)] = \begin{cases} \dfrac{\sum_{i=1}^{k} \eta_{i,m} E[X^{2}_{i,m}]}{2 \Big(1 - \sum_{i=1}^{k-1} \eta_{i,m} E[X_{i,m}]\Big) \Big(1 - \sum_{i=1}^{k} \eta_{i,m} E[X_{i,m}]\Big)}, & \text{if } E[W_{k,m}] \le D^{rem}_{k,m} \\ \infty, & \text{otherwise} \end{cases} \tag{14}$$

From equation (14), we know that if the queuing time exceeds the remaining delay deadline $D^{rem}_{k,m} = D_k - \mathrm{Delay}^{PASS}_{k,h}$, the expected queuing time $E[W_{k,m}]$ becomes infinite, since the packets are useless (no utility gain) and will be dropped at the agent $m$. Unlike Q-learning, which can only update the Q-value of one state-action pair in each service interval, our model-based learning approach uses the priority queuing model to provide an accurate estimate for any state-action pair. Hence, the priority queuing model enables faster learning, which is very important for satisfying the stringent delay constraints of mission-critical applications.

2) We apply the maximum-likelihood state-transition probabilities [14] in Algorithm 2 to update the state transition probabilities $\hat{T}^{t}_{s_m s'_m}(A_m)$, instead of using the Q-value of the next state $s^{t+1}_m$ at each service interval.
In Algorithm 2, $n^{t}_{s_m s'_m}(A_m)$ represents the observed number of times before service interval $t$ that the action $A_m$ was taken when the state was $s_m$ and the state made a transition to $s'_m$, and $n^{t}_{s_m}(A_m) = \sum_{s'_m \in \mathcal{X}_m} n^{t}_{s_m s'_m}(A_m)$ represents the observed number of times before service interval $t$ that the action $A_m$ was taken when the state was $s_m$.

3) Unlike regular value iteration and Q-learning, instead of updating the value $Q^{t+1}_{k,m}(s_m, A_m)$ for all $s_m \in \mathcal{X}_m$, we only update the value for states in a particular set $\mathcal{B}_m$. The rest of the states $s_m \notin \mathcal{B}_m$ have insufficient SINR values to keep the transmission time within the remaining delay deadline $D^{rem}_{k,m}$. In other words, the condition $L_k / T^{goodput}_{k,m,m_{h+1}} \le D^{rem}_{k,m}$ must hold to support the transmission of traffic class $C_k$ at agent $m$. Hence, the set is defined as

$$\mathcal{B}_m = \Big\{ s_m : \frac{L_k}{T^{goodput}_{k,m,m_{h+1}}(x_{m,m_{h+1}})} \le D^{rem}_{k,m} \Big\} \tag{15}$$

which amounts to a lower threshold on the SINR $x_{m,m_{h+1}}$ that depends on $T_{k,m,m_{h+1}}$, $D^{rem}_{k,m}$, $L_k$, and the physical layer parameters $\delta$ and $\xi$ of the agent $m$ (see equation (1)). We only update the Q-values of the states $s_m \in \mathcal{B}_m$ in Algorithm 2.

Fig. 4. System diagram of the proposed model-based online learning approach at the agent $m$.

TABLE II. COMPLEXITY SUMMARY OF THE MODEL-FREE REINFORCEMENT LEARNING
Required local information: $L^{t}_m = \{ s^{t}_m, \{ \mathrm{Cost}^{t}_{k,m}, C_k \}, F^{f,t}, F^{b,t}_{h+1} \}$
Memory complexity, transmission policy: $|\mathcal{X}_m| \times |\mathcal{A}_m| \times K$
Memory complexity, state transition: not required
Memory complexity, Q-value: $|\mathcal{X}_m| \times |\mathcal{A}_m| \times K$
Computational complexity: $O(|\mathcal{X}_m|\,|\mathcal{A}_m|\,K)$

Table III summarizes the required local information, memory complexity, and computational complexity of the proposed model-based reinforcement learning approach. The proposed approach has higher computational complexity than the model-free reinforcement learning approaches. However, computational complexity is a minor concern in the MCN compared with satisfying the delay constraints of the mission-critical applications. For the proposed model-based reinforcement learning approach, the Q-values for all $s_m \in \mathcal{B}_m$ and all classes $C_k$ need to be updated in each service interval, and for each state and each $A_m \in \mathcal{A}_m$, the last term $\sum_{s'_m} \hat{T}^{t}_{s_m s'_m}(A_m) \min_{A'_m \in \mathcal{A}_m} Q^{t}_{k,m}(s'_m, A'_m)$ in equation (13) is calculated. Although the computational complexity is larger, the convergence rate of the proposed model-based reinforcement learning approach is much faster than that of the model-free reinforcement learning approaches.
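The three ingredients of the model-based approach described in items 1) to 3) can be sketched as follows. This is a hedged illustration rather than the authors' Algorithm 2: the function names, the count dictionary keyed by `(state, action, next_state)`, and the uniform fallback used before any transition has been observed are assumptions introduced for the example. The transition estimate is the usual maximum-likelihood ratio of counts, $n^{t}_{s_m s'_m}(A_m) / n^{t}_{s_m}(A_m)$.

```python
import math


def priority_queue_delay(k, eta, ex, ex2, d_rem):
    """Expected queuing delay of priority class k (1 = highest priority).

    eta[i], ex[i], ex2[i] hold the arrival rate and the first and second
    moments of the service time of class i+1. The delay is infinite when the
    queue is unstable or when it exceeds the remaining deadline d_rem.
    """
    numerator = sum(eta[i] * ex2[i] for i in range(k))
    load_higher = sum(eta[i] * ex[i] for i in range(k - 1))
    load_up_to_k = sum(eta[i] * ex[i] for i in range(k))
    if load_up_to_k >= 1.0:
        return math.inf
    w = numerator / (2.0 * (1.0 - load_higher) * (1.0 - load_up_to_k))
    return w if w <= d_rem else math.inf


def ml_transition(counts, s, a, states):
    """Maximum-likelihood transition estimate from observed counts."""
    n_sa = sum(counts.get((s, a, s2), 0) for s2 in states)
    if n_sa == 0:
        # Assumption for the sketch: uniform estimate before any observation
        return {s2: 1.0 / len(states) for s2 in states}
    return {s2: counts.get((s, a, s2), 0) / n_sa for s2 in states}


def model_based_update(Q, s, a, exp_delay, feedback, t_hat, rho, gamma):
    """Model-based Q update: the future term averages the best next Q-value
    over the estimated transition probabilities instead of sampling one
    observed next state."""
    future = sum(p * min(Q[s2].values()) for s2, p in t_hat.items())
    Q[s][a] = (1.0 - rho) * Q[s][a] + rho * (exp_delay + feedback + gamma * future)
    return Q[s][a]
```

Because `priority_queue_delay` can be evaluated for any state-action pair without visiting it, every pair in the feasible set can be refreshed in each service interval, which is the mechanism the text credits for the faster convergence.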
In Section V.B, we compare the convergence speeds of different learning methods through extensive simulation results. Hence, the MCN nodes can choose to implement this higher-complexity learning to improve their performance. In Section V.C, we investigate the case where nodes deploy heterogeneous learning methods and determine the resulting performance.

C. Upper and lower bounds of the model-based learning approach

Since the maximum-likelihood state-transition probabilities $\hat{T}^{t}_{s_m s'_m}(A_m)$ are used in the proposed model-based learning approach, there is no guarantee that the resulting MDP delay value converges to the optimal value $V^{*}_{k,m}(s_m, F^{b}_{h+1})$ in equation (9). In this subsection, we investigate the accuracy of the proposed model-based learning in terms of the resulting MDP delay value. Let $\overline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$ and $\underline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$ denote the upper and the lower bounds of the value, respectively, obtained using $\hat{T}^{t}_{s_m s'_m}(A_m)$ in the proposed model-based learning approach in service interval $t$. We define $\varepsilon$ as the $(1-\delta)$-confidence interval of the real MDP delay value (obtained using the unknown $T_{s_m s'_m}(A_m)$ of Section III) in service interval $t$, i.e. $\mathrm{Prob}\big(|\overline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1}) - V^{t}_{k,m}(s_m, F^{b,t}_{h+1})| \le \varepsilon\big) \ge 1-\delta$, with $0 < \delta < 1$.

Proposition: There exists a $(1-\delta)$-confidence interval $\varepsilon$ such that an agent $m$ can update the upper bound of the value $\overline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$ using

$$\overline{V}^{t+1}_{k,m}(s_m, F^{b,t}_{h+1}) = \min_{A_m \in \mathcal{A}_m} \Big\{ E[W^{t}_{k,m}(s_m, A_m)] + F^{b,t}_{h+1}(A_m) + \gamma \sum_{s'_m} \hat{T}^{t}_{s_m s'_m}(A_m)\, \overline{V}^{t}_{k,m}(s'_m, F^{b,t}_{h+1}) + \varepsilon \Big\} \tag{16}$$

and update the lower bound $\underline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$ using

$$\underline{V}^{t+1}_{k,m}(s_m, F^{b,t}_{h+1}) = \min_{A_m \in \mathcal{A}_m} \Big\{ E[W^{t}_{k,m}(s_m, A_m)] + F^{b,t}_{h+1}(A_m) + \gamma \sum_{s'_m} \hat{T}^{t}_{s_m s'_m}(A_m)\, \underline{V}^{t}_{k,m}(s'_m, F^{b,t}_{h+1}) - \varepsilon \Big\} \tag{17}$$

and the following two conditions are satisfied:

1) $n^{t}_{s_m}(A_m) = \frac{1}{2} \ln\Big(\frac{|\mathcal{A}_m|\,|\mathcal{B}_m|}{\delta}\Big) \Big(\frac{V_{max}}{\varepsilon}\Big)^{2}$, $\forall A_m \in \mathcal{A}_m$, where $V_{max} = \max_k \frac{D^{rem}_{k,m}}{1-\gamma}$ represents the largest MDP delay value.

2) $\underline{V}_{k,m}(s_m, F^{b}_{h+1}) \le V^{*}_{k,m}(s_m, F^{b}_{h+1}) \le \overline{V}_{k,m}(s_m, F^{b}_{h+1})$ with probability at least $1-2\delta$.

Proof: See Appendix.

This proposition shows that the estimated values $V^{t+1}_{k,m}(s_m, F^{b,t}_{h+1})$ become more accurate as $n^{t}_{s_m}(A_m)$ becomes larger than $\frac{1}{2} \ln\big(\frac{|\mathcal{A}_m|\,|\mathcal{B}_m|}{\delta}\big) \big(\frac{V_{max}}{\varepsilon}\big)^{2}$. Moreover, the closer the agent $m$ is to the destination node, the shorter the remaining path becomes, which provides a smaller $V_{max}$ and leads to a smaller requirement on $n^{t}_{s_m}(A_m)$. Hence, when the same proposed model-based learning approach is used to accumulate $n^{t}_{s_m}(A_m)$, it provides a more accurate MDP delay value for an agent that is closer to its destination node, which is also verified by the simulation results in Section V.D.

TABLE III. COMPLEXITY SUMMARY OF THE MODEL-BASED REINFORCEMENT LEARNING
Required local information: $L^{t}_m = \{ s^{t}_m, F^{f,t}, F^{b,t}_{h+1} \}$
Memory complexity, transmission policy: $|\mathcal{B}_m| \times |\mathcal{A}_m| \times K$
Memory complexity, state transition: $|\mathcal{B}_m|^{2} \times |\mathcal{A}_m|$
Memory complexity, Q-value: $|\mathcal{B}_m| \times |\mathcal{A}_m| \times K$
Computational complexity: $O(K\,|\mathcal{B}_m|^{2}\,|\mathcal{A}_m|)$

V. SIMULATION RESULTS

In this section, we simulate the performance of the proposed model-based reinforcement learning for solving the distributed MDP for the mission-critical applications.

A. Simulation results for different network topologies

We first simulate a 6-hop MCN with the topology shown in Figure 5(a), with two ASs and 8 ARs. Such an MCN is commonly adopted in various areas, such as battlefield sensing, security monitoring, and healthcare applications, where prioritized data packets need to be relayed to remote destinations in a timely manner.
Two groups of mission-critical applications are sent in different priority classes ($K = 8$). The characteristic parameters of these mission-critical applications are given in Table IV. Various mission-critical applications can be supported, e.g. video streams from surveillance cameras [2], delay-sensitive monitoring reports such as forest fire detection, or patient monitoring []. Group 1 mission-critical applications are sent through the AS $m_1$ to the destination node D1, and group 2 mission-critical applications are sent from the other AS $m_2$ to its destination node D2. The agents are assumed to be able to select a set of modulation and coding schemes that support a common transmission rate $T$ (in Mbps) for all the transmission links in the network [2]. Each receiver of the transmission links receives a random SINR $x$ that results in a packet error rate ranging from 5% to 3%. We assume that the nodes exchange hello messages (as in DSDV [24]) carrying the required information once every service interval (service intervals are measured in ms).

Figure 5(b) shows the MDP delay values from the ASs to the destination nodes for the first 2 service intervals. Only the results of the first five priority classes are shown. The higher priority traffic has a smaller MDP delay value $V^{t}_{k,m}$. The results of centralized optimization are computed analytically by assuming that the global network information is known by a central controller, which is unrealistic in practice. The proposed model-based reinforcement learning, on the other hand, determines the cross-layer transmission policy at each agent based on local information. We set $\gamma = 0.75$, which is appropriate for a highly time-varying MCN (after 10 service intervals, the future accounts for only about 5% of the cost). Note that our model-based learning provides MDP delay values close to the centralized optimization results, especially for the priority classes $C_1, C_2, C_3$ that satisfy the condition $E[W_{k,m}] \le D^{rem}_{k,m}$.
These three priority classes converge to a steady state after $t = 4$, since their end-to-end delays are within the delay deadline of the applications (the required performance level is set as $\sum_{t=0}^{\infty} \gamma^{t} D_k = \frac{D_k}{1-\gamma} = 4$ when the delay deadline of each future service interval is considered) and no packets are dropped. The results also show that the higher priority traffic converges faster than the lower priority traffic. This is because the queuing delay of the lower priority class traffic is impacted by the higher priority class traffic.

Next, we simulate a skewed network topology that has two clusters of nodes, shown in Figure 6(a). Such a network topology with clusters of nodes can be common in the MCN due to landscape requirements. The network connections between the two clusters usually form a bottleneck for transmitting the mission-critical applications. Figure 6(b) shows that the MDP delay values $V^{t}_{k,m}$ of all the priority classes increase. We observe that only the convergence rates of the higher priority classes decrease in the skewed network due to the impact of the bottleneck.
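The discount-factor figures quoted in this subsection follow from elementary geometric-series arithmetic. The worked check below is an illustration added for the reader, and the horizon of 10 service intervals is an assumption, chosen because it reproduces the quoted figure of about 5%:

```latex
\gamma^{10} = 0.75^{10} \approx 0.056
\qquad\text{and}\qquad
\sum_{t=0}^{\infty} \gamma^{t} D_k = \frac{D_k}{1-\gamma} = \frac{D_k}{0.25} = 4\,D_k .
```

Under this reading, a required performance level of 4 corresponds to a per-interval deadline $D_k$ of one delay unit, and a cost incurred 10 service intervals ahead is weighted at roughly one twentieth of an immediate cost.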

TABLE IV. THE CHARACTERISTIC PARAMETERS OF THE MISSION-CRITICAL APPLICATIONS
Group 1 mission-critical applications $V_1$: classes $C_1$, $C_4$, $C_6$, $C_8$
Group 2 mission-critical applications $V_2$: classes $C_2$, $C_3$, $C_5$, $C_7$
Parameters listed per class $C_k$: $R_k$ (Kbps), $D_k$ (sec), $L_k$ (bytes)

Fig. 5. (a) 6-hop network topology (axes: x-axis (m), y-axis (m)). (b) MDP delay values of the first five priority classes versus service interval $t$, for centralized optimization and model-based learning, with the required performance level marked.

Fig. 6. (a) 2-cluster skewed network topology (axes: x-axis (m), y-axis (m)). (b) MDP delay values of the first five priority classes versus service interval $t$, for centralized optimization and model-based learning, with the required performance level marked.

B. Comparison among the reinforcement learning approaches

In this subsection, we compare the proposed model-based reinforcement learning approach with Q-learning in [16] (a model-free reinforcement learning approach) and the myopic self-learning approach in [2] ($\gamma = 0$). We adopt the same network conditions as in the previous simulations and the network topology shown in Figure 5(a). In Figure 7, the simulation results show that the proposed model-based reinforcement learning approach outperforms the other two learning approaches in terms of the MDP delay values for all the priority classes. Although Q-learning has the lowest computational complexity, it has the worst performance in terms of both the MDP delay value $V^{t}_{k,m}$ and the convergence rate. The delay of the $C_1$ traffic converges after $t = 2$ for the proposed model-based learning approach and only after $t = 4$ for the Q-learning approach. Convergence is not guaranteed for the lower priority class traffic, especially for the myopic self-learning solution. Moreover, although the myopic approach has the fastest convergence rate, it results in worse performance than the proposed model-based reinforcement learning approach.
Fig. 7. Comparisons of the discounted end-to-end delay using different learning approaches (model-based learning, self-learning, Q-learning) that solve the distributed MDP, versus service interval $t$, with the required performance level marked.

TABLE V. THE RESULTS OF HETEROGENEOUS LEARNING SCENARIOS
Columns: learning method (within 2 hops from ASs); learning method (outside 2 hops from ASs); expected delay of the first class traffic (sec); expected delay of the second class traffic (sec)
Scenario 1: Model-based; Model-based
Scenario 2: Model-based; Both (random)
Scenario 3: Model-based; Model-free
Scenario 4: Model-free; Model-based
Scenario 5: Model-free; Both (random)
Scenario 6: Model-free; Model-free

In addition to the MDP delay values $V^{t}_{k,m}$, we directly compare the expected end-to-end delays $E[\mathrm{Delay}^{t}_{k}]$ of the mission-critical applications from the ASs to the destination nodes. The acceptance level for $E[\mathrm{Delay}^{t}_{k}]$ is the delay deadline $D_k$. In Figure 8, the simulation results show that by using the proposed model-based learning approach, the MCN is able to support up to three mission-critical classes, since the end-to-end delay must be within the delay deadline of the applications ($E[\mathrm{Delay}^{t}_{k}] \le D_k$), while by using the other two learning approaches, the network can only support two mission-critical classes.

Next, we simulate the expected delay of different classes in a source variation scenario, where the AS $m_1$ disappears right after service interval $t = 6$. Figure 9 shows the changes of the expected delays over time for different classes using various learning approaches. Since the AS $m_1$ is the source node of packets in classes $\{C_1, C_4, C_6, C_8\}$, the expected delays $E[\mathrm{Delay}_1]$ and $E[\mathrm{Delay}_4]$ in Figure 9 vanish after $t = 6$. We can observe that if Q-learning is applied, before $t = 6$ only class $C_1$ from $m_1$ can be delivered in time ($E[\mathrm{Delay}_1] \le D_1$). However, after $t = 6$, the class $C_2$ from $m_2$ can be supported by the MCN due to the alleviation of the traffic loading.
By applying the proposed model-based learning approach, before $t = 6$ both classes $C_1, C_2$ can be delivered in time, and after $t = 6$ not only the class $C_2$ but also the class $C_3$ from $m_2$ can be supported by the MCN. This shows that the proposed model-based learning approach enables the MCN to support more mission-critical applications.

Fig. 8. Comparisons of the expected end-to-end delay using different learning approaches (model-based learning, self-learning, Q-learning) that solve the distributed MDP, versus service interval $t$, with the required performance level marked.

Fig. 9. Source node of packets in classes $C_1$, $C_4$ disappears after $t = 6$.

C. Heterogeneous learning

In the previous simulations, we assume that all the network nodes adopt the same learning approach to solve the distributed MDP. In reality, however, the agents can adopt different learning approaches. We simulated different scenarios in which the agents have heterogeneous learning capabilities, using the same network conditions as in the previous simulation and the same network topology shown in Figure 5(a). In Table V, we assume that the agents in the same hop use the same learning method. Model-based learning refers to the proposed model-based reinforcement learning approach and model-free learning refers to the Q-learning in [16]. The simulation results show that adopting a model-based learning approach near the ASs is very important: the delays are smaller regardless of the type of learning approach adopted by the rest of the nodes. This is because the model-based learning approach provides a more accurate estimate of the expected delay feedback than the model-free learning approach. Also, the model-based learning approach converges faster than the model-free learning approach. Hence, the more remaining nodes adopt the model-based learning approach, the higher the improvement in the delay performance. Moreover, the delays of the second priority class traffic vary more than those of the first priority class. This shows that the learning methods adopted by the agents can significantly impact the performance of mission-critical applications, especially the ones with lower priorities. In other words, the deployed learning approaches impact the number of mission-critical applications supported by the MCN.

Fig. 10. The upper and the lower bounds of the discounted end-to-end delays (V values) for the first priority class traffic at different hops, versus service interval $t$.

D. Determining the upper and the lower bound

In this subsection, we provide simulation results to show the upper bound and the lower bound of the model-based reinforcement learning. We adopt the same network conditions and the 2-cluster network topology shown in Figure 6(a). Figure 10 shows the MDP delay values of the first priority class traffic at different hops. Since the real delay is proven to be bounded between the upper and the lower bounds, the result shows that the model-based reinforcement learning provides end-to-end delays that become more and more accurate over time, as well as when the agents are closer to the destination nodes.

VI. CONCLUSION

In this paper, we investigated how the agents in the MCN should optimally select their cross-layer transmission actions in order to minimize the end-to-end delays of mission-critical applications. To consider both the spatial and the temporal dependency in the MCN, we formulate the network delay minimization problem using a distributed MDP. To solve the distributed MDP in practice, we propose an online model-based reinforcement learning approach.
Unlike the conventional model-free reinforcement learning approaches, the proposed model-based reinforcement learning approach has a faster convergence rate, since it takes advantage of the priority queuing model and requires less time for the autonomic node to explore different states to evaluate the Q-values. Our simulation results verify the suitability of the proposed model-based learning approach for supporting mission-critical applications by the agents in the MCN.

APPENDIX
PROOF OF THE PROPOSITION

We apply the Hoeffding inequality [22] to obtain the confidence interval $\varepsilon$. The inequality states that, given random variables $\{X_1, \ldots, X_m\}$ in the range $[0, X_{max}]$, the following holds:

$$\mathrm{Prob}\Big( \frac{1}{m} \sum_{i=1}^{m} X_i - \frac{1}{m} \sum_{i=1}^{m} E[X_i] \ge \varepsilon \Big) \le \exp\Big( -2m \Big( \frac{\varepsilon}{X_{max}} \Big)^{2} \Big) \tag{18}$$

From the first condition, we have $\varepsilon = V_{max} \sqrt{ \dfrac{ \ln\big( |\mathcal{A}_m|\,|\mathcal{B}_m| / \delta \big) }{ 2\, n^{t}_{s_m}(A_m) } }$. Denote $E[\overline{V}(s_m, A_m)] = \sum_{s'_m} \hat{T}^{t}_{s_m s'_m}(A_m)\, \overline{V}^{t}_{k,m}(s'_m, F^{b,t}_{h+1})$ as the average MDP delay upper bound based on the estimated $\hat{T}^{t}_{s_m s'_m}(A_m)$ whenever state $s_m$ is visited and action $A_m$ is taken, and denote $E[V(s_m, A_m)] = \sum_{s'_m} T_{s_m s'_m}(A_m)\, V^{t}_{k,m}(s'_m, F^{b,t}_{h+1})$ as the average expected MDP delay value based on the real $T_{s_m s'_m}(A_m)$. Similar to the proof of Lemma 3.2 in [23], equation (18) can be rewritten as:

$$\mathrm{Prob}\big( E[\overline{V}(s_m, A_m)] - E[V(s_m, A_m)] \ge \varepsilon \big) \le \exp\Big( -2\, n^{t}_{s_m}(A_m) \Big( \frac{\varepsilon}{V_{max}} \Big)^{2} \Big) = \exp\Big( -2\, n^{t}_{s_m}(A_m)\, \frac{ \ln\big( |\mathcal{A}_m|\,|\mathcal{B}_m| / \delta \big) }{ 2\, n^{t}_{s_m}(A_m) } \Big) = \frac{\delta}{ |\mathcal{A}_m|\,|\mathcal{B}_m| } \tag{19}$$

Hence, since the total number of state-action pairs is $|\mathcal{A}_m|\,|\mathcal{B}_m|$, $\mathrm{Prob}\big( |\overline{V}^{t+1}_{k,m}(s_m, F^{b,t}_{h+1}) - V^{t+1}_{k,m}(s_m, F^{b,t}_{h+1})| \ge \varepsilon \big) \le \delta$. A similar proof can be applied to the lower bound. Since $n^{t}_{s_m}(A_m)$, which determines the last term $\varepsilon$ of equations (16) and (17), goes to infinity as $t \to \infty$, we can show that both the upper bound and the lower bound converge under the same conditions, i.e. $\overline{V}_{k,m}(s_m, F^{b}_{h+1}) = \lim_{t \to \infty} \overline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$ and $\underline{V}_{k,m}(s_m, F^{b}_{h+1}) = \lim_{t \to \infty} \underline{V}^{t}_{k,m}(s_m, F^{b,t}_{h+1})$.
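As a numerical illustration of the confidence interval used in this proof, the sketch below evaluates the relation between $\varepsilon$, $\delta$, $V_{max}$, and the visit count $n^{t}_{s_m}(A_m)$. The function names and the example numbers are hypothetical and are not taken from the paper's simulations.

```python
import math


def confidence_radius(n, num_actions, num_states, delta, v_max):
    """Half-width epsilon obtained from the Hoeffding inequality after n visits
    of a state-action pair, spreading delta over all state-action pairs."""
    return v_max * math.sqrt(math.log(num_actions * num_states / delta) / (2.0 * n))


def required_visits(eps, num_actions, num_states, delta, v_max):
    """Smallest integer visit count whose confidence radius is at most eps."""
    exact = 0.5 * math.log(num_actions * num_states / delta) * (v_max / eps) ** 2
    return math.ceil(exact)


def max_delay_value(d_rem, gamma):
    """Largest MDP delay value: the largest remaining deadline, discounted
    over an infinite horizon."""
    return max(d_rem) / (1.0 - gamma)
```

For a fixed $\varepsilon$, the required number of visits grows with the square of $V_{max}$, which is consistent with the observation in Section IV.C that agents closer to the destination (smaller remaining deadlines, hence smaller $V_{max}$) need fewer observations for the same accuracy.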
Due to the symmetric structure of $\overline{V}_{k,m}(s_m, F^{b}_{h+1})$ and $\underline{V}_{k,m}(s_m, F^{b}_{h+1})$, we apply the union bound as in [23] to show that both bounds hold simultaneously with probability at least $1 - 2\delta$, which completes the proof.

REFERENCES

[1] A. Rezgui and M. Eltoweissy, "Service-Oriented Sensor-Actuator Networks," IEEE Commun. Mag., vol. 45, no. 12, Dec. 2007.
[2] S. Nelakuditi, Z. Zhang, R. P. Tsang, and D. H. C. Du, "Adaptive Proportional Routing: A Localized QoS Routing Approach," IEEE/ACM Trans. Netw., vol. 10, no. 6, Dec. 2002.
[3] K. Akkaya and M. Younis, "An Energy-Aware QoS Routing Protocol for Wireless Sensor Networks," in Proc. IEEE Workshop on Mobile and Wireless Networks (MWN 2003), Providence, RI, May 2003.
[4] P. Gupta and T. Javidi, "Towards Throughput and Delay-Optimal Routing for Wireless Ad-Hoc Networks," Asilomar Conference on Signals, Systems and Computers, Nov. 2007.
[5] M. J. Neely, E. Modiano, and C. E. Rohrs, "Dynamic Power Allocation and Routing for Time-Varying Wireless Networks," IEEE J. Sel. Areas Commun., vol. 23, no. 1, Jan. 2005.

[6] A. Tizghadam and A. Leon-Garcia, "On Congestion in Mission-Critical Networks," IEEE INFOCOM 2008, April 2008.
[7] M. Liotine, Mission Critical Network Planning, Artech House, Norwood, MA, 2003.
[8] Y. Guan, X. Fu, D. Xuan, P. U. Shenoy, R. Bettati, and W. Zhao, "NetCamo: Camouflaging Network Traffic for QoS-Guaranteed Mission Critical Applications," IEEE Trans. Syst., Man, Cybern. A, vol. 31, no. 4, July 2001.
[9] Y. Huang, W. He, K. Nahrstedt, and W. C. Lee, "DoS Resistant Broadcast Authentication with Low End-to-end Delay," IEEE INFOCOM 2008, April 2008.
[10] D. Marsh, R. Tynan, D. O'Kane, and G. M. P. O'Hare, "Autonomic Wireless Sensor Networks," Artificial Intelligence, vol. 7, 2004.
[11] D. Chen and P. K. Varshney, "QoS support in wireless sensor networks: A survey," in Proc. International Conference on Wireless Networks (ICWN), pp. 2-24, Las Vegas, NV, June 2004.
[12] J. Chakareski and P. Frossard, "Rate-Distortion Optimized Distributed Packet Scheduling of Multiple Video Streams Over Shared Communication Resource," IEEE Trans. Multimedia, vol. 8, no. 2, Apr. 2006.
[13] H.-P. Shiang and M. van der Schaar, "Informationally Decentralized Video Streaming over Multi-hop Wireless Networks," IEEE Trans. Multimedia, vol. 9, no. 6, Sep. 2007.
[14] A. G. Barto, S. J. Bradtke, and S. P. Singh, "Learning to act using real-time dynamic programming," Artificial Intelligence, vol. 72, no. 1-2, Jan. 1995.
[15] P. Tadepalli and D. Ok, "Model-based average reward reinforcement learning," Artificial Intelligence, vol. 100, no. 1-2, Jan. 1998.
[16] C. J. C. H. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3-4, May 1992.
[17] R. S. Sutton, "Learning to predict by the method of temporal differences," Machine Learning, vol. 3, no. 1, pp. 9-44, Aug. 1988.
[18] M. L. Puterman, Markov Decision Process: Discrete Stochastic Dynamic Programming, John Wiley & Sons, Inc., New York, 1994.
[19] D. P. Bertsekas, Dynamic Programming and Optimal Control, Athena Scientific, 1995.
[20] D. Krishnaswamy, "Network-assisted Link Adaptation with Power Control and Channel Reassignment in Wireless Networks," 3G Wireless Conference, pp. 65-7, 2002.
[21] H.-P. Shiang and M. van der Schaar, "Multi-user video streaming over multi-hop wireless networks: A distributed, cross-layer approach based on priority queuing," IEEE J. Sel. Areas Commun., vol. 25, no. 4, May 2007.
[22] W. Hoeffding, "Probability inequalities for sums of bounded random variables," J. American Statistical Association, vol. 58, no. 301, pp. 13-30, Mar. 1963.
[23] E. Even-Dar, S. Mannor, and Y. Mansour, "Action elimination and stopping conditions for reinforcement learning," Proc. International Conference on Machine Learning (ICML 2003), 2003.
[24] C. E. Perkins and P. Bhagwat, "Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers," ACM SIGCOMM Computer Communication Review, vol. 24, no. 4, Oct. 1994.
[25] T. Roughgarden and E. Tardos, "How Bad is Selfish Routing?" J. ACM, vol. 49, no. 2, March 2002.
[26] F. Kelly, A. Maulloo, and D. Tan, "Rate control in communication networks: shadow prices, proportional fairness and stability," J. Operational Research Society, vol. 49, no. 3, Mar. 1998.
[27] D. Xu, M. Chiang, and J. Rexford, "Link-state routing with hop-by-hop forwarding achieves optimal traffic engineering," Proc. IEEE INFOCOM, 2008.
[28] F. Fu and M. van der Schaar, "A systematic framework for dynamically optimizing multi-user video transmission," technical report, http://arxiv.org/abs/
[29] Q. Zhang and S. A. Kassam, "Finite-state Markov Model for Rayleigh fading channels," IEEE Trans. Commun., vol. 47, no. 11, Nov. 1999.
[30] J. Dowling, E. Curran, R. Cunningham, and V. Cahill, "Using Feedback in Collaborative Reinforcement Learning to Adaptively Optimize MANET Routing," IEEE Trans. Syst., Man, Cybern. A, vol. 35, no. 3, May 2005.
[31] H. Tong and T. X. Brown, "Adaptive Call Admission Control under Quality of Service Constraints: A Reinforcement Learning Solution," IEEE J. Sel. Areas Commun., vol. 18, no. 2, Feb. 2000.
[32] S. Toumpis and A. J. Goldsmith, "Capacity Regions for wireless Ad Hoc Network," IEEE Trans. Wireless Commun., vol. 2, no. 4, July 2003.

TABLE VI. ALGORITHM 1: MODEL-FREE REINFORCEMENT LEARNING AT NODE m

TABLE VII. ALGORITHM 2: MODEL-BASED REINFORCEMENT LEARNING AT NODE m

Hsien-Po Shiang is currently a Postdoctoral Scholar at the Department of Electrical Engineering, University of California, Los Angeles. He graduated from National Taiwan University with his B.S. and M.S. degrees in Electrical Engineering (the M.S. in 2002). In 2009, he received his Ph.D. degree in Electrical Engineering from the University of California, Los Angeles. During his Ph.D. study, he worked at Intel Corp., Folsom, CA, in 2006, researching overlay network infrastructure over wireless mesh networks. He has published several journal papers and conference papers on these topics and was selected as one of the eight Ph.D. students chosen for the 2007 Watson Emerging Leaders in Multimedia awarded by IBM Research, NY. His research interests include cross-layer optimizations/adaptations, multimedia communications, and dynamic resource management for delay-sensitive applications.

Mihaela van der Schaar received the Ph.D. degree from Eindhoven University of Technology, Eindhoven, The Netherlands. She is currently an Associate Professor at the Department of Electrical Engineering, University of California, Los Angeles. She holds 3 granted US patents. She is also the editor (with Phil Chou) of the book Multimedia over IP and Wireless Networks: Compression, Networking, and Systems (San Diego, CA: Academic Press, 2007). Dr. van der Schaar has been an active participant in the International Organization for Standardization (ISO) MPEG standard since 1999, to which she made more than 5 contributions and for which she received 3 ISO recognition awards.
She received the National Science Foundation CAREER Award in 2004, the IBM Faculty Award in 2005 and 2007, the Okawa Foundation Award in 2006, the IEEE Transactions on Circuits and Systems for Video Technology Best Paper Award in 2005, and the Most Cited Paper Award from the European Association for Signal Processing journal Signal Processing: Image Communication. She was elected an IEEE Fellow. Her research interests include wireless multimedia processing, communication and networking, game-theoretic approaches in multi-agent communication systems, and multimedia systems.


More information

An Experimental Downlink Multiuser MIMO System with Distributed and Coherently-Coordinated Transmit Antennas

An Experimental Downlink Multiuser MIMO System with Distributed and Coherently-Coordinated Transmit Antennas An Experimental Downlink Multiuser MIMO System wit Distributed and Coerently-Coordinated Antennas Dragan Samardzija, Howard Huang, Reinaldo Valenzuela and Teodore Sizer Bell Laboratories, Alcatel-Lucent,

More information

Distributed Topology Control for Stable Path Routing in Mobile Ad Hoc Networks

Distributed Topology Control for Stable Path Routing in Mobile Ad Hoc Networks Te InsTITuTe for systems researc Isr TecnIcal report 2009-20 Distributed Topology Control for Stable Pat Routing in Mobile Ad Hoc Networks Kiran K. Somasundaram, Kaustub Jain, Vaid Tabatabaee, Jon S. Baras

More information

Performance Evaluation of Limited Feedback Schemes for 3D Beamforming in LTE-Advanced System

Performance Evaluation of Limited Feedback Schemes for 3D Beamforming in LTE-Advanced System Performance Evaluation of Limited Feedback Scemes for 3D Beamforming in LTE-Advanced System Sang-Lim Ju, Young-Jae Kim, and Won-Ho Jeong Department of Radio and Communication Engineering Cungbuk National

More information

On the Sum Capacity of Multiaccess Block-Fading Channels with Individual Side Information

On the Sum Capacity of Multiaccess Block-Fading Channels with Individual Side Information On te Sum Capacity of Multiaccess Block-Fading Cannels wit Individual Side Information Yas Despande, Sibi Raj B Pillai, Bikas K Dey Department of Electrical Engineering Indian Institute of Tecnology, Bombay.

More information

Calculation of Antenna Pattern Influence on Radiated Emission Measurement Uncertainty

Calculation of Antenna Pattern Influence on Radiated Emission Measurement Uncertainty Calculation of Antenna Pattern Influence on Radiated Emission Measurement Uncertainty Alexander Kriz Business Unit RF-Engineering Austrian Researc Centers GmbH - ARC A-444 Seibersdorf, Austria alexander.kriz@arcs.ac.at

More information

Resource Management in QoS-Aware Wireless Cellular Networks

Resource Management in QoS-Aware Wireless Cellular Networks Resource Management in QoS-Aware Wireless Cellular Networks Zhi Zhang Dept. of Electrical and Computer Engineering Colorado State University April 24, 2009 Zhi Zhang (ECE CSU) Resource Management in Wireless

More information

Unit 5 Waveguides P a g e 1

Unit 5 Waveguides P a g e 1 Unit 5 Waveguides P a g e Syllabus: Introduction, wave equation in Cartesian coordinates, Rectangular waveguide, TE, TM, TEM waves in rectangular guides, wave impedance, losses in wave guide, introduction

More information

Cooperative Request-answer Schemes for Mobile Receivers in OFDM Systems

Cooperative Request-answer Schemes for Mobile Receivers in OFDM Systems Cooperative Request-answer Scemes for Mobile Receivers in OFDM Systems Y. Samayoa, J. Ostermann Institut für Informationsverarbeitung Gottfried Wilelm Leibniz Universität Hannover 30167 Hannover, Germany

More information

5.3 Sum and Difference Identities

5.3 Sum and Difference Identities SECTION 5.3 Sum and Difference Identities 21 5.3 Sum and Difference Identities Wat you ll learn about Cosine of a Difference Cosine of a Sum Sine of a Difference or Sum Tangent of a Difference or Sum Verifying

More information

Overview of MIMO Radio Channels

Overview of MIMO Radio Channels Helsinki University of Tecnology S.72.333 Postgraduate Course in Radio Communications Overview of MIMO Radio Cannels 18, May 2004 Suiyan Geng gsuiyan@cc.ut.fi Outline I. Introduction II. III. IV. Caracteristics

More information

DYNAMIC BEAM FORMING USING CHIRP SIGNALS

DYNAMIC BEAM FORMING USING CHIRP SIGNALS BeBeC-018-D04 DYNAMIC BEAM FORMING USING CHIRP SIGNALS Stuart Bradley 1, Lily Panton 1 and Matew Legg 1 Pysics Department, University of Auckland 38 Princes Street, 1010, Auckland, New Zealand Scool of

More information

ELEC 546 Lecture #9. Orthogonal Frequency Division Multiplexing (OFDM): Basic OFDM System

ELEC 546 Lecture #9. Orthogonal Frequency Division Multiplexing (OFDM): Basic OFDM System ELEC 546 Lecture #9 Ortogonal Frequency Division Multiplexing (OFDM): Basic OFDM System Outline Motivations Diagonalization of Vector Cannels Transmission of one OFDM Symbol Transmission of sequence of

More information

ON THE IMPACT OF RESIDUAL CFO IN UL MU-MIMO

ON THE IMPACT OF RESIDUAL CFO IN UL MU-MIMO ON THE IMPACT O RESIDUAL CO IN UL MU-MIMO eng Jiang, Ron Porat, and Tu Nguyen WLAN Group of Broadcom Corporation, San Diego, CA, USA {fjiang, rporat, tun}@broadcom.com ABSTRACT Uplink multiuser MIMO (UL

More information

Modelling Capture Behaviour in IEEE Radio Modems

Modelling Capture Behaviour in IEEE Radio Modems Modelling Capture Beaviour in IEEE 80211 Radio Modems Cristoper Ware, Joe Cicaro, Tadeusz Wysocki cris@titruoweduau 20t February Abstract In tis paper we investigate te performance of common capture models

More information

Optimal Foresighted Multi-User Wireless Video

Optimal Foresighted Multi-User Wireless Video Optimal Foresighted Multi-User Wireless Video Yuanzhang Xiao, Student Member, IEEE, and Mihaela van der Schaar, Fellow, IEEE Department of Electrical Engineering, UCLA. Email: yxiao@seas.ucla.edu, mihaela@ee.ucla.edu.

More information

Channel Estimation Filter Using Sinc-Interpolation for UTRA FDD Downlink

Channel Estimation Filter Using Sinc-Interpolation for UTRA FDD Downlink { Cannel Estimation Filter Using Sinc-Interpolation for UTA FDD Downlink KLAUS KNOCHE, JÜGEN INAS and KAL-DIK KAMMEYE Department of Communications Engineering, FB- University of Bremen P.O. Box 33 4 4,

More information

On the relation between radiated and conducted RF emission tests

On the relation between radiated and conducted RF emission tests Presented at te 3 t International Zuric Symposium on Electromagnetic Compatibility, February 999. On te relation between radiated and conducted RF emission tests S. B. Worm Pilips Researc Eindoven, te

More information

Abstract 1. INTRODUCTION

Abstract 1. INTRODUCTION Allocating armonic emission to MV customers in long feeder systems V.J. Gosbell and D. Robinson Integral nergy Power Quality Centre University of Wollongong Abstract Previous work as attempted to find

More information

Hedonic Coalition Formation for Distributed Task Allocation among Wireless Agents

Hedonic Coalition Formation for Distributed Task Allocation among Wireless Agents Hedonic Coalition Formation for Distributed Task Allocation among Wireless Agents Walid Saad, Zhu Han, Tamer Basar, Me rouane Debbah, and Are Hjørungnes. IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 10,

More information

3432 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 53, NO. 10, OCTOBER 2007

3432 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 53, NO. 10, OCTOBER 2007 3432 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 53, NO 10, OCTOBER 2007 Resource Allocation for Wireless Fading Relay Channels: Max-Min Solution Yingbin Liang, Member, IEEE, Venugopal V Veeravalli, Fellow,

More information

Indirect Measurement

Indirect Measurement exploration Georgia Performance Standards M6G1.c, M6A2.c, M6A2.g Te eigts of very tall structures can be measured indirectly using similar figures and proportions. Tis metod is called indirect measurement.

More information

Performance analysis and comparison of m x n zero forcing and MMSE equalizer based receiver for mimo wireless channel

Performance analysis and comparison of m x n zero forcing and MMSE equalizer based receiver for mimo wireless channel Songklanakarin J. Sci. Tecnol. 33 (3), 335-340, May - Jun. 0 ttp://www.sjst.psu.ac.t Original Article Performance analysis and comparison of m x n zero forcing and MMSE equalizer based receiver for mimo

More information

MIMO-based Jamming Resilient Communication in Wireless Networks

MIMO-based Jamming Resilient Communication in Wireless Networks MIMO-based Jamming Resilient Communication in Wireless Networks Qiben Yan Huaceng Zeng Tingting Jiang Ming Li Wening Lou Y. Tomas Hou Virginia Polytecnic Institute and State University, VA, USA Uta State

More information

Multi-agent coordination via a shared wireless spectrum

Multi-agent coordination via a shared wireless spectrum 217 IEEE 56t Annual Conference on Decision and Control (CDC) December 12-15, 217, Melbourne, Australia Multi-agent coordination via a sared wireless spectrum Cameron Nowzari Abstract Tis paper considers

More information

Token System Design for Autonomic Wireless Relay Networks

Token System Design for Autonomic Wireless Relay Networks 1 Token System Design for Autonomic Wireless Relay Networks Jie Xu and Mihaela van der Schaar, Fellow, IEEE, Abstract This paper proposes a novel framework for incentivizing self-interested transceivers

More information

Genetic Algorithm for Wireless Sensor Network With Localization Based Techniques

Genetic Algorithm for Wireless Sensor Network With Localization Based Techniques International Journal of Scientific and Researc Publications, Volume, Issue 9, September 201 1 Genetic Algoritm for Wireless Sensor Network Wit Localization Based Tecniques * Kapil Uraiya, ** Dilip Kumar

More information

ABSTRACT. Kiran Kumar Somasundaram, Doctor of Philosophy, 2010

ABSTRACT. Kiran Kumar Somasundaram, Doctor of Philosophy, 2010 ABSTRACT Title of dissertation: TOPOLOGY CONTROL ALGORITHMS FOR RULE-BASED ROUTING Kiran Kumar Somasundaram, Doctor of Pilosopy, 2010 Dissertation directed by: Professor Jon S. Baras Department of Electrical

More information

3644 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 6, JUNE 2011

3644 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 6, JUNE 2011 3644 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 6, JUNE 2011 Asynchronous CSMA Policies in Multihop Wireless Networks With Primary Interference Constraints Peter Marbach, Member, IEEE, Atilla

More information

ON TWO-PLANE BALANCING OF SYMMETRIC ROTORS

ON TWO-PLANE BALANCING OF SYMMETRIC ROTORS Proceedings of ME Turbo Expo 0 GT0 June -5, 0, openagen, Denmark GT0-6806 ON TO-PLNE BLNING OF YMMETRI ROTOR Jon J. Yu, P.D. GE Energy 63 Bently Parkway out Minden, Nevada 8943 U Pone: (775) 5-5 E-mail:

More information

This study concerns the use of machine learning based

This study concerns the use of machine learning based Modern AI for games: RoboCode Jon Lau Nielsen (jlni@itu.dk), Benjamin Fedder Jensen (bfje@itu.dk) Abstract Te study concerns te use of neuroevolution, neural networks and reinforcement learning in te creation

More information

Evaluation Model of Microblog Information Confidence Based on BP Neural Network

Evaluation Model of Microblog Information Confidence Based on BP Neural Network Evaluation Model of Microblog Information Confidence Based on BP Neural Network Yuguang Ye Quanzou Normal University; Quanzou, 36, Cina Abstract: As te carrier of social media, microblog as become an important

More information

Gateways Placement in Backbone Wireless Mesh Networks

Gateways Placement in Backbone Wireless Mesh Networks I. J. Communications, Network and System Sciences, 2009, 1, 1-89 Published Online February 2009 in SciRes (http://www.scirp.org/journal/ijcns/). Gateways Placement in Backbone Wireless Mesh Networks Abstract

More information

A REVIEW OF THE NEW AUSTRALIAN HARMONICS STANDARD AS/NZS

A REVIEW OF THE NEW AUSTRALIAN HARMONICS STANDARD AS/NZS A REVIEW OF THE NEW AUSTRALIAN HARMONICS STANDARD AS/NZS 61000.3.6 Abstract V. J. Gosbell 1, P. Muttik 2 and D.K. Geddey 3 1 University of Wollongong, 2 Alstom, 3 Transgrid v.gosbell@uow.edu.au Harmonics

More information

On the Downlink Capacity of WCDMA Systems with Transmit Diversity

On the Downlink Capacity of WCDMA Systems with Transmit Diversity On te Downlink Capacity of WCDMA Systems wit ransmit Diversity Vaibav Sing, Oya Yilmaz, Jialing Wang, Kartigeyan Reddy, and S. Ben Slimane Radio Communication Systems Department of Signals, Sensors, and

More information

Throughput-optimal number of relays in delaybounded multi-hop ALOHA networks

Throughput-optimal number of relays in delaybounded multi-hop ALOHA networks Page 1 of 10 Throughput-optimal number of relays in delaybounded multi-hop ALOHA networks. Nekoui and H. Pishro-Nik This letter addresses the throughput of an ALOHA-based Poisson-distributed multihop wireless

More information

Robust Bayesian Learning for Wireless RF Energy Harvesting Networks

Robust Bayesian Learning for Wireless RF Energy Harvesting Networks 2017 15t International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt) Robust Bayesian Learning for Wireless RF Energy Harvesting Networks Nof Abuzainab 1, Walid

More information

Estimation of Dielectric Constant for Various Standard Materials using Microstrip Ring Resonator

Estimation of Dielectric Constant for Various Standard Materials using Microstrip Ring Resonator Journal of Science and Tecnology, Vol. 9 No. 3 (017) p. 55-59 Estimation of Dielectric Constant for Various Standard Materials using Microstrip Ring Resonator Pek Jin Low 1, Famiruddin Esa 1*, Kok Yeow

More information

Image Feature Extraction and Recognition of Abstractionism and Realism Style of Indonesian Paintings

Image Feature Extraction and Recognition of Abstractionism and Realism Style of Indonesian Paintings Image Feature Extraction and Recognition of Abstractionism and Realism Style of Indonesian Paintings Tieta Antaresti R P and Aniati Murni Arymurty Faculty of Computer Science University of Indonesia Depok

More information

Power Quality Analysis Using An Adaptive Decomposition Structure

Power Quality Analysis Using An Adaptive Decomposition Structure Power Quality Analysis Using An Adaptive Decomposition Structure Doğan Gökan Ece 1 and Ömer Nezi Gerek 1 (1) Dept. of Electrical and Elctronics Engineering, Anadolu University, Scool of Engineering and

More information

Multi-Round Sensor Deployment for Guaranteed Barrier Coverage

Multi-Round Sensor Deployment for Guaranteed Barrier Coverage Tis full text paper was peer reviewed at te direction of IEEE Communications Society subject matter experts for publication in te IEEE INFOCOM 21 proceedings Tis paper was presented as part of te main

More information

The deterministic EPQ with partial backordering: A new approach

The deterministic EPQ with partial backordering: A new approach Omega 37 (009) 64 636 www.elsevier.com/locate/omega Te deterministic EPQ wit partial backordering: A new approac David W. Pentico a, Mattew J. Drake a,, Carl Toews b a Scool of Business Administration,

More information

Closed-Form Optimality Characterization of Network-Assisted Device-to-Device Communications

Closed-Form Optimality Characterization of Network-Assisted Device-to-Device Communications Closed-Form Optimality Caracterization of Network-Assisted Device-to-Device Communications Serve Salmasi,EmilBjörnson, Slimane Ben Slimane,andMérouane Debba Department of Communication Systems, Scool of

More information

A Backlog-Based CSMA Mechanism to Achieve Fairness and Throughput-Optimality in Multihop Wireless Networks

A Backlog-Based CSMA Mechanism to Achieve Fairness and Throughput-Optimality in Multihop Wireless Networks A Backlog-Based CSMA Mechanism to Achieve Fairness and Throughput-Optimality in Multihop Wireless Networks Peter Marbach, and Atilla Eryilmaz Dept. of Computer Science, University of Toronto Email: marbach@cs.toronto.edu

More information

Loading transformers with non sinusoidal currents

Loading transformers with non sinusoidal currents LES00070-ZB rev. Loading transformers wit non sinusoidal currents K Factor Loading transformers wit non sinusoidal currents... Interpretation / example... 6 Copyrigt 007 ABB, All rigts reserved. LES00070-ZB

More information

Published in: Proceedings of 8th Annual IEEE Energy Conversion Congress & Exposition (ECCE 2016)

Published in: Proceedings of 8th Annual IEEE Energy Conversion Congress & Exposition (ECCE 2016) Aalborg Universitet A Multi-Pulse Front-End Rectifier System wit Electronic Pase-Sifting for Harmonic Mitigation in Motor Drive Applications Zare, Firuz; Davari, Pooya; Blaabjerg, Frede Publised in: Proceedings

More information

DESIGN AND ANALYSIS OF MIMO SYSTEM FOR UWB COMMUNICATION

DESIGN AND ANALYSIS OF MIMO SYSTEM FOR UWB COMMUNICATION DESIGN AND ANAYSIS OF IO SYSTE FOR UWB COUNICATION iir N. oanty, onalisa Bol, axmi Prasad isra 3, Sanjat Kumar isra 4 ITER, Siksa O Anusandan University, Bubaneswar, Odisa, 75030, India Seemanta Engineering

More information

GENERALLY, the power loss in the winding of an

GENERALLY, the power loss in the winding of an INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 00, VOL. 6, NO., PP. 7-6 Manuscript received July 0, 00: revised September, 00. DOI: 0.78/v077-00-00- Foil Winding Resistance and Power Loss in Individual

More information

A Multi-User Cooperative Diversity for Wireless Local Area Networks

A Multi-User Cooperative Diversity for Wireless Local Area Networks I. J. Communications, Network and System Sciences, 008, 3, 07-83 Publised Online August 008 in SciRes (ttp://www.scirp.org/journal/ijcns/). A Multi-User Cooperatie Diersity for Wireless Local Area Networks

More information

Wireless Information and Energy Transfer in Multi-Antenna Interference Channel

Wireless Information and Energy Transfer in Multi-Antenna Interference Channel SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING Wireless Information and Energy Transfer in Multi-Antenna Interference Cannel Cao Sen, Wei-Ciang Li and Tsung-ui Cang arxiv:8.88v [cs.it] Aug Abstract

More information

Enhanced HARQ Technique Using Self-Interference Cancellation Coding (SICC)

Enhanced HARQ Technique Using Self-Interference Cancellation Coding (SICC) MITUBIHI ELECTRIC REEARCH LABORATORIE ttp://www.merl.com Enanced HARQ Tecnique Using elf-interference Cancellation Coding (ICC) Wataru Matsumoto, Tosiyuki Kuze, igeru Ucida, Yosida Hideo, Pilip Orlik,

More information

Space Shift Keying (SSK-) MIMO over Correlated Rician Fading Channels: Performance Analysis and a New Method for Transmit-Diversity

Space Shift Keying (SSK-) MIMO over Correlated Rician Fading Channels: Performance Analysis and a New Method for Transmit-Diversity Space Sift Keying SSK-) MIMO over Correlated ician Fading Cannels: Performance Analysis and a New Metod for Transmit-Diversity Marco Di enzo, Harald Haas To cite tis version: Marco Di enzo, Harald Haas.

More information

Optimal DG Placement and Sizing in Distribution System for Loss and THD Reduction

Optimal DG Placement and Sizing in Distribution System for Loss and THD Reduction International Journal of Electronic and Electrical Engineering. ISSN 0974-2174 Volume 5, Number 3 (2012), pp. 227-237 International Researc Publication House ttp://www.irpouse.com Optimal Placement and

More information

LOADING OF TRANSFORMERS BEYOND NAMEPLATE RATING

LOADING OF TRANSFORMERS BEYOND NAMEPLATE RATING LOADING OF TRANSFORMERS BEYOND NAMEPLATE RATING by K. B. M. I. Perera and J. R. Lucas Abstract Te application of a load in excess of nameplate ratings, and/or an ambient temperature iger tan designed of

More information

Mathematical Derivation of MIMO Based MANET to Improve the Network Performance

Mathematical Derivation of MIMO Based MANET to Improve the Network Performance Journal of Computer Science Original Researc Paper Matematical Derivation of MIMO Based MANET to Improve te Network Performance Swati Cowduri, Pranab Banerjee and Seli Sina Caudury Department of Electronics

More information

IMAGE ILLUMINATION (4F 2 OR 4F 2 +1?)

IMAGE ILLUMINATION (4F 2 OR 4F 2 +1?) IMAGE ILLUMINATION ( OR +?) BACKGROUND Publications abound wit two differing expressions for calculating image illumination, te amount of radiation tat transfers from an object troug an optical system

More information

OPTI-502 Optical Design and Instrumentation I John E. Greivenkamp Homework Set 5 Fall, 2018

OPTI-502 Optical Design and Instrumentation I John E. Greivenkamp Homework Set 5 Fall, 2018 Homework Set 5 all, 2018 Assigned: 9/26/18 Lecture 11 Due: 10/3/18 Lecture 13 Midterm Exam: Wednesday October 24 (Lecture 19) 5-1) Te following combination of tin lenses in air is in a telepoto configuration:

More information

Joint Spectrum and Power Allocation for Inter-Cell Spectrum Sharing in Cognitive Radio Networks

Joint Spectrum and Power Allocation for Inter-Cell Spectrum Sharing in Cognitive Radio Networks Joint Spectrum and Power Allocation for Inter-Cell Spectrum Sharing in Cognitive Radio Networks Won-Yeol Lee and Ian F. Akyildiz Broadband Wireless Networking Laboratory School of Electrical and Computer

More information

Directional Derivative, Gradient and Level Set

Directional Derivative, Gradient and Level Set Directional Derivative, Gradient and Level Set Liming Pang 1 Directional Derivative Te partial derivatives of a multi-variable function f(x, y), f f and, tell us te rate of cange of te function along te

More information

Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes

Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes 7th Mediterranean Conference on Control & Automation Makedonia Palace, Thessaloniki, Greece June 4-6, 009 Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes Theofanis

More information

Modelling and Control of Gene Regulatory Networks for Perturbation Mitigation

Modelling and Control of Gene Regulatory Networks for Perturbation Mitigation Tis article as been accepted for publication in a future issue of tis journal, but as not been fully edited. Content may cange prior to final publication. Citation information: DOI.9/TCBB.., IEEE/ACM IEEE/ACM

More information

On Event Signal Reconstruction in Wireless Sensor Networks

On Event Signal Reconstruction in Wireless Sensor Networks On Event Signal Reconstruction in Wireless Sensor Networks Barış Atakan and Özgür B. Akan Next Generation Wireless Communications Laboratory Department of Electrical and Electronics Engineering Middle

More information

Localization in Wireless Sensor Networks

Localization in Wireless Sensor Networks Localization in Wireless Sensor Networks Part 2: Localization techniques Department of Informatics University of Oslo Cyber Physical Systems, 11.10.2011 Localization problem in WSN In a localization problem

More information

Training Spiking Neuronal Networks With Applications in Engineering Tasks

Training Spiking Neuronal Networks With Applications in Engineering Tasks Training Spiking Neuronal Networks Wit Applications in Engineering Tasks Pill Rowcliffe and Jianfeng Feng P. Rowcliffe is wit te Department of Informatics at te Scool of Science and Tecnology (SciTec,

More information

Full-Duplex Machine-to-Machine Communication for Wireless-Powered Internet-of-Things

Full-Duplex Machine-to-Machine Communication for Wireless-Powered Internet-of-Things 1 Full-Duplex Machine-to-Machine Communication for Wireless-Powered Internet-of-Things Yong Xiao, Zixiang Xiong, Dusit Niyato, Zhu Han and Luiz A. DaSilva Department of Electrical and Computer Engineering,

More information

ON THE USE OF MULTI-HARMONIC LEAST-SQUARES FITTING FOR THD ESTIMATION IN POWER QUALITY ANALYSIS

ON THE USE OF MULTI-HARMONIC LEAST-SQUARES FITTING FOR THD ESTIMATION IN POWER QUALITY ANALYSIS Metrol. Meas. Syst., Vol. XIX (2012), No. 2, pp. 295-306. METROLOGY AND MEASUREMENT SYSTEMS Index 330930, ISSN 0860-8229 www.metrology.pg.gda.pl ON THE USE OF MULTI-HARMONIC LEAST-SQUARES FITTING FOR THD

More information

A Realistic Power Consumption Model for Wireless Sensor Network Devices

A Realistic Power Consumption Model for Wireless Sensor Network Devices A ealistic ower Consumption Model for Wireless Sensor Network Devices Qin Wang, Mark Hempstead and Woodward Yang Division of ngineering and Applied Sciences Harvard University {qwang, mempste, woody}@eecs.arvard.edu

More information

Performance Analysis for LTE Wireless Communication

Performance Analysis for LTE Wireless Communication IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Performance Analysis for LTE Wireless Communication To cite tis article: S Tolat and T C Tiong 2015 IOP Conf. Ser.: Mater. Sci.

More information

Energy Savings with an Energy Star Compliant Harmonic Mitigating Transformer

Energy Savings with an Energy Star Compliant Harmonic Mitigating Transformer Energy Savings wit an Energy Star Compliant Harmonic Mitigating Transformer Tony Hoevenaars, P.Eng, Vice President Mirus International Inc. Te United States Environmental Protection Agency s Energy Star

More information

ARQ strategies for MIMO eigenmode transmission with adaptive modulation and coding

ARQ strategies for MIMO eigenmode transmission with adaptive modulation and coding ARQ strategies for MIMO eigenmode transmission with adaptive modulation and coding Elisabeth de Carvalho and Petar Popovski Aalborg University, Niels Jernes Vej 2 9220 Aalborg, Denmark email: {edc,petarp}@es.aau.dk

More information

MIMO IDENTICAL EIGENMODE TRANSMISSION SYSTEM (IETS) A CHANNEL DECOMPOSITION PERSPECTIVE

MIMO IDENTICAL EIGENMODE TRANSMISSION SYSTEM (IETS) A CHANNEL DECOMPOSITION PERSPECTIVE MIMO IDENTICAL EIGENMODE TRANSMISSION SYSTEM (IETS) A CANNEL DECOMPOSITION PERSPECTIVE M. Zeesan Sakir, Student member IEEE, and Tariq S. Durrani, Fellow IEEE Department of Electronic and Electrical Engineering,

More information

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. XX, NO. X, AUGUST 20XX 1

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. XX, NO. X, AUGUST 20XX 1 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. XX, NO. X, AUGUST 0XX 1 Greenput: a Power-saving Algorithm That Achieves Maximum Throughput in Wireless Networks Cheng-Shang Chang, Fellow, IEEE, Duan-Shin Lee,

More information

Energy-aware Task Scheduling in Wireless Sensor Networks based on Cooperative Reinforcement Learning

Energy-aware Task Scheduling in Wireless Sensor Networks based on Cooperative Reinforcement Learning Energy-aware Task Scheduling in Wireless Sensor Networks based on Cooperative Reinforcement Learning Muhidul Islam Khan, Bernhard Rinner Institute of Networked and Embedded Systems Alpen-Adria Universität

More information

Multiuser Scheduling and Power Sharing for CDMA Packet Data Systems

Multiuser Scheduling and Power Sharing for CDMA Packet Data Systems Multiuser Scheduling and Power Sharing for CDMA Packet Data Systems Sandeep Vangipuram NVIDIA Graphics Pvt. Ltd. No. 10, M.G. Road, Bangalore 560001. sandeep84@gmail.com Srikrishna Bhashyam Department

More information

Medium Access Control via Nearest-Neighbor Interactions for Regular Wireless Networks

Medium Access Control via Nearest-Neighbor Interactions for Regular Wireless Networks Medium Access Control via Nearest-Neighbor Interactions for Regular Wireless Networks Ka Hung Hui, Dongning Guo and Randall A. Berry Department of Electrical Engineering and Computer Science Northwestern

More information

Mobile Terminal Energy Management for Sustainable Multi-homing Video Transmission

Mobile Terminal Energy Management for Sustainable Multi-homing Video Transmission 1 Mobile Terminal Energy Management for Sustainable Multi-homing Video Transmission Muhammad Ismail, Member, IEEE, and Weihua Zhuang, Fellow, IEEE Abstract In this paper, an energy management sub-system

More information

The Future of Network Science: Guiding the Formation of Networks

The Future of Network Science: Guiding the Formation of Networks The Future of Network Science: Guiding the Formation of Networks Mihaela van der Schaar and Simpson Zhang University of California, Los Angeles Acknowledgement: ONR 1 Agenda Establish methods for guiding

More information

Design, Realization And Measurements of Microstrip Patch Antenna Using Three Direct Feeding Modes For 2.45ghz Applications
