Practical large-scale coordinated scheduling in LTE-Advanced networks [footnote 1]

Giovanni Nardini (1), Giovanni Stea (1), Antonio Virdis (1), Dario Sabella (2), Marco Caretti (2)
(1) Dipartimento di Ingegneria dell'Informazione, University of Pisa, Italy; g.nardini@ing.unipi.it, giovanni.stea@unipi.it, a.virdis@iet.unipi.it
(2) Telecom Italia, Turin, Italy; {dario.sabella, marco.caretti}@telecomitalia.it

Abstract: In LTE-Advanced, the same spectrum can be re-used in neighboring cells, hence coordinated scheduling is employed to improve the overall network performance (cell throughput, fairness, and energy efficiency) by reducing inter-cell interference. In this paper, we advocate that large-scale coordination can be obtained through a layered solution: a cluster of few (i.e., three) cells is coordinated at the first level, and clusters of coordinated cells are then coordinated at a larger scale (e.g., tens of cells). We model both small-scale coordination and large-scale coordination as optimization problems, show that solving them at optimality is prohibitive, and propose two efficient heuristics that achieve good results, and yet are simple enough to be run at every Transmission Time Interval (TTI). Detailed packet-level simulations show that our layered approach outperforms the existing ones, both static and dynamic.

Index Terms: LTE-A, Coordinated Scheduling, CoMP, Optimization

1. INTRODUCTION

The ever-increasing trend towards higher user bandwidth in LTE-Advanced (LTE-A) cellular networks [1] finds a natural opponent in inter-cell interference. Coordinating neighboring cells, so as to reduce the interference suffered by each User Equipment (UE, e.g. a mobile phone), is also the key to achieve higher Signal-to-Interference-and-Noise Ratios (SINRs), hence higher throughput, energy efficiency for the same throughput, and fairness for cell-edge UEs.
Coordinated Scheduling (CS) is a CoMP (Coordinated Multi-Point Transmission and Reception) technique that allows several eNodeBs (eNBs) to coordinate service to a set of UEs: by deciding who addresses whom and using which Resource Blocks (RBs), pairs of cell-UE transmissions can be scheduled concurrently with a tolerable increase in interference, thus maximizing the benefits of spatial spectrum reuse [2],[3]. Cells can be coordinated in both a distributed and a centralized architecture. The former relies on eNBs running independent algorithms and sharing information through peer-to-peer inter-eNB connections. This approach may suffer from limited state visibility (i.e., each eNB only possesses partial information on the state of the network, and especially of neighboring cells, hence makes suboptimal decisions) and may entail higher inter-eNB communication latencies. Centralized coordination, instead, can leverage cloud-based architectures, such as Cloud Radio Access Network (C-RAN) [4]. This makes it possible to exploit network-wide information to make better decisions, provided that the computational overhead does not become a bottleneck itself.

Coordinating a (possibly large) number of cells entails deciding which cells are active on which RB, to target which UE, so as to minimize the interference and increase the overall throughput. In order to do this, the system needs to be able to assess the effect of interference of subsets of cells on single UEs. The main problem with this approach is scale: UE channel reporting is limited in practice, and an eNB can only be expected to be made aware of the interference of but a few (e.g., one or two) neighboring cells by each UE [?],[?] [footnote 2]. Even though increasing the coordination scale is likely to yield diminishing returns in the long run, the scale at which coordinated scheduling is beneficial goes beyond these figures. Therefore, in this paper we advocate a layered approach: we decompose the problem into small-scale and large-scale coordination (SSC and LSC, respectively). We first endeavor to coordinate a relatively small cluster of three neighboring cells, using a level-1 master node that arbitrates the provisional schedules of the coordinated cells. Then, we scale up by coordinating clusters through a level-2 master node, which capitalizes on the underlying SSC work. We formulate both the SSC and the LSC problems as optimization problems and discuss their complexity, showing that solving each of them at optimality is impractical. Then we propose fast, yet effective heuristics for both problems which can be run at short timescales.

The strengths of layered coordination are several: first, it improves the performance of the network as for throughput, Quality of Service (QoS), fairness and energy efficiency, as we show using detailed multi-cell packet-level simulations. The improvements are progressive: SSC alone brings significant benefits (notably, a remarkable increase in cell throughput).

Footnote 1: Some of the material included in the present paper also appeared, in a preliminary form, in [24] and [25].
Adding LSC further improves the performance, particularly in terms of fairness, QoS and energy efficiency. Second, thanks to the efficiency of our heuristics, layered coordination can be run dynamically, and at fast timescales, possibly at each TTI, thus reaping the benefits of fresh channel quality information (CQI) and better coping with bursty and/or intermittent traffic sources (e.g., video). Third, it is flexible: it can be implemented in both a centralized architecture, such as C-RAN, and a distributed-RAN one, and may accommodate any eNB scheduler, e.g., Maximum Carrier over Interference (Max C/I), Proportional Fair (PF), etc.

Footnote 2: One method to deal with interference measurement is reported in [31], Chapter 15.2: a set of neighboring cells can be configured to transmit either a non-zero- or a zero-power Reference Signal (RS), hence one can measure the interference with/without transmission from that set of cells. RSs are transmitted using Resource Elements in the Physical Downlink Shared Channel. As more cells are added to the set, more RSs are required, which increases the overhead.

Figure 1: Coordinated Scheduling. Figure 2: Clustering architecture. Figure 3: CQI reporting.

CS-CoMP has attracted some research lately (see, e.g., [7] and the references therein). Static approaches (e.g. [12],[13]) have been proposed: each cell has a statically reserved subset of RBs, where it transmits either exclusively or together with low-power interferers. Among the dynamic approaches, [8] bears some apparent similarity to ours, in that it requires a central controller which arbitrates provisional schedules made by the cells. However, it performs considerably worse in practice, because the controller, by arbitrating single RBs, fails to find work-arounds to conflicting requests by the eNBs and is thus prone to long-term unfairness.

The rest of the paper is organized as follows: Section 2 reports background on LTE-Advanced. We describe the system model in Section 3. Our layered approach is explained in Section 4. Section 5 reviews the related work. In Section 6, we evaluate our framework and compare it to the existing ones. Finally, Section 7 concludes the paper.

2. BACKGROUND ON LTE-ADVANCED

In this section we describe those features of the LTE-A system that are more relevant to the problem at hand, i.e. downlink scheduling at the MAC layer. In an LTE-A network, resource allocation takes place at the level of cells. Cells are implemented at an eNodeB (eNB), which may be physically realized either as a compact entity, possessing the intelligence to compose cell transmission schedules (called subframes) at every TTI, or in a split architecture, with a Remote Radio Head (RRH) connected to a baseband (BB) unit. In the latter case, BB resources of several cells can be pooled in a centralized entity, as in a C-RAN architecture.
Since our problems and solutions can be mapped on both frameworks via straightforward architectural modifications, we henceforth make reference to the first deployment for the sake of consistency and readability. The radiation pattern of a cell may or may not be isotropic. In this last case, cells are usually co-located.

Scheduling takes place every Transmission Time Interval (TTI), whose duration is 1 ms, and consists in allocating a vector of (Virtual) Resource Blocks (RBs) to UEs (one RB goes to one UE only [footnote 3]). Each RB carries a different amount of bits depending on the Channel Quality Indicator (CQI) reported by the UE it is addressed to. The CQI increases with the measured Signal-to-Interference-and-Noise Ratio (SINR), and it can be either wideband, i.e. covering the entire subframe, or frequency-selective. In the latter case, a number of per-subband CQIs are reported by a UE. However, when assembling a Transmission Block (TB) in a TTI, the eNB maps it on the relevant number of RBs and chooses one Modulation and Coding Scheme (MCS), typically the one corresponding to the minimum CQI reported on the allocated RBs. A single UE is associated to a cell, whose signal it receives and decodes [footnote 4]. Transmissions of neighboring cells on the same RBs count as interference, which can be mitigated through coordinated scheduling (CS), a CoMP (Coordinated Multi-Point Transmission and Reception) technique [2].

CS can be exemplified with reference to Figure 1: cells A and B can target UEs a and b on the same RB x, since the interference that each will perceive from the neighboring cell will be small, whereas they should use different RBs, e.g., y and z, to target UEs c and d, and refrain from transmitting on z and y, respectively, to avoid excessive interference. Interference-prone transmissions imply lower SINR, hence more RBs are required to transmit the same payload. On one hand, this obviously reduces the capacity of the network, allowing fewer UEs to be served simultaneously. On the other, it negatively affects the energy efficiency, which also depends on the number of bits per RB.

3. SYSTEM MODEL

This section details the hypotheses and goals of this work. For ease of representation, we picture the network as a tessellation of hexagons, as in Figure 2. Each hexagon represents an area covered by three overlapping cells. We assume that cells are anisotropic, radiating at 120° angles, hence each second vertex of a hexagon hosts three co-located cells.
A number of UEs is deployed in the area: each of them is associated to one cell, and it reports wideband CQIs to it. However, the serving cell is made aware of the level of interference received by each UE from two other cells. This information is stored by the cell scheduler in the form of four different CQIs, corresponding to the cases where neither, either one, or both of the two interferers are muted.

Footnote 3: Multi-user Multiple-Input/Multiple-Output (MIMO) techniques are outside the scope of this paper.
Footnote 4: We leave out techniques such as joint processing, whereby two cells target the same UE simultaneously, reinforcing the useful signal.

Furthermore, we assume that the network can be configured so that cells can be clustered by three, and all the UEs associated to a cell report the interference from the other two cells in the same cluster. Two ways to cluster cells, shown in Figure 2, are considered:
- intra-site clustering: the three co-located cells at a vertex form a cluster;
- inter-site clustering: the three cells radiating in the same hexagon form a cluster.

Clustering will be used as a basis for SSC. The notation, models and algorithms reported in the rest of the paper are independent of how we cluster cells, although the resulting performance will of course vary, as we show in Section 6. For the sake of concreteness, but without any loss of generality, we will refer to intra-site clustering hereafter.

We denote with A, B, C the three cells in a cluster, each one of which can allocate $M$ RBs. To make notation consistent, if $x$ denotes a generic cell, then $x+1$ and $x-1$ denote the next and the previous ones in the above order (with wrap-around, i.e., $x = A \Rightarrow (x+1) = B,\ (x-1) = C$). Let $N(x)$ be the number of UEs associated to cell $x$. UEs can thus be identified by a couple $(x,j)$, where $1 \le j \le N(x)$.

Consider UE $j$ associated to cell $x$. Its SINR, hence its CQI, will be different based on whether cells $x+1$ and $x-1$ are active (thus increasing the interference) or not. This allows us to define four Interference Logical Subbands (ILSs), corresponding to the four combinations of activity of $x+1$ and $x-1$, for that UE, and four different per-ILS CQIs accordingly (see Figure 3). We thus use two superscript symbols to denote the interference from the other two cells. The first symbol identifies cell $x-1$, whereas the second is for cell $x+1$. Symbol $+$ means active, and $-$ means inactive. This way, $CQI_{x,j}^{t}$, where $t \in T = \{++, +-, -+, --\}$, denote the four possible CQIs for a UE associated to cell $x$: $CQI_{x,j}^{++}$ is the one achievable when both $x-1$ and $x+1$ are active, etc. Set $T$ thus represents the four ILSs for a UE: $++$ denotes the no-muting ILS, $--$ is the double-muting one, and $+-$, $-+$ are the single-muting ones.
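The superscript convention above is easy to get wrong, so a minimal sketch may help. It assumes cells are named by the strings "A", "B", "C"; the helper names `next_cell`, `prev_cell` and `ils_for` are hypothetical and are not taken from the paper.

```python
CELLS = ("A", "B", "C")
# ILS labels: the first symbol refers to cell x-1, the second to cell x+1.
ILS = ("++", "+-", "-+", "--")

def next_cell(x):
    """Cell x+1 in the order A, B, C, with wrap-around."""
    return CELLS[(CELLS.index(x) + 1) % len(CELLS)]

def prev_cell(x):
    """Cell x-1 in the order A, B, C, with wrap-around."""
    return CELLS[(CELLS.index(x) - 1) % len(CELLS)]

def ils_for(x, active_cells):
    """ILS label seen by a UE of cell x, given the set of cells that transmit."""
    first = "+" if prev_cell(x) in active_cells else "-"
    second = "+" if next_cell(x) in active_cells else "-"
    return first + second
```

For instance, when only B is muted, a UE of cell A is in ILS `+-` (C active, B inactive), whereas a UE of cell C is in ILS `-+` (B inactive, A active): the same group of RBs carries a different label at each cell.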
We use the name subband, which is suggestive of multi-band CQIs, to exploit the inherent parallel between coordinated scheduling, on one hand, and multi-band scheduling at a single cell, on the other: in fact, in both cases, a UE may be served based on one of relatively few CQIs, depending on the (interference logical) subband(s) it is assigned to, hence scheduling decisions must take this into account. However, we recall that, in multi-band scheduling, subbands have fixed size, whereas in CS the size of ILSs must be decided. Our goal is to coordinate a relatively high number of cells, those radiating in a cluster of up to seven hexagons (i.e., 21 cells), so that a network-wide measure (e.g., the overall throughput) is maximized.

4. LAYERED APPROACH TO COORDINATION

The logical (i.e., functional) layout of our coordination scheme is shown in Figure 4: a level-1 master (L1M) coordinates a cluster of three cells, thus embodying SSC. L1Ms are further coordinated by a level-2 master (L2M), to achieve LSC. The job of SSC is to decide which subset of cells transmits on a given RB, with the aim of optimizing a cluster-wide measure (e.g., the overall throughput), given the CQIs of the associated UEs. During SSC, the L1M will compute the size of the ILSs for each cell in the cluster. The purpose of LSC is instead to arrange ILSs of neighboring clusters so as to minimize the overall interference, given that the ILS sizes have already been set in the previous phase. The output of the LSC is a set of associations {RB-ID, ILS-ID}, computed in such a way that the overall interference is reduced. Note that it is perfectly possible to run SSC only, and still reap some of the benefits of coordinated scheduling: in doing so, each cluster will arrange its ILSs autonomously, hence their placement will be suboptimal due to the absence of LSC (interference will not be minimized, although it will be considerably less than without SSC). In the following, we present SSC and LSC in order: we formalize both as optimization problems, show why solving them at optimality is infeasible, and devise fast heuristics to solve them.

Figure 4: Layered coordination.

a. Small-scale coordination

Small-scale coordination coordinates $K = 3$ cells. Let $s_{x,j}^{t}$ be the number of RBs allocated to UE $(x,j)$ within ILS $t$. Let $Q_{x,j}$ be that UE's backlog and let $r_{x,j}$ denote the number of bits per RB according to the (one and only) TB format that UE $(x,j)$ will be scheduled with. Let us denote with $\pi(c)$ the number of bits per RB achievable under CQI $c$. We denote with $b_{x,j}^{t}$ a binary variable that is set when UE $(x,j)$ has a RB within ILS $t$. Finally, let $R$ be a constant such that $R \ge \max_{CQI} \pi(CQI)$. A cluster-wise max-throughput problem can then be formulated as follows:

$$
\begin{aligned}
\max \quad & \sum_{x \in \{A,B,C\}} \sum_{j=1}^{N(x)} \Big( r_{x,j} \sum_{t \in T} s_{x,j}^{t} - p_{x,j} \Big) \\
\text{s.t.} \quad
& r_{x,j} \sum_{t \in T} s_{x,j}^{t} \le Q_{x,j} + p_{x,j} && \forall x, j && (i) \\
& r_{x,j} \le \pi\big(CQI_{x,j}^{t}\big) + R \, \big(1 - b_{x,j}^{t}\big) && \forall x, j, t && (ii) \\
& b_{x,j}^{t} \le s_{x,j}^{t} \le M \, b_{x,j}^{t} && \forall x, j, t && (iii) \\
& p_{x,j} \le \pi\big(CQI_{x,j}^{t}\big) - 1 + R \, \big(1 - b_{x,j}^{t}\big) && \forall x, j, t && (iv) \\
& \sum_{t \in T} \sum_{j=1}^{N(x)} s_{x,j}^{t} + \sum_{j=1}^{N(x+1)} s_{x+1,j}^{--} + \sum_{j=1}^{N(x-1)} s_{x-1,j}^{--} + \max\Big\{ \sum_{j=1}^{N(x+1)} s_{x+1,j}^{-+},\ \sum_{j=1}^{N(x-1)} s_{x-1,j}^{+-} \Big\} \le M && \forall x && (v) \\
& \sum_{x} \Big( \sum_{j=1}^{N(x)} s_{x,j}^{--} + \max\Big\{ \sum_{j=1}^{N(x+1)} s_{x+1,j}^{-+},\ \sum_{j=1}^{N(x-1)} s_{x-1,j}^{+-} \Big\} \Big) + \max\Big\{ \sum_{j=1}^{N(x-1)} s_{x-1,j}^{++},\ \sum_{j=1}^{N(x)} s_{x,j}^{++},\ \sum_{j=1}^{N(x+1)} s_{x+1,j}^{++} \Big\} \le M && && (vi) \\
& b_{x,j}^{t} \in \{0,1\}, \quad s_{x,j}^{t} \in \mathbb{Z}^{+} && \forall x, j, t && (vii) \\
& r_{x,j},\ p_{x,j} \in \mathbb{Z}^{+} && \forall x, j && (viii)
\end{aligned}
\tag{1}
$$

The objective function states that the cluster throughput should be maximized. Note that other, alternative objectives can be easily substituted to this one in order to realize different CS-CoMP strategies. We will come back to this later on. Every UE $(x,j)$ has a unique rate $r_{x,j}$, which is multiplied by all the RBs that are allocated to that UE, whatever the ILS they belong to. $p_{x,j}$ denotes the padding, not to be counted as useful bits.

Constraint (i) states that each UE cannot transmit more than its backlog's worth of traffic, including possible padding bits. Padding is necessary, because the number of RBs is an integer, and queues may never be emptied otherwise. Constraint (ii) states that the rate cannot exceed the minimum number of bits per RB among all the ILSs it is scheduled in. For instance, if a UE is allocated RBs with no interference ($b_{x,j}^{--} = 1$) and with interference from both cells ($b_{x,j}^{++} = 1$), it will use the smallest number of bits per RB, i.e. $r_{x,j} = \pi(CQI_{x,j}^{++})$. Note that $R$ is a large constant, hence constraint (ii) is inactive if $b_{x,j}^{t} = 0$ (meaning that ILS $t$ does not contribute to the limit). Constraint (iii) states that $s_{x,j}^{t} = 0$ if $b_{x,j}^{t} = 0$, and $s_{x,j}^{t} \ge 1$ if $b_{x,j}^{t} = 1$, thus ensuring consistency. Constraint (iv) states that a UE always gets less than one RB's worth of padding.

Constraint (v) states that a subframe must include the RBs that a cell $x$ allocates to its UEs $(x,j)$, whichever their ILSs (i.e., those in the first double sum). However, cell $x$ has to leave enough room in its frame to allow other cells $x+1$ and $x-1$ to allocate RBs without interference from cell $x$. Such room is in fact accounted for in the other three addenda, which can be further split into two: first, the ILSs where the other cells require exclusive transmission (i.e., those with a $--$ superscript). Second, the ILSs where other cells require only $x$ to be muted (i.e., those in the max bracket). These last need not be disjoint. Figure 5 shows an example of coordinated subframe structure for three cells A, B, C, over which constraint (v) can be exemplified. Cell A's subframe (the leftmost one) must have room for all the RBs where:
- A transmits to its UEs: first addendum in (v), corresponding to ILSs 1, 4, 6, 7 in Figure 5;
- B requests both A and C not to transmit: second addendum in (v), ILS 2 in the figure;
- C requests both A and B not to transmit: third addendum in (v), ILS 3 in the figure;
- B requests A not to transmit, whereas C may transmit: first element of the max bracket in (v), ILS 5 in the figure;
- C requests A not to transmit, whereas B may transmit: second element of the max bracket in (v), again ILS 5 in the figure.

The last two terms can overlap, thus we take their maximum instead of their sum. Note that inequality (v) is verified with some slack at cell A, i.e. there are some unused RBs (bottom parts of ILSs 4 and 6). We will come back to this later on. Constraint (vi) describes the fact that the clusters of RBs where muting of one or two cells is required must occupy the same positions in the subframes of the three cells. Finally, constraints (vii-viii) define the domain of the problem variables.

The above problem is a mixed integer-nonlinear problem (MINLP), with a size of $O(K \cdot N \cdot |T|) = O(K \cdot N \cdot 2^{K})$ variables and constraints, $N$ being the overall number of UEs.
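To make constraints (v) and (vi) concrete, the following sketch evaluates their left-hand sides for a given allocation. It is only an illustration under stated assumptions: the allocation is summarized by per-cell totals `S[x][t]` (the sum over $j$ of $s_{x,j}^{t}$), and the function names and the numbers in the usage note are hypothetical.

```python
CELLS = ("A", "B", "C")

def nxt(x):
    return CELLS[(CELLS.index(x) + 1) % 3]

def prv(x):
    return CELLS[(CELLS.index(x) - 1) % 3]

def lhs_v(S, x):
    """Left-hand side of constraint (v) for cell x.
    S[x][t] is the total number of RBs that cell x allocates in ILS t."""
    own = sum(S[x].values())                    # RBs where x transmits
    exclusive = S[nxt(x)]["--"] + S[prv(x)]["--"]  # neighbors transmit alone
    # RBs where only x is muted: x+1 sees it as "-+", x-1 sees it as "+-".
    only_x_muted = max(S[nxt(x)]["-+"], S[prv(x)]["+-"])
    return own + exclusive + only_x_muted

def lhs_vi(S):
    """Left-hand side of constraint (vi), common to the whole cluster."""
    protected = sum(
        S[x]["--"] + max(S[nxt(x)]["-+"], S[prv(x)]["+-"]) for x in CELLS
    )
    no_muting = max(S[x]["++"] for x in CELLS)
    return protected + no_muting

def fits(S, M):
    """True if the allocation satisfies (v) at every cell and (vi)."""
    return all(lhs_v(S, x) <= M for x in CELLS) and lhs_vi(S) <= M
```

With the made-up totals `S = {"A": {"++": 5, "+-": 3, "-+": 2, "--": 4}, "B": {"++": 4, "+-": 2, "-+": 3, "--": 3}, "C": {"++": 5, "+-": 1, "-+": 4, "--": 2}}`, the sketch gives 22, 22 and 21 RBs for (v) at A, B and C, and 23 RBs for (vi), so the allocation fits a subframe of 25 RBs but not one of 22.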
Nonlinearity comes from the product in both the objective function and constraint (i), whereas the max operator in constraints (v-vi) could easily be linearized [footnote 5]. MINLPs are NP-hard in general. As anticipated, the structure of this one is indeed similar to that of a multi-band-CQI scheduling (i.e. one where a MaxC/I allocation has to be made on per-subband CQIs), subbands being replaced by ILSs, with the added complication that the dimension of ILSs is not known in advance, but is obtained as a result, as per constraints (v-vi). Since the multi-band-CQI scheduling problem has been proven to be NP-hard in [28], this one can only be NP-hard as well. In any case, solving it in a TTI's time is out of question, even for a small number of UEs (i.e., 10-20), as shown in [24]. Furthermore, we observe that the reporting information required is proportional to the number of ILSs, which increases exponentially with the number of coordinated cells $K$. This clearly indicates that clustering cells at larger scales is impractical, and this is why our SSC scheme coordinates three cells only.

Footnote 5: This problem can be reformulated as a mixed-integer-linear problem (MILP), through a careful reformulation (omitted for the sake of conciseness), but only at the price of increasing the number of variables to O(2^2K).

The solution to the SSC problem yields a set of $s_{x,j}^{t}$ values. From the latter, the size $d$ of each ILS of a cell can be easily obtained. However, ILSs can be arranged in several ways, provided that mutual exclusion constraints are met, without affecting optimality. For instance, the first three (double muting) ILSs in Figure 5 could be permuted. This degree of freedom will in fact be exploited later on to achieve larger-scale coordination.

As anticipated, we observe that the above problem formulation easily accommodates different objectives. For instance, a Coordinated Proportional Fair (CPF) could be achieved by simply substituting the objective with:

$$
\max \sum_{x \in \{A,B,C\}} \sum_{j=1}^{N(x)} \frac{ r_{x,j} \sum_{t \in T} s_{x,j}^{t} - p_{x,j} }{ \Phi_{x,j} },
$$

where $\Phi_{x,j}$ is the long-term PF rate achieved by UE $(x,j)$, which is available at each cell. Similarly, any other scheduling strategy that weighs UEs according to some constant (e.g., urgency-based, queue length-based, etc.) can be accommodated in the same way.

Figure 6 shows the information flow for optimal SSC. The cells play a very minor role, since they only juxtapose queue information (and, possibly, long-term PF rates or similar information) to the CQIs reported by the UEs. Moreover, the L1M composes schedules itself, hence each cell only has to place UE data within it.

Figure 5: Subframe structure and ILSs for three coordinated cells. Figure 6: Optimal SSC.
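The difference between the max-throughput objective of (1) and its CPF variant can be illustrated by evaluating both on the same candidate allocation. The sketch below is not a solver: it only computes the objective value, and the dictionary layout (`r`, `s`, `p`, `phi`) is a hypothetical encoding of $r_{x,j}$, $s_{x,j}^{t}$, $p_{x,j}$ and $\Phi_{x,j}$.

```python
def cluster_objective(ues, proportional_fair=False):
    """Objective value of an allocation.
    Each UE is a dict with:
      r   - bits per RB of its (single) TB format,
      s   - dict ILS label -> number of RBs allocated in that ILS,
      p   - padding bits,
      phi - long-term PF rate (used only by the CPF variant)."""
    total = 0.0
    for ue in ues:
        useful_bits = ue["r"] * sum(ue["s"].values()) - ue["p"]
        if proportional_fair:
            total += useful_bits / ue["phi"]   # weigh by 1 / long-term rate
        else:
            total += useful_bits
    return total
```

As a usage example with made-up numbers, a UE with `r=100`, three RBs, 20 padding bits and `phi=50` contributes 280 useful bits to the throughput objective and 5.6 to the CPF one, whereas a UE with `r=40`, three RBs, no padding and `phi=10` contributes 120 and 12 respectively: the CPF weighting favors the UE that has been served less in the past.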

Heuristic solution for small-scale coordination

The key observation for our heuristic is that the job of the L1M can be made considerably easier (hence faster) by having the cells pre-process information first, and participate in the scheduling later on. As said before, the SSC problem presents similarities with the multi-band allocation problem, whose main difficulty is determining the size of each ILS based on the traffic demand. At a high level, our SSC heuristic can be split in three steps. First, cells make a provisional resource allocation, deciding which UEs should be served in which ILS. Each cell then communicates its requirements to the L1M. By doing this, it makes a bid on how large each ILS should be to meet its needs. Second, the L1M computes the actual size of each ILS, by composing the cell bids, and curbing requests that cannot be accommodated. Then, it sends the results back to the cells in its cluster. As a third and last step, cells perform the actual resource allocation, in a subframe where the position and size of the ILSs are consistent for the whole cluster. We now explain each step in more detail.

Step 1: For each UE $j$ under its control, cell $x$ computes $\gamma_{x,j}^{t} = \pi(CQI_{x,j}^{t}) / \pi(CQI_{x,j}^{++})$. Values $\gamma_{x,j}^{t}$ indicate the throughput gain per RB of UE $j$, with respect to the no-muting ILS, when that UE is scheduled in ILS $t$. We partition UEs into four sets $\Gamma_{x}^{t}$. Each set $\Gamma_{x}^{t}$ groups UEs that should be served in ILS $t$. The association is made by testing the following rules in cascade:
- If $\gamma_{x,j}^{--} \ge th^{DM}$, add $j$ to $\Gamma_{x}^{--}$; otherwise,
- If $\max\{\gamma_{x,j}^{+-}, \gamma_{x,j}^{-+}\} \ge th^{SM}$, add $j$ to the set with the higher gain (i.e., $\Gamma_{x}^{+-}$ if $\gamma_{x,j}^{+-} \ge \gamma_{x,j}^{-+}$); otherwise,
- Add $j$ to $\Gamma_{x}^{++}$.

$th^{DM}$ and $th^{SM}$ are the double-muting and single-muting thresholds. This way, muting of neighboring cells is requested only when significant gains can be obtained. At this point, the cell performs a provisional schedule, according to its own policy (e.g., MaxC/I, PF etc.). The CQI used for this procedure depends on which $\Gamma_{x}^{t}$ the UE has been assigned to. The provisional schedule provides that cell's bid $N_{x}^{t}$ for each ILS $t$, which is sent to the L1M.
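The cascade of Step 1 can be sketched as a small classification function. This is an illustration only: the comparisons are taken as "greater than or equal" (the operators are not legible in the source), the threshold values used in the usage note are made up, and `classify_ue` is a hypothetical name.

```python
def classify_ue(bits_per_rb, th_dm, th_sm):
    """Return the ILS label of the set a UE is added to in Step 1.
    bits_per_rb maps each ILS label to pi(CQI^t) for this UE;
    th_dm / th_sm are the double- and single-muting thresholds."""
    base = bits_per_rb["++"]                      # no-muting reference
    gain = {t: bits_per_rb[t] / base for t in bits_per_rb}
    if gain["--"] >= th_dm:                       # rule 1: double muting pays off
        return "--"
    if max(gain["+-"], gain["-+"]) >= th_sm:      # rule 2: single muting pays off
        return "+-" if gain["+-"] >= gain["-+"] else "-+"
    return "++"                                   # rule 3: no muting requested
```

With hypothetical thresholds `th_dm=1.5` and `th_sm=1.2`, a UE whose bits per RB are 100 (no muting), 130 and 110 (single muting) and 140 (double muting) misses the double-muting threshold (gain 1.4) but passes the single-muting one (gain 1.3), so it joins the `+-` set.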
Step 2: In this step, the L1M sets the size and position of the ILSs. To do so, first it composes all the bids of the coordinated cells, which may not be simultaneously feasible. Then, if there is room to do so, it reduces the level of interference of the RBs, e.g. moves them from the no-muting ILS to the single-muting and double-muting ILSs as much as possible. The exact algorithms for composing the bids and upgrading the RBs are intuitively simple, but cumbersome to describe formally. For this reason, we provide an intuitive description here, using examples when required, and refer the interested reader to the Appendix for an algorithmic description.

Since the bids are made independently by the cells in the cluster, the L1M must ensure that they are mutually compatible, by checking the following inequalities homologous to (v-vi) in (1):

$$
\begin{aligned}
& \sum_{t \in T} N_{x}^{t} + \sum_{y \in \{x-1,\,x+1\}} N_{y}^{--} + \max\big\{ N_{x-1}^{+-},\ N_{x+1}^{-+} \big\} \le M \qquad \forall x && (i) \\
& \sum_{x} \Big( N_{x}^{--} + \max\big\{ N_{x-1}^{+-},\ N_{x+1}^{-+} \big\} \Big) + \max\big\{ N_{x-1}^{++},\ N_{x}^{++},\ N_{x+1}^{++} \big\} \le M && (ii)
\end{aligned}
\tag{2}
$$

Note that inequalities (i) are one per cell. If (2) holds, then all the bids are feasible, and the L1M can partition the subframe into ILSs. Otherwise, some of them must be reduced. The first part then consists in decreasing by one all the bids of the violated inequalities in a round-robin fashion, until all inequalities are made to hold again.

Once the bids have been composed, we can optimize the ILSs, by upgrading RBs to more protected ILSs whenever possible. Assume that the situation is the one depicted in Figure 7: in ILS 4, i.e. the one where B only is muted, it is $N_{C}^{-+} > N_{A}^{+-}$, thus there are $N_{C}^{-+} - N_{A}^{+-}$ RBs where C would transmit alone in any case. This means that these RBs (shown as shaded in the figure) in fact belong to C's double-muting ILS, to which they must be added. The same happens for the $N_{B}^{-+} - N_{C}^{+-}$ RBs in ILS 5 and the $N_{B}^{+-} - N_{A}^{-+}$ RBs of ILS 6, both to be added to cell B's double-muting ILS. Finally, with reference to the no-muting ILS 7, where $N_{A}^{++}$ is equal to the width of the ILS, there are $N_{A}^{++} - N_{B}^{++}$ RBs where A is transmitting alone, to be moved to its double-muting ILS, and $N_{B}^{++} - N_{C}^{++}$ RBs where A and B are in fact in single-muting, since C is not using them, and therefore they should be moved to ILS 6 instead. Figure 8 shows the final partitioning of the subframe for all the three cells, obtained by moving RBs as described above. Note that the double-muting ILSs end up being larger, and the no-muting ILS is considerably smaller.
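The bid-composition part of Step 2 can be sketched as follows. The paper defers the exact algorithm to its Appendix, so this is only one plausible reading, under stated assumptions: bids are given as `N[x][t]`, one bid is decremented at a time in a fixed round-robin order (ILS by ILS, cell by cell), and only bids that appear in a violated inequality of (2) are touched. All function names are hypothetical.

```python
CELLS = ("A", "B", "C")
ILS = ("++", "+-", "-+", "--")

def nxt(x):
    return CELLS[(CELLS.index(x) + 1) % 3]

def prv(x):
    return CELLS[(CELLS.index(x) - 1) % 3]

def lhs_i(N, x):
    """Left-hand side of inequality (2)(i) for cell x."""
    return (sum(N[x].values()) + N[nxt(x)]["--"] + N[prv(x)]["--"]
            + max(N[prv(x)]["+-"], N[nxt(x)]["-+"]))

def lhs_ii(N):
    """Left-hand side of inequality (2)(ii)."""
    return (sum(N[x]["--"] + max(N[prv(x)]["+-"], N[nxt(x)]["-+"])
                for x in CELLS)
            + max(N[x]["++"] for x in CELLS))

def bids_in_violated(N, M):
    """Set of (cell, ILS) bids that appear in some violated inequality."""
    bad = set()
    for x in CELLS:
        if lhs_i(N, x) > M:
            bad |= {(x, t) for t in ILS}
            bad |= {(nxt(x), "--"), (prv(x), "--"),
                    (prv(x), "+-"), (nxt(x), "-+")}
    if lhs_ii(N) > M:                      # (ii) involves every bid
        bad |= {(x, t) for x in CELLS for t in ILS}
    return bad

def compose_bids(N, M):
    """Reduce bids one RB at a time, round-robin, until (2) holds."""
    N = {x: dict(N[x]) for x in CELLS}     # work on a copy
    order = [(x, t) for t in ILS for x in CELLS]
    k = 0
    while True:
        bad = bids_in_violated(N, M)
        if not bad:
            return N
        x, t = order[k % len(order)]
        k += 1
        if (x, t) in bad and N[x][t] > 0:
            N[x][t] -= 1
```

The loop terminates for any non-negative `M`: a violated inequality always contains at least one positive bid, and the round-robin pointer reaches it within one pass. The RB-upgrading part of Step 2 (moving RBs to more protected ILSs) is not covered by this sketch.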
As a final step, the allocation can be further enhanced if there are still empty RBs: for example, we can rearrange the same no-muting RB so as to obtain three double-muting RBs, one per cell, as shown in Figure 9. By optimizing the subframe partitioning as specified above, the number of RBs allocated to each cell stays the same: rather, we modify their interference conditions, allowing higher CQIs to be exploited. Finally, the L1M builds a list of tuples {RB_ID, ILS_ID} and sends it to the cells.
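The round-robin bid reduction described in Step 2 can be sketched as follows. This is only an illustration under simplifying assumptions: each inequality is modeled as a group of bids whose sum must not exceed the subframe size, which is a stand-in for the actual inequalities (2), and the names `compose_bids`, `bids`, `constraints` and `capacity` are hypothetical.

```python
def compose_bids(bids, constraints, capacity):
    """Decrease bids by one, round-robin, until every constraint holds.

    bids        -- dict: bid name -> requested RBs (names are hypothetical)
    constraints -- list of tuples of bid names; the bids in each tuple must
                   sum to at most `capacity` (simplified stand-in for (2))
    capacity    -- subframe size M, assumed non-negative
    """
    bids = dict(bids)            # work on a copy
    order = sorted(bids)         # fixed round-robin visiting order
    pos = 0

    def bids_in_violated_constraints():
        names = set()
        for group in constraints:
            if sum(bids[b] for b in group) > capacity:
                names.update(group)
        return names

    bad = bids_in_violated_constraints()
    while bad:
        name = order[pos % len(order)]
        pos += 1
        # Only bids that take part in a violated inequality are reduced.
        if name in bad and bids[name] > 0:
            bids[name] -= 1
            bad = bids_in_violated_constraints()
    return bids


example = compose_bids({"A": 30, "B": 30, "C": 10},
                       [("A", "B"), ("B", "C")], capacity=50)
print(example)
```

In this toy case only the first group is violated, so A and B are reduced alternately while C is left untouched.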

Step 3: This is where cells perform the actual scheduling. As ILSs have a well-defined size, any multiband scheduling algorithm can be adapted to this purpose. As already stated, solving optimally the multi-band version of the most common scheduling algorithms (e.g., MaxC/I and PF) is NP-hard in general (see [28]). We therefore use a commonplace heuristic, which consists in filling up one ILS at a time, starting from the double-muting one. Considering an ILS, the cell assigns its RBs to the UEs that were inserted in the corresponding set $\Gamma_x$ at step 1, using the same algorithm as in step 1. If some RBs remain unallocated, then we upgrade UEs from one of the less protected sets (i.e., corresponding to ILSs in which more cells transmit simultaneously). UEs are upgraded in order of decreasing gain $\gamma_{x,j}$. Vice-versa, if not all UEs belonging to $\Gamma_x$ can be served in the ILS (e.g., because its width was reduced during step 2), we move the remaining ones to the set of a less protected ILS, i.e., the one where they have the highest CQI. Once an ILS is full, the next one is considered.

The complexity of the SSC heuristic is affordable: cells are required to do no more than their scheduling job, and the task of the L1M is computationally trivial. Moreover, the information being exchanged between the cells and the L1M (shown in Figure 10) is limited, and independent of the number of UEs or the traffic load. For this reason, SSC can be run dynamically, at a TTI timescale. Moreover, the SSC heuristic can accommodate different (sub-band aware) scheduling algorithms at the cell, e.g., Proportional Fair, Max C/I, time-based priority, etc. It is also worth noting that the SSC heuristic allows a last word (i.e., step 3) to the cells, which is done on purpose to achieve scheduling consistency: suppose, in fact, that the PF criterion would select a UE of cell x to be scheduled in the double-muting ILS, since it is just above the $DM_{th}$ threshold. If the L1M reduces the double-muting ILS for x, we do want to allow cell x to choose whether to schedule it in some other ILS (e.g., either of the single-muting ones) rather than having to drop it altogether.
Similarly, if the double-muting ILS is larger than expected, we still want x to be in control of which additional UEs will be promoted to it (e.g., starting from those nearer to the $DM_{th}$ threshold).
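The ILS-by-ILS filling of Step 3 can be illustrated with the minimal sketch below. It makes several assumptions that are not in the text: a UE needs the same number of RBs whatever the ILS, a UE is served entirely or not at all, and a UE that does not fit is demoted to the next ILS in the list (the text instead picks the less protected ILS with the highest CQI). All names are hypothetical.

```python
def fill_ilss(widths, sets, demand, gain):
    """Greedy per-cell filling, most protected ILS first.

    widths -- ILS widths in RBs, ordered from most to least protected
    sets   -- sets[k] lists the UEs pre-assigned to ILS k at step 1,
              in the order chosen by the cell's scheduler
    demand -- dict: UE -> RBs needed (assumed independent of the ILS)
    gain   -- dict: UE -> gain obtained by upgrading that UE
    Returns a dict: UE -> index of the ILS where it is served.
    """
    sets = [list(s) for s in sets]   # work on copies
    served = {}
    for k, width in enumerate(widths):
        free = width
        for ue in sets[k]:
            if demand[ue] <= free:
                served[ue] = k
                free -= demand[ue]
            elif k + 1 < len(widths):
                sets[k + 1].append(ue)   # demote to a less protected ILS
        # Leftover RBs: upgrade UEs of less protected sets, by decreasing gain.
        candidates = [(ue, j) for j in range(k + 1, len(widths)) for ue in sets[j]]
        for ue, j in sorted(candidates, key=lambda c: -gain.get(c[0], 0.0)):
            if demand[ue] <= free:
                sets[j].remove(ue)
                served[ue] = k
                free -= demand[ue]
    return served


result = fill_ilss(widths=[4, 6],
                   sets=[["u1", "u2"], ["u3", "u4"]],
                   demand={"u1": 3, "u2": 2, "u3": 1, "u4": 4},
                   gain={"u2": 0.1, "u3": 0.5, "u4": 0.9})
print(result)
```

Here u2 does not fit in the first ILS and is demoted, while the single leftover RB is used to upgrade u3, the highest-gain UE that fits.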

Figure 7 Frame composition. Figure 8 Resulting frame partitioning. Figure 9 Further frame optimization. Figure 10 SSC heuristic. Figure 11 LSC information flow.

b. Large-scale coordination

It is often the case that cells of neighboring clusters exert a considerable interference on the UEs of a cell. These are however subject to independent, uncoordinated instances of SSC, which may result in interference-prone schedules if interfering antennas use the same RBs. We have already ascertained that it is impractical to handle more than a few interferers per UE (two, in our algorithm), hence we cannot use the SSC approach at a larger scale. We can instead exploit the fact that the average interference that cell x exerts on the area covered by cell y can be measured statically. Furthermore, the SSC algorithm described in the previous section exhibits a degree of freedom, i.e., the position of the RBs in each frame: the output of the L1M at step 2, in fact, can be permuted. More to the point, we do not even need ILSs to be contiguous in the subframe. This can be exploited to mitigate the inter-cell interference on a larger scale. More specifically, the output of the SSCs of neighboring clusters can be arranged so as to minimize the interference perceived by their UEs. We first formulate the LSC problem as an optimization problem, whose objective is to minimize the overlap of interfering cells, and then propose a heuristic solution.

We consider C clusters, deployed as in Figure 4. For each couple of cells x and y belonging to clusters i and j respectively, we define an Interference Coefficient (IC) $\alpha_{x,y}$, which measures the average interference that x's UEs will suffer from cell y. In general, $\alpha_{x,y} \neq \alpha_{y,x}$, since cells are anisotropic. We call $T_n$ the set of all ILSs of a cluster n, $S_x$ the set of ILSs where cell x is active (so that $S_x \subseteq T_n$ if $x \in n$), and $w_s$ the size of ILS s (given by the SSC). We define the following variables:
- $b_{i,s} \in \{0,1\}$, where $b_{i,s} = 1$ means that ILS s is allowed to use RB i, $1 \le i \le M$;
- $o_{i,s,t} \in \{0,1\}$, where $o_{i,s,t} = 1$ means that both ILSs s and t are allocated in RB i. It is, of course, $o_{i,s,t} = b_{i,s} \text{ AND } b_{i,t}$.

The LSC problem can be formulated as follows:

$$
\begin{aligned}
\min \quad & \sum_{x,y} \alpha_{x,y} \sum_{i=1}^{M} \sum_{(s,t) \in S_x \times S_y} o_{i,s,t} \\
\text{s.t.} \quad & o_{i,s,t} \le b_{i,s} && \forall i,\ \forall (s,t) \in S_x \times S_y && (i) \\
& o_{i,s,t} \le b_{i,t} && \forall i,\ \forall (s,t) \in S_x \times S_y && (ii) \\
& o_{i,s,t} \ge 1 - M\,(2 - b_{i,s} - b_{i,t}) && \forall i,\ \forall (s,t) \in S_x \times S_y && (iii) \\
& o_{i,s,t} = 0 && \forall i,\ \forall s,t \in T_n,\ s \neq t && (iv) \\
& \sum_{i=1}^{M} b_{i,s} \ge w_s && \forall s && (v) \\
& b_{i,s} \in \{0,1\} && \forall i, s && (vi) \\
& o_{i,s,t} \in \{0,1\} && \forall i, s, t && (vii)
\end{aligned}
\qquad (3)
$$

The objective function is to minimize the amount of overlapping RBs. Constraints (i-iii) are the linear version of the logical AND between variables $b_{i,s}$ and $b_{i,t}$. Constraint (iv) states that ILSs belonging to the same cluster cannot overlap, coherently with the SSC approach described in the previous section. Constraint (v) states that the sum of RBs allocated to an ILS s is no less than its size $w_s$. Note that equality will hold in (v) at the optimum in any case. Finally, constraints (vi-vii) show that variables are binary. This problem is a MILP with $O(M \cdot C^2 \cdot |T_n|^2) = O(M \cdot C^2 \cdot 2^{2K})$ variables (i.e., around 120000 for seven clusters of three cells and frames of 50 RBs), thus it is infeasible to solve it at fast time scales. Figure 11 shows the information flow for the LSC. The output of the L2M is a map that matches RBs to ILSs, which the L1M forwards to the cells in its cluster.

Heuristic solution for large-scale coordination

In order to solve the above problem fast enough, we use a divide-and-conquer approach, i.e., we split the LSC in several smaller problems. Our heuristic sorts clusters according to some order (e.g., starting from the innermost cluster in Figure 4 and going towards the outer ones), and adds one cluster at a time: at step $n \ge 2$, ILSs belonging to $T_n$ are arranged so as to minimize the overall mutual interference with clusters $T_j$, $1 \le j \le n-1$. Clearly, ILSs of the first cluster can be placed arbitrarily in the frame. Then, the following procedure is repeated for each of the remaining C-1 clusters. For each couple of ILSs s, t, we define $\beta_{s,t} = \sum_{x \in s,\, y \in t} \alpha_{x,y}$ as the interference that cells active in s produce on users served by cells active in t. Recall that we defined $\alpha_{x,y}$ as a static coefficient measuring the average interference that UEs under cell x suffer from y. Then, we solve the following optimization problem:

$$
\begin{aligned}
\min \quad & \sum_{i=1}^{M} \sum_{s \in T_n} b_{i,s} \sum_{t \text{ active in } i} \beta_{s,t} \\
\text{s.t.} \quad & \sum_{i=1}^{M} b_{i,s} = w_s && \forall s \in T_n && (i) \\
& \sum_{s \in T_n} b_{i,s} \le 1 && \forall i && (ii) \\
& b_{i,s} \in \{0,1\} && \forall i,\ \forall s \in T_n && (iii)
\end{aligned}
\qquad (4)
$$

The sum $\sum_{t \text{ active in } i} \beta_{s,t}$ in the objective is an estimate of the overall interference that cells active in s would produce if s included RB i, knowing which ILSs have been already allocated in that RB at earlier steps. Constraint (i) states that each ILS s consists of $w_s$ RBs, whereas (ii) states that two ILSs of the same cluster cannot overlap. The output of (4) is the frame allocation for cluster n, which will be taken into account for the allocation of cluster n+1. The heuristic requires solving C-1 instances of the above MILP. Note that the above problem is a Linear Assignment Problem, hence it can be solved in polynomial time using the Hungarian algorithm (via minor modifications). Therefore, $O(C \cdot M^3)$ is an upper bound on the complexity of this heuristic, again independent of the number of UEs or the load. Moreover, (4) can be solved at optimality using continuous relaxation, since its coefficient matrix is totally unimodular.
This means that (4) is in fact no harder than an LP. Before evaluating the performance of SSC and LSC (both jointly and in isolation), we discuss the related work.
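To make the per-cluster placement problem (4) concrete, the sketch below casts it as a rectangular assignment problem and solves it with a textbook Hungarian algorithm. It is an illustration, not the implementation used in the paper: each ILS is expanded into one row per RB it needs (one possible reading of the "minor modifications"), and the names `hungarian`, `place_cluster`, `widths`, `beta` and `occupied` are hypothetical.

```python
def hungarian(cost):
    """Min-cost assignment of n rows to m >= n columns (potentials method).
    Returns a list: row index -> assigned column index."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)
    p, way = [0] * (m + 1), [0] * (m + 1)   # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                             # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    rows = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            rows[p[j] - 1] = j - 1
    return rows


def place_cluster(widths, beta, occupied):
    """widths: ILS -> RBs needed; beta: (s, t) -> coefficient;
    occupied[i]: ILSs of earlier clusters already active in RB i."""
    slots = [s for s, w in widths.items() for _ in range(w)]
    assert len(slots) <= len(occupied), "the ILSs must fit in the frame"
    cost = [[sum(beta.get((s, t), 0.0) for t in occupied[i])
             for i in range(len(occupied))] for s in slots]
    placement = {s: [] for s in widths}
    for s, rb in zip(slots, hungarian(cost)):
        placement[s].append(rb)
    return placement


placement = place_cluster(
    widths={"s1": 2, "s2": 1},
    beta={("s1", "t1"): 5, ("s1", "t2"): 1, ("s2", "t1"): 1, ("s2", "t2"): 4},
    occupied=[["t1"], ["t1"], ["t2"], []])
print(placement)
```

In the toy frame, s1 avoids the RBs used by t1 (its strongest interferer) and s2 avoids the RB used by t2.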

Figure 12 PFR and SFR resource partitioning. Figure 13 Example scenario. Figure 14 Allocation using D-ICIC (top) and SSC (bottom).

5. RELATED WORK

The simplest forms of inter-cell interference mitigation are traditional frequency reuse schemes, e.g. a reuse of three. Although these techniques do reduce the interference, partitioning the overall bandwidth among cells impairs the overall throughput. Enhanced frequency reuse schemes, such as Partial Frequency Reuse (PFR) [12] and Soft Frequency Reuse (SFR) [13], have thus been introduced. The idea behind PFR is to partition the bandwidth so that only a limited amount of RBs can be used by all cells, while others are used with a higher reuse factor. Cell-edge UEs can take advantage of lower interference in these sub-bands. In the SFR scheme, a cell can allocate the entire subframe, but different power levels are employed in cell-center and cell-edge RBs. Two examples of PFR and SFR with reuse 3 are shown in Figure 12. Their drawback is that the partitioning is static, being part of the network planning phase, hence it does not take into account the dynamic UE and traffic distribution.

Semi-static Inter-Cell Interference Coordination (ICIC) schemes were then proposed, based on the above frequency reuse schemes. In [14], UEs are classified into four interference conditions according to the achievable spectral efficiency with different reuse patterns, which are similar to the four muting configurations used by our SSC. Bandwidth is partitioned according to the number of UEs in each configuration, without taking into account their requirements. The scheme proposed in [15] achieves a semi-static PFR via a coordination algorithm run by a central controller that takes into account average UE rates (on all RBs) and their minimum data rate requirements. The output is a network-wide PFR scheme that each cell can then enforce. Authors of [16] also propose an optimal partitioning of the resources based on user rate requirements. The problem with such schemes is that each RB is assigned a fixed reuse factor, which hampers large-scale coordination. Furthermore, such cell planning is ineffective when UEs are deployed non-uniformly, because it inherently assumes that the number of reused RBs is symmetric among the coordinated cells.

More flexible solutions have been proposed in [17] and [18]. In [17], the algorithm at the Radio Network Controller (RNC) gathers the achievable rates of all users on each RB with and without the highest ("dominant") interferer among neighboring cells. The RNC loops on each RB and selects which cell is allowed to transmit, based on the achievable gain in the overall system throughput. The RNC communicates its decisions to the cell, together with the recommended UE for each RB. If the recommended UEs have no traffic, the cell selects the backlogged UE that yields the maximum gain. This solution is opportunistic, hence unfair, at both the RNC and the cell level. Reference [18] employs the same RNC/cell framework as [17]. The RNC iteratively solves L MILPs, one per coordinated cell. At the l-th iteration, the algorithm updates the interference condition on each RB, based on the results of the previous iterations, and uses the results as coefficients for the objective function. This approach is similar to our LSC heuristic. However, it is worth noting that our LSC heuristic coordinates triples of cells, hence it solves fewer problems (roughly one third), and much easier ones besides. For example, consider a scenario with C = 7 triples of cells, M = 50 RBs and N = 50 UEs per cell. The algorithm in [18] solves 21 MILPs with $M \cdot N = 2500$ binary variables each, whereas our LSC heuristic solves six LPs with $M \cdot (2^K - 1) = 350$ variables, each of which is polynomial. Furthermore, solutions in [17], [18] require per-UE, per-RB feedback to be conveyed to the RNC, exactly because they lack pre-scheduling. This makes it impossible, in their very authors' opinion, to run RNC coordination at timescales comparable to the TTI, hence they adapt worse to variable traffic patterns.
For example, with bursty traffic sources the load may vary greatly from one TTI to the next. The coordination opportunities of such situations can hardly be exploited if algorithms are run at coarse time granularities.

Authors of [19] proposed a resource allocation scheme that splits the frame into a reuse zone and a resource isolation zone. All cells can transmit simultaneously in the former, whereas in the latter different RBs (or subchannels, since the paper is based on WiMAX) are allocated to different cells, with muting requirements. UEs are scheduled in either of the two zones according to the perceived interference from neighboring cells. A central controller establishes the partitioning of the resource isolation zone based on a conflict graph, which determines which cells cannot use the same RBs. However, the decision on the amount of RBs to allocate to a cell is only made on the number of UEs that fall in the isolation zone (the actual traffic is not considered) and is overly conservative (it is sufficient that only one UE of a given cell perceives high interference from a neighboring one in order to constrain the two cells to use mutually disjoint resources). Our algorithm, instead, allows a cell to request a number of RBs for each ILS depending on the channel quality and the buffer status of its UEs. Moreover, the output of the central controller is a long-term resource sharing plan (to be updated every hundreds of TTIs), thus it is less responsive than our scheme.

Work [8] is again a two-stage scheduling mechanism. Each UE can report its two dominant interferers among a set of non-serving eNBs. UEs report the CQI on each RB in three possible configurations (both interferers active, both inactive, dominant interferer inactive), similarly to what we do (we also account for non-dominant interferer inactive). Using a threshold mechanism, the cell decides the optimal muting pattern for each pair UE-RB, and runs the Hungarian algorithm [20] iteratively to pre-assign all the RBs to UEs and to create a wish list of muting of interferers. This is sent to the central controller, which arbitrates conflicting muting requests by solving one MILP per RB. Then, it sends back the resulting muting pattern for each cell to enforce it. This scheme can be regarded as a possible competitor for both our SSC and LSC. For simplicity, we refer to [8] as D-ICIC from now on.

As far as LSC is concerned, we observe that D-ICIC suffers from scalability problems: on one hand, a lot more information has to be conveyed to the central controller (per-RB pairs of {UE, muting} sent from each coordinated cell, as opposed to K pairs {muting, ILS size}), similar to what is done in [17], [18]. Moreover, the number of MILPs to be solved is large (a subframe consists of 50-100 RBs), and their dimension scales with the number of coordinated cells, to such an extent that it is impossible to run it at timescales comparable to the TTI. Last, D-ICIC solves independent MILPs for each RB (capitalizing on per-RB CQIs), while our algorithm solves the allocation problem considering all RBs in a subframe simultaneously. While the approach with per-RB CQIs is more fine-grained, it also requires another algorithm to select the one and only MCS to be used in the presence of differing CQIs (see Section 2). As a trivial example, if D-ICIC allocates RBs 1 and 2 to the same UE, with a CQI of 15 and 1 respectively, a decision is due on whether it will send traffic on: a) RB 1 only, transmitting with a CQI of 15; b) RB 2 only, with CQI 1; or c) RBs 1 and 2, this time with (possibly) CQI 1 to guarantee correct reception. While b) is obviously to be avoided, it is not immediately clear that option a) is preferable to c). Obtaining the optimal configuration becomes exponentially hard as the number of allocated RBs increases, and even finding good suboptimal configurations is non-trivial. None of the above works assuming per-RB CQIs seems to take this aspect into account.

As far as SSC is concerned, we observe that the D-ICIC scheme exhibits a pathological behavior. Consider the simple scenario in Figure 13 with two cells, A and B. Call a and b the two UEs served by A and B, respectively, and assume that b perceives high interference from A on all the spectrum, hence requests A to be muted on each and every RB. Now b, being cell-edge, has a smaller utility value (which is a function of its current and long-term rate in [8]) than a, which is instead cell-center. Since the central entity assigns each RB separately and based on the utility values, the result will be that A wins each per-RB contest, hence gets the whole frame, and B is muted on all RBs. This leads to underutilizing the resources, since a may not even have enough backlog to fill the frame, and B would still be prevented from using leftover RBs to address b. This would not happen with our SSC, which instead strives to compose conflicting requests from the cells, possibly by reducing them proportionally. A comparison between the two resulting allocations is shown in Figure 14, where $K_A$, $K_B$ are the RBs exploited by a and b respectively.

6. PERFORMANCE EVALUATION

In this section we evaluate the performance of our heuristics. First, we describe the simulation models, then we provide insight on both the SSC and LSC, and finally we compare our solution to other schemes reviewed in Section 5.

a. Simulation model

Our evaluation is carried out using SimuLTE [9]-[10], a system-level simulator, comprising more than 40k lines of object-oriented C++ code. SimuLTE is developed for the OMNeT++ simulation framework [11], which includes a considerable amount of network simulation models, including the INET framework [26], with all the TCP/IP stack, mobility, wireless technologies, etc. SimuLTE simulates the data plane of the LTE/LTE-A radio access network.
It allows simulation of LTE/LTE-A in Frequency Division Duplexing (FDD) mode, with heterogeneous eNBs (macro, micro, pico etc.), using omnidirectional and/or anisotropic antennas, possibly communicating via the X2 interface [27]. The SimuLTE protocol stack includes:
- A Packet Data Convergence Protocol - Radio Resource Control (PDCP-RRC) module, which performs encapsulation and decapsulation and Robust Header Compression (ROHC).

- A Radio Link Control (RLC) module, which performs fragmentation and reassembly and implements the three RLC modes, namely Transparent Mode (TM), Unacknowledged Mode (UM) and Acknowledged Mode (AM).
- A MAC module, where most of the intelligence of each node resides. Its main tasks are encapsulation of MAC SDUs into a TB and vice-versa, channel-feedback management, H-ARQ, adaptive modulation and coding (AMC).
- A Physical-Layer (PHY) module, which implements channel feedback computation and reporting, data transmission and reception, air channel simulation and control messages handling. It also stores the physical parameters of the node, such as transmission power and antenna profile, which allows macro-, micro-, pico-eNBs to be instantiated, with different radiation profiles.
- eNB scheduling in the downlink and uplink directions.

Only downlink traffic is simulated. The simulation scenario is depicted in Figure 15. We assume that the traffic is generated by a server and forwarded by a router to the serving cell of the receiver. The X2 interface is considered to be ideal (null latency and infinite bandwidth). We consider seven sites. Each site consists of three cells radiating toward the center of neighboring hexagons. Each hexagon has three sites located on three vertices. The distance between different sites is 500 m. We assume 10 MHz bandwidth, resulting in 50 RBs per frame. Path loss, shadowing and fading models are taken from [21]. Cells' radiation patterns are anisotropic and attenuation is $A(\theta) = \min\{12\,(\theta/70)^2,\ 25\}$, where $\theta$ is the relative angle between the cell and the receiver. Transmission power is the same over the whole bandwidth. UEs are static and randomly deployed, but equally distributed among cells. System parameters are summarized in Table 1. Statistics are gathered only in the central cluster.

In order to evaluate the power consumption of the system, we use the model in [30], which assumes that the consumed power of an active eNB is an affine function of the number of transmitted RBs, i.e. $P = P_{base} + \rho \cdot n$, where $P_{base}$ is the baseline power, n is the number of allocated RBs and $\rho$ is the power per RB. The power model parameters are listed in Table 2.
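The two closed-form models of this subsection are easy to evaluate numerically; the sketch below is only a convenience illustration. It assumes that the angle is expressed in degrees and the attenuation in dB, and that the two values of Table 2 (260 W and 3.76 W/RB) are the baseline power and the power per RB, respectively. Function and parameter names are hypothetical.

```python
def antenna_attenuation_db(theta_deg, theta_ref_deg=70.0, max_att_db=25.0):
    """A(theta) = min{12 * (theta / 70)^2, 25}; theta assumed in degrees."""
    return min(12.0 * (theta_deg / theta_ref_deg) ** 2, max_att_db)


def enb_power_w(n_rbs, p_base_w=260.0, rho_w_per_rb=3.76):
    """Affine power model P = P_base + rho * n for an active eNB."""
    return p_base_w + rho_w_per_rb * n_rbs


for angle in (0, 35, 70, 180):
    print(angle, antenna_attenuation_db(angle))
for n in (0, 12, 50):
    print(n, round(enb_power_w(n), 2))
```

With these parameters the baseline term dominates: even a fully allocated 50-RB subframe adds less than 200 W to the 260 W consumed by an active eNB.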

We simulate both application-specific and synthetic traffic. More specifically, Voice over IP (VoIP) and Video on Demand (VoD) applications are simulated. VoIP is modeled according to [23]. The employed codec is the GSM AMR Narrow Band (12.2 kbit/s) with Voice Activity Detection (VAD), i.e. no packets are sent during silences. The durations of talkspurts and silence periods are distributed according to Weibull functions, coherently with a one-to-one conversation model. Header compression is employed. The set of parameters is summarized in Table 3. VoD traffic is taken from a pre-encoded H.263 trace file ([22]) whose parameters are summarized in Table 4. As far as synthetic traffic is concerned, we use Constant-bit-rate (CBR) sources generating 100-byte packets every 10 ms. The latter are used to reach saturation with a smaller number of UEs.

Figure 15 Simulation scenario.

Table 1 - System parameters.
Parameter: Value
Cellular layout: Hexagonal grid
Inter-site distance: 500 m
Carrier frequency: 2 GHz
Bandwidth: 10 MHz
Number of RBs: 50
Path loss model: Urban Macro
Fading model: Jakes (6 tap channels)
eNB Tx Power: 46 dBm

Table 2 - eNB power model.
Baseline power ($P_{base}$): 260 W
Power per RB ($\rho$): 3.76 W/RB

Table 3 - VoIP model parameters.
Talkspurt duration (Weibull distribution): Scale 1.423, Shape 0.824
Silence duration (Weibull distribution): Scale 0.899, Shape 1.089
Codec Type: GSM AMR Narrow Band (12.2 kbps) w. VAD
VAD Model: One-to-one conversation
Header Compression: Active (RTP+UDP+IP headers = 6 bytes)
Packet length: 32 bytes/frame + 6 bytes Hdr + 1 byte RLC

Table 4 - VoD trace statistics.
Min frame size: 27 bytes
Max frame size: 6806 bytes
Mean frame size: 560.703 bytes
Mean bit rate: 15.598 kbps
Peak bit rate: 136.133 kbps
Frames per second: 25

b. Results

First, we present performance results for our scheme: we show how to tune the thresholds of the SSC heuristic, we demonstrate the added value of employing large-scale coordination, we highlight the differences between intra-site and inter-site clustering, and we discuss the time cost and the optimality of our approach. Then, we compare our scheme to some of those reviewed in Section 5.
Small-scale coordination

We assume intra-site clustering. In this scenario, every cluster runs SSC independently, and LSC is not employed. In Figure 16, we report the average MAC-layer cell throughput achieved in the cells of the central cluster in different load conditions, using several values for the thresholds, with CBR traffic. The figure shows that the system behaves similarly for a relatively wide range of thresholds. This is because the optimization step at the L1M increases the number of RBs in the protected ILSs as much as possible, thus mitigating the effects of possible suboptimal choices of threshold values and making the algorithm more robust. Obviously, there is a limit to what the L1M can do to counterbalance misconfigurations: at high loads, if the thresholds are so high that most UEs end up in the no-muting ILS, which in turn fills most of the subframe, there is no room for improving the situation. In the following, we use values 5 and 2 for the double- and single-muting thresholds, respectively.

Figure 16 Average cell MAC throughput with several thresholds

Large-scale coordination

LSC is independent of SSC and may run at different time scales. In Figure 17 we show the cumulative distribution functions (CDFs) of the UE MAC-level throughput with 50 and 75 UEs per cell, with VoIP traffic. Several periods have been tested, expressed as multiples of TTIs in the captions. Our results show that the benefits of a fast-paced LSC show up at higher loads, in terms of cell-edge throughput, identified by the 5th percentile of the CDF. Protecting cell-edge UEs is in fact a key operator requirement. Recall that the LSC takes the subframe partitioning generated by the SSC as an input. Thus, when the period is a multiple of the TTI, in between two iterations of the LSC algorithm it may happen that the subframe generated by one L1M does not fit well in the arrangement provided by the L2M earlier on. Thus, it is likely that some UEs are scheduled in inadequate RBs. Clearly, this is not the case when LSC is run on each TTI. On the other hand, protection of cell-edge UEs is not achieved at the expense of a reduction in cell throughput, as testified by both graphs. We use a period of one for the LSC from now on.

SSC vs. LSC

We now show that using the second layer of coordination on top of the first one brings additional benefits.
We report the CDFs of the UE MAC-level throughput in Figure 18. VoIP traffic is used. The green line refers to the case with both SSC and LSC enabled, whereas the red line represents the use of SSC without LSC. In the latter case, clusters run independent, uncoordinated instances of SSC. The brown line represents the case with no coordination at all. The blue line reports an ideal baseline, obtained by sending the same VoIP traffic on a wired Gb-speed link. The CDF obtained with LSC practically overlaps the latter, leaving the others progressively behind as the load increases. Figure 19 reports the average MAC-layer cell throughput in the same scenario with a varying number of UEs.

LSC also impacts the user QoS. Figure 20 shows the CDF of the Mean Opinion Score (MOS) of VoIP flows in the same scenarios. The MOS measures the quality experienced by human users taking into account mouth-to-ear delays and losses (herein including those at the playout buffer) [23], and ranges from one (unintelligible) to five (perfect). A MOS above three is considered satisfactory. We observe that LSC achieves both a higher average MOS, even at low loads, and a smaller variation (which implies higher inter-user fairness) than the other two. At high loads uncoordinated resource allocation leaves about 20% of the UEs with a MOS of 1, i.e. an unintelligible conversation.

Finally, the benefits of LSC also show up when energy efficiency is considered. Figure 21 reports the average number of RBs allocated by each cell in the two cases and the resulting consumed power, showing that LSC achieves a higher throughput with roughly one third of the RBs with respect to the SSC case. Looking at the average number of allocated RBs, one may think that the network is underloaded. For instance, in the scenario with 75 UEs per cell with SSC only (second bar from the right in Figure 21, left), a cell uses an average of 12 RBs in a subframe of 50. However, muting has to be taken into account: in the same scenario, the number of requested RBs per cell is 22.7, and their muting requests are such that on average 40.5 RBs are requested in total (i.e., the subframe is roughly 80% full).

Intra-site vs. Inter-site SSC

Clustering is used as a basis for the SSC. In this subsection, we show how the choice of a cluster affects the performance. Figure 22 reports the CDF of the MAC-level UE throughput with 75 UEs per cell. Using intra- or inter-site clustering does not affect fairness among UEs, for both SSC and LSC, as their CDFs are almost overlapped. Figure 23 shows the number of allocated RBs and the corresponding power consumption. Inter-site clustering allocates fewer RBs when only SSC is employed, as UEs are more protected from their major interferers. Instead, the number of allocated RBs is essentially the same when the LSC is also run. In a traditional RAN deployment, SSC can be performed with short latency if intra-site clustering is adopted, since the clustered cells are at the same site. Inter-site clustering, instead, requires cells to be connected through additional wiring, thus it may require higher CAPEX and latency. In a C-RAN deployment, all cells are connected to a central processing unit, which performs resource allocation. In this case, the two clustering schemes are equivalent from the CAPEX and latency viewpoints.

Time cost and optimality

We now investigate the time cost of our heuristics and the optimality ratio of the SSC. We have run the SSC and LSC heuristics on a (rather low-end) PC with 4 Intel Core i7 CPUs at 2.80 GHz, 8 GB of memory and Ubuntu 14.04 OS, and the results are shown in Figure 24: the average running time of the SSC is on the order of a few tens of microseconds, even at higher loads. On the other hand, the solving time of the whole LSC heuristic ranges between 1.2 ms and 1.6 ms using CPLEX, and between 0.9 ms and 1.2 ms using the Hungarian algorithm. Although the latter times are actually above the TTI threshold, these figures confirm that solving the LSC heuristic at TTI timescales is within reach of today's technology, e.g. by employing more powerful hardware.

As for SSC optimality, Figure 25 shows a scatterplot of the optimality ratio against the whole cluster backlog in some snapshots of a simulation run with 20 UEs per cell. A fully-fledged, per-TTI comparison is made impossible by the fact that the time to solve the SSC optimization problem at optimality is on the order of minutes per TTI. The optimization problem is formulated as a MILP and solved with an optimality gap of 5%, hence optimality ratios are rescaled by 0.95 to play on the safe side. Figure 25, left, shows an average optimality ratio larger than 0.72, when no LSC is run. At lower loads, i.e., when optimality is probably less of a concern, more variability can be observed. Figure 25, right, shows the same result when the LSC is enabled as well. In this case, even though the offered load is the same, the average backlog is considerably smaller, thanks to the reduced inter-cluster interference.
Moreover, the optimality ratio becomes considerably higher (each marker in the scatterplot encompasses a large number of overlapping points). We are unfortunately unable to provide figures for the LSC heuristic, since CPLEX refuses to solve optimization problem (3) when C = 7 and M = 50. When scaling down to a smaller dimension (i.e., C = 3), 40 minutes of CPLEX computation are not enough to solve problem (3) at optimality (gaps are still high, i.e., around 20%-30%). However, the best results found thus far by CPLEX practically overlap those of our heuristic.

Comparison with other schemes

We compare our scheme against three reference ones, namely the non-coordinated case, the PFR scheme and the D-ICIC algorithm presented in [8]. We use the MaxC/I scheduler. In the non-coordinated scheme, each cell performs its own scheduling independently and can exploit the whole frame. This scheme maximizes the utilization of the frequency resources by each cell. However, it is severely affected by inter-cell interference, since neighboring cells can transmit on the same RBs.

We use the following static partitioning of the bandwidth for PFR: the first 20 RBs are shared among all cells, whereas the remaining RBs are employed with a reuse-3 pattern, i.e. 10 RBs per cell. We call these two partitions cell-center subband and cell-edge subband, respectively. A UE i will be scheduled in the cell-center or in the cell-edge subband according to the power received from its serving cell. If $P_i^R \ge P_{th}$, then i is a cell-center UE, otherwise it is a cell-edge UE. We set $P_{th}$ equal to -40 and -50 dBm. We also assume that the cell scheduler possesses two CQIs for each UE, one for the cell-center subband and one for the cell-edge subband, and schedules the UE in either subband using the correct CQI.

In D-ICIC, cells periodically do a pre-assignment phase using the Hungarian algorithm and send muting requests to a central controller. The latter replies to each cell indicating which RBs can be used for transmission during the next period and which cannot. Since [8] does not specify how the actual scheduling is carried out, i.e., what cells actually do once the central controller has terminated its job, we use an algorithm that first schedules UEs in the RBs they were assigned during the pre-assignment phase. If there are still backlogged UEs, we schedule them in leftover RBs using a MaxC/I scheduler. The central-level algorithm is run every 10 ms.

Figures 26-27 show the CDFs of the frame delay and loss ratio of VoD traffic with a varying number of UEs. Our scheme achieves lower delay and frame losses than the others.
This is because our scheme, by improving coordination, allows higher CQIs, hence fewer RBs for the same transmission, and makes interference more predictable. This is important with bursty traffic, such as VoD. In fact, a low-CQI UE may end up transmitting a (portion of a) large video frame in a single, large TB. This, coupled with unpredictable interference, considerably increases the error probability, to a point where four H-ARQ retransmissions are not enough to decode the PDU at the destination. This effect was already observed in [29], in different conditions. The behavior of the PFR scheme depends heavily on the threshold value $P_{th}$. With -40 dBm, more UEs will fall into the cell-edge subband than with -50 dBm. On one hand, this makes UEs more protected from interference. On the other hand, this would overload the cell-edge subband faster, since it is only 10 RBs wide. Overloading the cell-edge subband increases transmission delays, as shown in Figure 27.

Our results suggest that D-ICIC causes unfairness: in fact, its delay and frame loss ratio are comparable to those obtained with PFR for about 60% of UEs, but are much higher for the others. At low loads, e.g. with 10 UEs per cell, D-ICIC is worse than no coordination at all, since it unnecessarily prevents cells from exploiting the whole bandwidth (see the example of Figure 14), and the pre-assignment allocates the same number of RBs to each UE, regardless of their demand, which is harmful with highly variable traffic. On the contrary, the pre-allocation phase of our SSC algorithm reserves RBs according to the users' demand and CQIs. Requests may be reduced by the L1M only if they cannot be accommodated, and the muting pattern of the RBs is upgraded by the L1M as long as there is space to do so. Figure 28 shows the average number of allocated RBs and the average consumed power, in the scenario with 10 UEs per cell. The latter shows that, when LSC is used, fewer RBs are allocated than with the other schemes, which results in a more energy-efficient allocation.

7. CONCLUSIONS

This paper presented a resource allocation framework for dynamic CS-CoMP in LTE-Advanced networks. Coordinated scheduling addresses the problem of selecting which cells transmit in which RBs so as to mitigate the interference suffered by UEs. We showed that in general this problem cannot scale to large dimensions, in terms of number of coordinated cells, due to the amount of UE channel reporting required and the complexity involved in manipulating it. We have then proposed a layered approach, which splits the problem into small-scale and large-scale coordination. Small-scale coordination (SSC) arbitrates a small cluster of three cells, by partitioning the frame in interference logical subbands (ILSs). Each ILS defines the subset of cells that can transmit in the same RBs.
SSC has been used as a basis for large-scale coordination (LSC), which was accomplished by defining the position of the ILSs in the frame, so as to minimize the interference among neighboring clusters. We modeled both SSC and LSC as optimization problems, and showed them to be too complex to be solved to optimality. Thus, we designed fast heuristics that can be run at TTI timescale. System-level simulations showed that our scheme achieves significant benefits in terms of throughput, QoS and fairness among UEs, and outperforms static and dynamic schemes proposed in the literature. Moreover, it keeps the number of allocated RBs low, thus increasing the energy efficiency.
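As an illustration of the ILS concept summarized above, the following Python sketch enumerates the candidate ILSs of a three-cell cluster (here simply every non-empty subset of cells) and lays them out contiguously in a frame. It is not the SSC or LSC heuristic of this paper: the cell names, the 50-RB frame, the ILS sizes, and the helper names `enumerate_ils`, `place_ils` and `cells_allowed` are all hypothetical and chosen only for the example.

```python
from itertools import combinations


def enumerate_ils(cells):
    """Return every non-empty subset of `cells`, largest subsets first.

    Assumption: each subset is treated as one candidate ILS, i.e. the set
    of cells allowed to transmit on the RBs of that subband.
    """
    subsets = []
    for size in range(len(cells), 0, -1):
        subsets.extend(frozenset(group) for group in combinations(cells, size))
    return subsets


def place_ils(ils_sizes, total_rbs):
    """Lay ILSs out back to back; return {ils: (first_rb, end_rb_exclusive)}."""
    if sum(ils_sizes.values()) > total_rbs:
        raise ValueError("ILS sizes exceed the frame")
    layout = {}
    start = 0
    for ils, size in ils_sizes.items():
        if size > 0:  # empty ILSs take no room in the frame
            layout[ils] = (start, start + size)
            start += size
    return layout


def cells_allowed(layout, rb):
    """Return the set of cells allowed to transmit on resource block `rb`."""
    for ils, (first, end) in layout.items():
        if first <= rb < end:
            return ils
    return frozenset()  # RB not covered by any ILS: every cell is muted


# Hypothetical three-cell cluster and 50-RB frame.
cluster = ("A", "B", "C")
ils_list = enumerate_ils(cluster)

# Hypothetical sizes: 20 RBs shared by all cells, 10 RBs exclusive to each.
sizes = {
    frozenset({"A", "B", "C"}): 20,
    frozenset({"A"}): 10,
    frozenset({"B"}): 10,
    frozenset({"C"}): 10,
}
layout = place_ils(sizes, total_rbs=50)

for ils, (first, end) in layout.items():
    print(sorted(ils), "-> RBs", first, "to", end - 1)
```

In this toy layout, RBs 0 to 19 may be used by all three cells at once, while each of the remaining 10-RB subbands is reserved to a single cell with the other two muted. Changing the order of the entries in `sizes` changes the position of each ILS in the frame, which is the degree of freedom that large-scale coordination acts upon.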

8. ACKNOWLEDGEMENTS

The subject matter of this paper includes description of results of a joint research project carried out by Telecom Italia and the University of Pisa. Telecom Italia reserves all proprietary rights in any process, procedure, algorithm, article of manufacture, or other result of said project herein described. Authors would like to thank Prof. Antonio Frangioni and Dr. Laura Galli of the University of Pisa for their useful suggestions.

REFERENCES

[1] 3GPP - TS 36.300, Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN); Overall description; Stage 2.
[2] 3GPP - TR 36.819 v11.2.0, Coordinated multi-point operation for LTE physical layer aspects (Release 11), Sept. 2013.
[3] Lee, D., Seo, H., Clerckx, B., Hardouin, E., Mazzarese, D., Nagata, S., Sayana, K., Coordinated Multipoint Transmission and Reception in LTE-Advanced: Deployment Scenarios and Operational Challenges, IEEE Communications Magazine, February 2012, pp. 148-155.
[4] C-RAN The Road Towards Green RAN, v. 2.5 white paper, China Mobile Research Institute, Oct. 2011.
[5] 3GPP - TS 36.211 v12.4.0, Evolved Universal Terrestrial Radio Access (E-UTRA); Physical channels and modulation, Dec. 2014.
[6] 3GPP - TS 36.213 v12.4.0, Evolved Universal Terrestrial Radio Access (E-UTRA); Physical layer procedures, Dec. 2014.
[7] Pateromichelakis, E., Shariat, M., ul Quddus, A., Tafazolli, R., (2012) On the Evolution of Multi-Cell Scheduling in 3GPP LTE / LTE-A, IEEE Communications Surveys and Tutorials, 2012.
[8] Rahman, M., and Yanikomeroglu, H., (2010) Enhancing cell-edge performance: a downlink dynamic interference avoidance scheme with inter-cell coordination, IEEE Transactions on Wireless Communications, vol. 9, pp. 1414-1425, April 2010.
[9] Virdis, A., Stea, G., Nardini, G., (2014) SimuLTE: A Modular System-level Simulator for LTE/LTE-A Networks based on OMNeT++, Proc. of SimulTech 2014, Vienna, AT, August 28-30, 2014.
[10] SimuLTE webpage. http://www.simulte.com
[11] OMNeT++, http://www.omnetpp.org
[12] Sternad, M., Ottosson, T., Ahlen, A., Svensson, A., (2003) Attaining both coverage and high spectral efficiency with adaptive OFDM downlinks, Proc. of VTC 2003-Fall, pp. 2486-2490, 6-9 Oct. 2003.
[13] 3GPP, Soft Frequency Reuse Scheme for UTRAN LTE, 3rd Generation Partnership Project (3GPP), R1-050507, May 2005.
[14] Fang, L., Zhang, X., (2008) Optimal Fractional Frequency Reuse in OFDMA Based Wireless Networks, Proc. WiCOM '08, pp. 1-4, 12-14 Oct. 2008.
[15] Ali, S.H., Leung, V. C. M., (2009) Dynamic frequency allocation in fractional frequency reused OFDMA networks, IEEE Transactions on Wireless Communications, vol. 8, no. 8, pp. 4286-4295, August 2009.
[16] Hoon, K., Youngnam, H., Jayong, K., (2004) Optimal subchannel allocation scheme in multicell OFDMA systems, Proc. of VTC Spring '04, pp. 1821-1825, vol. 3, 17-19 May 2004.

[17] Li, G., Liu, H., (2006) Downlink Radio Resource Allocation for Multi-Cell OFDMA System, IEEE Transactions on Wireless Communications, vol. 5, no. 12, pp. 3451-3459, December 2006.
[18] Koutsimanis, C., Fodor, G., (2008) A Dynamic Resource Allocation Scheme for Guaranteed Bit Rate Services in OFDMA Networks, Proc. ICC '08, pp. 2524-2530, 19-23 May 2008.
[19] Arslan, M. Y., Yoon, J., Sundaresan, K., Krishnamurthy, S. V., and Banerjee, S., (2013) A Resource Management System for Interference Mitigation in Enterprise OFDMA Femtocells, IEEE/ACM Transactions on Networking, vol. 21, no. 5, pp. 1447-1460, Oct. 2013.
[20] Kuhn, H.W., (1955) The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, vol. 2, pp. 83-97, 1955.
[21] 3GPP TR 36.814 v9.0.0, Further advancements for E-UTRA physical layer aspects (Release 9), March 2010.
[22] http://www-tkn.ee.tu-berlin.de/research/trace/pics/frametrace/mp4/index6e27.html
[23] Bacioccola, A., Cicconetti, C., Stea, G., (2007) User level performance evaluation of VoIP using ns-2, Proc. NSTOOLS'07, Nantes, France, Oct. 22, 2007.
[24] Nardini, G., Stea, G., Virdis, A., Caretti, M., Sabella, D., (2014) Improving network performance via optimization-based centralized coordination of LTE-A cells, Proc. of CLEEN 2014, Istanbul, TK, April 6, 2014.
[25] Nardini, G., Stea, G., Virdis, A., Sabella, D., Caretti, M., (2014) Effective dynamic coordinated scheduling in LTE-Advanced networks, Proc. of EuCNC 2014, Bologna, Italy, June 23-26, 2014.
[26] INET framework for OMNeT++: http://inet.omnetpp.org/ [Accessed July 2014]
[27] 3GPP TR 36.420 v11.0.0, X2 general aspects and principles (Release 11), September 2012.
[28] Accongiagioco, G., Andreozzi, M. M., Migliorini, D., Stea, G., (2013) Throughput-optimal Resource Allocation in LTE-Advanced with Distributed Antennas, Computer Networks, 57 (2013), pp. 3997-4009, Dec. 2013.
[29] Stea, G., Virdis, A., (2014) A comprehensive simulation analysis of LTE Discontinuous Reception (DRX), Computer Networks, 73 (2014), pp. 22-40, DOI 10.1016/j.comnet.2014.07.014, November 2014.
[30] EARTH EU project website, https://www.ict-earth.eu/
[31] Dahlman, E., Parkvall, S., Skold, J., 4G LTE/LTE-Advanced for Mobile Broadband, 2nd ed., 2014.

Figure 17 Comparison of LSC timescales: CDF of user MAC throughput, 50 UEs per cell (left), 75 UEs per cell (right).
Figure 18 Comparison of coordination schemes: CDF of user MAC throughput, 50 UEs per cell (left), 75 UEs per cell (right).
Figure 19 Comparison of coordination schemes: cell throughput as a function of the number of users.
Figure 20 Comparison of coordination schemes: MOS of VoIP flows, 50 UEs per cell (left), 75 UEs per cell (right).

Figure 21 Comparison of coordination schemes: number of allocated RBs per cell (left) and average consumed power (right). [Left panel: average allocated RBs per cell vs. number of UEs per cell (50, 75); series: SSC+LSC, SSC, No Coordination.]
Figure 22 CDF of user MAC throughput, 75 UEs per cell, intra-site vs. inter-site.
Figure 23 Intra-site vs. inter-site, 75 UEs per cell: allocated RBs per cell (left) and average consumed power (right). [Left panel: average allocated RBs per cell for SSC and SSC+LSC; series: Intra-site, Inter-site.]
Figure 24 Running time of the heuristics: SSC (left), LSC (right).