Resource Allocation for Multicell Device-to-Device Communications in Cellular Network: A Game Theoretic Approach

Device-to-Device (D2D) communication has recently emerged as a promising technology to improve the capacity and coverage of cellular systems. To successfully implement D2D communications underlaying a cellular network, resource allocation for D2D links plays a critical role. While most prior resource allocation mechanisms for D2D communications have focused on interference within a single-cell system, this paper investigates the resource allocation problem for a multicell cellular network in which a D2D link reuses available spectrum resources of multiple cells. A repeated game-theoretic approach is proposed to address the problem. In this game, the base stations (BSs) act as players that compete for the resource supply of D2D, and the utility of each player is formulated as the revenue collected from both cellular and D2D users using the resources. Extensive simulations are conducted to verify the proposed approach, and the results show that it can considerably enhance the system performance in terms of sum rate and sum rate gain.


INTRODUCTION
The proliferation of mobile multimedia services has greatly increased the demand for higher wireless data rates delivered over cellular networks. Device-to-Device (D2D) communication over cellular networks has emerged as a disruptive technology that can significantly improve the performance of cellular systems [2]. D2D enables devices located in close proximity to communicate with each other directly, thus reducing system overhead, increasing spectrum utilization, and improving cellular coverage [10]. With these prominent gains, it has recently attracted considerable attention from both academia and industry.
D2D communication reusing the cellular spectrum faces many technical challenges owing to the coexistence of D2D and cellular transmissions, which can mutually interfere. Although there has been considerable research on interference management in D2D, most existing works focus on single-cell scenarios [3]-[12]. Intercell interference in D2D, which needs to be coordinated among multiple cells and among cellular user equipments (CUEs) and D2D user equipments (DUEs), remains a key D2D challenge that has been less investigated. For instance, when a D2D link uses downlink cellular resources, a D2D transmitter may cause strong interference to a cellular UE in the neighboring cell that is receiving downlink traffic on the same resource. Likewise, when a D2D pair utilizes uplink resources, the D2D receiver may suffer high interference from a cellular UE in the neighboring cell transmitting uplink traffic to its serving base station (BS).
While much research has been devoted to mitigating interference between cellular and D2D users, resource allocation plays a critical role in efficient interference coordination. In the existing D2D resource allocation solutions, game theory is widely applied to characterize the interactions among D2D devices. Typically, D2D pairs are modeled as players competing for the resources, where the utility of each player is defined as a function of the achievable data rate and the generated interference. Based on the utility function, the equilibrium, i.e., the optimal resource allocation, is obtained by analyzing the players' best reply functions.
For interference management, such existing approaches can be extended to the multicell case by simply incorporating interference from other cells. However, they fail to determine the resource configurations when a D2D link exploits the commonly shared spectrum of multiple cells. For example, when two user equipments (UEs), say UE 1 and UE 2, are close by and are reusing the common resources of multiple cells, they must be able to establish a direct link. In this case, the D2D pair can generate interference to other cells; hence, a simple extension of previous game models is not able to accommodate this dedicated multicell scenario.
In this paper, we investigate the resource allocation problem for multicell D2D communications. A repeated game-theoretic resource allocation approach is proposed. To the best of our knowledge, this work is an early attempt at addressing the resource allocation for a D2D link utilizing common spectrum resources of multiple cells. The major contributions made in this paper are summarized as follows.
• We consider the scenario in which a D2D link reuses common uplink resources of multiple cells. In this scenario, we define the sum rates of BSs and formulate the resource allocation problem.
• We develop a finite repeated game based on the Nash Equilibrium/Equilibria (NE) analysis of a static version. Distinct from existing works that model D2D pairs as players, the proposed game characterizes the Base Stations (BSs) as players competing for the resource supply of D2D, and defines the utility of each player as the revenue collected from both cellular and D2D users using the resources. We also propose a resource allocation algorithm.
• We examine the impacting factors and system performance through extensive numerical experiments. The results verify that the approach developed in this paper considerably improves the system performance in terms of sum rate and sum rate gain. It further provides a systematic insight into the resource allocation for multicell D2D communications.
The remainder of this paper is organized as follows. Section 2 briefly reviews the related work. Section 3 presents the system model. In Section 4, we develop a repeated game and analyze the equilibrium to address the resource allocation problem; a resource allocation algorithm is also presented in this section. Section 5 presents the simulation results to validate the proposed model. We conclude the paper in Section 6.

RELATED WORK
Resource allocation for D2D communications is a critical issue that deserves thorough consideration in order to address the interference efficiently. Various works have been proposed. Most of them formulate the problem from an optimization perspective [1,11,6], and they share the same objective of mitigating the interference while improving the QoS performance of the system. However, these studies cannot provide direct insight into the interactions within the system.
Recently, leveraging game theory to allocate D2D resources has become an active research topic, since game theory can provide a thorough understanding of the complex interactions among independent rational players. In general, the game models for D2D resource allocation can be classified into two groups: non-cooperative and cooperative.
In the non-cooperative games, D2D UEs are commonly viewed as players competing for resources. For example, [9], [13], [12], and [8] address the D2D resource allocation using auction games. In particular, [9] takes energy efficiency into account in the optimization objective. [13] presents a sequential second price auction in which all the spectrum resources are considered as a set of resource units sequentially auctioned off by groups of D2D pairs. A non-monotonic descending price auction algorithm was presented in [12]. The utility function in this game factored in the channel gain from D2D and the costs for the system. In [15], a distributed energy-efficient resource allocation algorithm was proposed by exploiting the properties of nonlinear fractional programming. In [7], the authors developed a Stackelberg game in which a cellular UE and a D2D UE form a leader-follower pair. A joint scheduling and resource allocation scheme was then proposed to improve the performance of D2D communication.
The cooperative game has also been explored for D2D resource allocation. In [14], a coalitional game with transferable utility is developed. In this game, each D2D user attempts to maximize its own utility and has the incentive to cooperate with other users to form a strengthened user group. In this way, the user can increase the probability of winning its preferred spectrum resources.
In this paper, we study the resource allocation problem for multicell D2D communications using non-cooperative game theory. The novelty of our work compared with the aforementioned literature is twofold. First, most prior works investigated resource allocation in a single cell. Although we have investigated resource allocation for a D2D link spanning two neighboring cells in our recent work [3], the case of a D2D pair reusing common resources of multiple cells has not been addressed yet. In this work, we extend [3] and design a finite repeated game to solve this problem. Second, unlike existing approaches that typically characterize the D2D users as players, we model the BSs as the players.

SYSTEM MODEL AND ASSUMPTIONS
We consider the scenario in which a D2D pair reuses the common resources of $m$ cells ($m \ge 2$). In each cell, cellular users communicate with the base station utilizing the same resources among all cells. We assume that the D2D pair reuses the uplink resources with cellular users while cellular communications utilize either uplink or downlink resources; thus they are able to work with coordination from the BSs.
We define the sum rate of BSs for the above scenario. Let $G_{ij}$ be the channel power gain between transmitter $i$ and receiver $j$ over either the cellular link or the D2D link, and let $N_0$ be the power of the additive white Gaussian noise (AWGN) at the receiver. At the $i$-th BS, the sum rate in the uplink direction is given by equation (1); if the downlink resources are used, it is given by equation (2). In these expressions, $B_i$ is the bandwidth of the $i$-th BS, $G_{C_i BS_i}$ is the gain between cellular UE $C_i$ and $BS_i$, $G_{BS_i C_i}$ is the gain between $BS_i$ and CUE $C_i$, $G_{d BS_i}$ is the gain between the D2D transmitter and $BS_i$, and $P_{BS_i}$, $P_{C_i}$, and $P_d$ are the transmit powers of $BS_i$, CUE $C_i$, and the DUE, respectively.
The sum rates determine the payoffs that the BSs obtain from cellular communications. Note that the interference caused by the D2D link has been taken into account in the sum rate, which is a key parameter affecting the total revenue of each player. Essentially, the more resources, for example bandwidth (or physical resource blocks in LTE-A), are allocated to the D2D link, the lower the interference it generates. Hence, resources should be carefully allocated for D2D. In the following, we apply game theory to determine the amount of resources that should be allocated to the D2D transmitter by each BS. We assume that the channel state information (CSI) of all involved links in the cell is available to the BS so that the BSs are capable of coordinating the radio resources.
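As an illustration of the uplink sum rate defined in this section, the following minimal sketch assumes a Shannon-type rate in which the D2D interference is treated as additional noise at the BS. The functional form, the function name `uplink_sum_rate`, and all numeric values are assumptions made for illustration; they are not taken from equations (1) and (2).

```python
import math


def uplink_sum_rate(bandwidth, p_cue, g_cue_bs, p_d2d, g_d2d_bs, noise):
    """Assumed Shannon-type uplink rate at one BS (illustrative form only).

    bandwidth : B_i, bandwidth of the i-th BS
    p_cue     : P_{C_i}, transmit power of the cellular UE
    g_cue_bs  : G_{C_i BS_i}, gain from the cellular UE to the BS
    p_d2d     : P_d, transmit power of the D2D transmitter
    g_d2d_bs  : G_{d BS_i}, gain from the D2D transmitter to the BS
    noise     : N_0, AWGN power at the receiver
    """
    # D2D interference is added to the noise floor (assumption).
    sinr = (p_cue * g_cue_bs) / (noise + p_d2d * g_d2d_bs)
    return bandwidth * math.log2(1.0 + sinr)


# Hypothetical values: the same cell with and without an interfering D2D link.
rate_without_d2d = uplink_sum_rate(10.0, 0.2, 1e-3, 0.0, 1e-4, 1e-6)
rate_with_d2d = uplink_sum_rate(10.0, 0.2, 1e-3, 0.1, 1e-4, 1e-6)
```

Under this assumed form, any positive D2D transmit power lowers the cellular rate, which is why the D2D power (and hence the bandwidth granted to D2D) enters each BS's revenue.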

REPEATED RESOURCE ALLOCATION GAME FOR MULTICELL D2D
As the D2D link reuses the common resources of multiple cells, each operator can charge the D2D UEs fees, and the operators have incentives to allocate resources to D2D in order to maximize their payoffs. Therefore, the competition among operators for resource allocation quotas can be formulated as a noncooperative game, in which the BSs are modeled as players. In this section, we first present a static game to address the resource allocation problem, and then extend the static game model to a repeated version based on the NE derivations. Note that the number of game repetitions is finite so that the D2D pair can obtain resources for data transmission immediately.

Static Resource Allocation Game
We define the utility of a player as the revenue collected from cellular users and D2D users using the radio resources. Specifically, the utility function of the $i$-th player is defined for each link direction, where $U_i^U$ and $U_i^D$ are the utilities of the player when uplink and downlink resources are used, respectively. Here $\alpha$ and $\beta$ are the charging prices per unit data rate supported by the base station and per unit bandwidth, respectively, $\gamma$ is the cost function of a D2D reused resource, and $B_d^i$ is the bandwidth allocated to the D2D link by the $i$-th BS. With respect to $\gamma$, a pricing function from [5] is employed, in which $x$, $y$, and $\tau$ are non-negative constants and $\tau \ge 1$ guarantees that the cost function is convex.
Since there is a trade-off between power and bandwidth [4], the D2D transmit power $P_d$ is tied to the bandwidth allocated to the D2D link; without loss of generality, we assume a relation parameterized by a non-negative constant $z$. For simplicity, denote either $U_i^U$ or $U_i^D$ by $U_i$. The optimization problem of resource allocation for multicell D2D communications can be formulated as the maximization of $U_i$ by each BS subject to $B_{\min} \le \sum_{i=1}^{m} B_d^i \le B_{\max}$, where $B_{\min}$ and $B_{\max}$ are the minimum and maximum bandwidth constraints on D2D resource allocation. Note that $B_{\min}$ refers to the minimum resource demand of a D2D pair for data communications; it is used to implicitly restrict the D2D transmission power so that it does not cause harmful interference to the BSs and CUEs.
As a Nash Equilibrium (NE) is a strategy profile from which no player can increase its payoff by unilaterally deviating, the NE can be obtained by solving the best response functions of the players. To obtain the NE of the above static game, we differentiate each $U_i$ with respect to $B_d^i$ and set the derivative to zero, which yields equation (11); the solution of (11) is the NE of the static game.
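The idea of obtaining the NE from the players' best response functions can be illustrated with a toy two-BS game. The utility used here, $U_i = b_i(\beta - y(b_1 + b_2))$, is a hypothetical Cournot-style stand-in chosen because its best response has a closed form; it is not the utility defined in this paper, and the names `best_response` and `solve_nash` as well as the parameter values are assumptions.

```python
def best_response(b_other, beta=3.0, y=0.1):
    """Best response of one BS under the hypothetical utility
    U = b * (beta - y * (b + b_other)).

    Setting dU/db = beta - 2*y*b - y*b_other = 0 gives
    b = (beta / y - b_other) / 2, clipped at zero.
    """
    return max(0.0, (beta / y - b_other) / 2.0)


def solve_nash(iterations=200, tol=1e-10):
    """Iterate the two best responses until they stop changing."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        new_b1 = best_response(b2)
        new_b2 = best_response(new_b1)
        converged = abs(new_b1 - b1) < tol and abs(new_b2 - b2) < tol
        b1, b2 = new_b1, new_b2
        if converged:
            break
    return b1, b2


# The fixed point is a strategy profile where neither BS gains by deviating.
b1_star, b2_star = solve_nash()
```

For this toy utility the fixed point is $b_1 = b_2 = \beta/(3y)$; in the setting of the paper, the resulting profile would additionally have to be checked against the bounds $B_{\min}$ and $B_{\max}$.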

Repeated Resource Allocation Game Model
A repeated resource allocation game model is designed based on the NE properties of the above static game. Figure 1 plots illustrative NE cases, where the two dashed lines represent the best reply functions of BS 1 and BS 2, the solid lines are the bounds of the D2D resource demand constraints, and the gray area, formed by the black lines and the coordinate axes, is the feasible strategy space of the players. Figure 1(a), Figure 1(b), and Figure 1(c) show intersections of the dashed lines that are located inside and outside of the gray area, which correspond to the cases in which the NE exists (Figure 1(a)) and does not exist (Figure 1(b) and (c)), respectively. It can be seen from Figure 1(a) that the solution of the static game, i.e., the intersection of the two dashed lines, could be improved by increasing both $B_d^1$ and $B_d^2$ iteratively while keeping the NE within the gray area. In the extreme, the solution can be refined until it reaches the boundary of the gray area, namely the upper solid line. As for Figure 1(c), one can expect that game repetition might make the NE feasible. Inspired by these observations, we propose a repeated resource allocation game in the following.
Since the BSs are rational players, they can adjust their resource allocations to maximize their payoffs. The adjustment of each player updates its allocation from stage $t$ to stage $t+1$, where $B_d^i(t+1)$ is the amount of resources allocated at the $(t+1)$-th stage and $a$ is the adjustment step, with $a > 0$.
With the above adjustment, the static game is played repeatedly, which forms the repeated game. Note that the number of repetitions of the game is limited. This is because the D2D pair in practical cellular networks has to obtain the resources immediately. For a finite repeated game, the NE can be obtained as long as it exists.

Game Theoretic Resource Allocation Algorithm
Algorithm 1: Resource Allocation Algorithm
Input: $N_0$, $G_{ij}$, $x$, $y$, $z$, $\tau$, $\alpha$, $\beta$, $B_{\max}$, $t_{\max}$, $a$.

A resource allocation algorithm, shown in Algorithm 1, is designed on the basis of the above repeated game model. The algorithm initializes the sum rate for each BS and then enters the game, i.e., the while-loop between lines 3 and 29. The game is played repeatedly by updating $B_d^i$ (line 25) as long as no termination condition is satisfied. More specifically, if $\sum_{i=1}^{m} B_d^i(1) > B_{\max}$, the NE in the first stage of the game does not exist (the case illustrated in Figure 1(b)), and the algorithm exits with $B_d^1 = B_d^2 = \cdots = B_d^m = B_{\max}/m$, as described between lines 10 and 12. Also, if $t = t_{\max}$ and $\sum_{i=1}^{m} B_d^i(1) < B_{\min}$, the NE does not exist, that is, the case illustrated in Figure 1(c).

Note that the resource allocations are unpredictable when the NE does not exist. In response, the resource allocations from the BSs are all set to $B_{\max}/m$ or $B_{\min}/m$ to ensure fairness. The term fairness refers to the revenues of all players from the D2D communication being the same, that is, the amount of resources obtained by the D2D pair from each player is equal.
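The termination logic described for Algorithm 1 can be sketched as follows. This is a simplified illustration and not the algorithm listing itself: the stage update `step` (a multiplicative increase by the factor $1 + a$), the choice to keep the last feasible allocation when the upper bound is exceeded during repetition, and the function name `allocate` are all assumptions.

```python
from typing import Callable, List


def allocate(
    initial: List[float],
    b_min: float,
    b_max: float,
    t_max: int,
    a: float,
    step: Callable[[float, float], float] = lambda b, a: b * (1.0 + a),
) -> List[float]:
    """Sketch of the repeated allocation with equal-split fallbacks.

    initial : stage-1 allocations B_d^i(1), e.g. the static-game solution
    step    : hypothetical per-stage adjustment rule (assumption)
    """
    m = len(initial)
    current = list(initial)

    # Stage 1 already exceeds the upper bound: no NE, split B_max equally.
    if sum(current) > b_max:
        return [b_max / m] * m

    for _ in range(2, t_max + 1):
        candidate = [step(b, a) for b in current]
        if sum(candidate) > b_max:
            # Assumption: stop at once and keep the last feasible stage.
            break
        current = candidate

    # Demand still below the lower bound: no NE, split B_min equally.
    if sum(current) < b_min:
        return [b_min / m] * m
    return current


# Hypothetical two-BS examples with 20 <= B_d^1 + B_d^2 <= 30.
too_large = allocate([20.0, 15.0], 20.0, 30.0, 5, 0.01)
too_small = allocate([2.0, 3.0], 20.0, 30.0, 3, 0.01)
feasible = allocate([10.0, 12.0], 20.0, 30.0, 3, 0.01)
```

The equal split in the two fallback branches mirrors the fairness rule stated above: when no NE exists, every BS supplies the same amount of bandwidth to the D2D pair.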

NUMERICAL EXPERIMENTS

Experiment Setup
We conduct experiments with two typical scenarios. In the first, the D2D pair reuses the uplink resources of two neighboring cells for data transmission. In the second, the D2D pair reuses the uplink resources of three cells. In both settings, each cell has a cellular user that exploits the same resource blocks for cellular communications.
The experimental results cover two aspects: NE evaluation and system performance. The parameter configurations for the experiments are listed in Table 1. All of the data presented are obtained by averaging the results of 100 runs, which makes the evaluations more representative and less affected by stochastic factors.

NE Evaluation and Discussion
In the first set of experiments, we evaluate the NE with different values of $a$ in both simulation scenarios. Figure 2 shows the NE evaluations with different $a$ for the first simulation scenario. Figure 3 shows the NE evaluations for the second simulation scenario when $a = 0.01$. As observed from Figure 3(a), (b), and (c), the NE is improved as the game repeats, which is aligned with the insights provided by Figure 2. The same observation can be made for the cases $a = 0.03$ and $a = 0.08$.
Based on the above results, we claim that, given the reaction functions of the BSs, the NE is sensitive to the adjustment parameter of the repeated game and to the system constraint. In a practical scenario, one possible way of choosing the adjustment parameter is to start with a relatively small value that guarantees the existence of the NE in each repetition of the game, and then to increase it empirically.

System Performance Evaluation
In order to further validate the advantage of the developed game, the system performance, including the sum rate and the sum rate gain of the BSs, is evaluated in the second set of experiments. The sum rate of the BSs for each simulation case is specified by equation (1) or (2), while the sum rate gain reflects the gain in sum rate of the cellular system with D2D communications. It is defined in terms of $R_i$, the sum rate of the $i$-th BS given by equation (1) or (2), $R_{D2D}$, the sum rate of the D2D communication, and $R_{cellular}$, the sum rate of cellular communication in the absence of D2D.
We consider the above two performance metrics for the first simulation scenario under the conditions $a = 0.01$ and $20 \le B_d^1(t) + B_d^2(t) \le 30$. The comparison results for the sum rate and the sum rate gain are provided in Figure 4(a) and (b), respectively. In these two figures, we observe that both the sum rate and the sum rate gain increase as the game repeats, implying that the BSs have an incentive to allocate more resources to the D2D pair for revenue improvement as long as the sum of their allocations meets the constraint and the resources allocated to cellular users are fixed. For the second simulation scenario, we are able to derive the same insights as from Figure 4. Those results are omitted here due to space limitations. In summary, the performance simulations show that the repeated game enhances the system performance significantly.
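As a rough illustration of the sum rate gain metric, the sketch below assumes one plausible form, namely the relative increase of the total rate with D2D (BS sum rates plus the D2D rate) over the cellular-only rate. This form, the function name `sum_rate_gain`, and the sample numbers are assumptions for illustration and may differ from the definition used in the experiments.

```python
from typing import Sequence


def sum_rate_gain(bs_rates: Sequence[float], d2d_rate: float, cellular_rate: float) -> float:
    """Assumed relative gain: (sum_i R_i + R_D2D - R_cellular) / R_cellular.

    bs_rates      : R_i, sum rates of the BSs with the D2D link active
    d2d_rate      : R_D2D, sum rate of the D2D communication
    cellular_rate : R_cellular, cellular sum rate in the absence of D2D
    """
    if cellular_rate <= 0:
        raise ValueError("cellular_rate must be positive")
    total_with_d2d = sum(bs_rates) + d2d_rate
    return (total_with_d2d - cellular_rate) / cellular_rate


# Hypothetical rates for a two-cell scenario.
gain = sum_rate_gain([5.0, 5.0], 2.0, 8.0)
```

Under this assumed form, a positive value means the D2D rate more than compensates for the cellular rate lost to D2D interference.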

CONCLUSIONS
In this paper, we have considered the resource allocation problem for multicell D2D communications underlaying a cellular network where a D2D link utilizes common resources of multiple cells. We have proposed a game-theoretic approach to address the resource allocation problem. In the developed game, the BSs have been characterized as players competing for the resource allocation quota arising from the D2D demand, and the utility of each player has been defined as the payoff from both cellular and D2D communications leasing the resources. Through the NE derivation and analysis, we have presented a resource allocation algorithm. Simulations have verified that the approach developed in this work significantly enhances system performance. The results have further provided a global insight into resource allocation for multicell D2D communications.
In Algorithm 1, if $\sum_{i=1}^{m} B_d^i(t) > B_{\max}$ during the game repetition, the game is terminated immediately with $B_d^1$ […]. In the case illustrated in Figure 1(c), the algorithm exits with $B_d^1 = B_d^2 = \cdots = B_d^m = B_{\min}/m$, as indicated from line 16 to line 18. Finally, if $t = t_{\max}$ and $\sum_{i=1}^{m} B_d^i(t) \le B_{\max}$, the game stops with $B_d^1 = B_d^1(t), \ldots, B_d^m = B_d^m(t)$, as described between lines 19 and 21.

Figure 1: NE analysis for static resource allocation game.

Figure 2: NE evaluations with different a for the first simulation scenario.
Figure 2(a), (b), and (c) display the NE evaluations for the first scenario with the constraint $20 \le B_d^1 + B_d^2 \le 30$ when $a = 0.01$, $a = 0.03$, and $a = 0.08$, respectively. In this figure, func1 and func2 represent the best response functions of BS 1 and BS 2, and $t$ denotes the index of the game repetition.

Figure 2(a) shows that the intersections of the best response functions func1 and func2 satisfy the constraint $20 \le B_d^1 + B_d^2 \le 30$, indicating that the NE exists. As the game repeats and the NE improves, the BSs allocate $B_d^1(t_{\max})$ and $B_d^2(t_{\max})$ to the D2D users. Thus the D2D pair receives $B_d^1(t_{\max}) + B_d^2(t_{\max})$ resources in total for data transmission. In Figure 2(b), the NE does not exist at the last repetition, as the intersection of func1 and func2 in the last round is located outside the feasible strategy space. In this case, the D2D pair obtains $B_d^1(2) + B_d^2(2)$ resources for communication according to the resource allocation algorithm designed in the previous section. It is interesting to notice that Figure 2(c) only draws the NE of the first two repetitions. This is because both BSs no longer play the game if the NE does not exist in the second stage, and thus the D2D pair eventually receives […]

Figure 3: NE evaluations for the second simulation scenario when a = 0.01.

Figure 4: Performance comparisons for the first scenario. (a) Sum rate comparison. (b) Sum rate gain comparison.