Proof-of-Concept System for Opportunistic Spectrum Access in Multi-user Decentralized Networks

Poor utilization of the electromagnetic spectrum and ever-increasing demand for spectrum have led to a surge of interest in opportunistic spectrum access (OSA) based paradigms such as cognitive radio and unlicensed LTE. In OSA for decentralized networks, frequency band selection from a wideband spectrum is a challenging task since secondary users (SUs) do not share any information with each other. In this paper, a new decision-making policy (DMP) is proposed for OSA in multi-user decentralized networks. The first contribution is an accurate characterization of frequency bands using the Bayes-UCB algorithm. Then, a novel SU orthogonalization scheme using the Bayes-UCB algorithm is proposed, replacing the randomization-based scheme. Finally, a USRP testbed has been developed for analyzing the performance of DMPs using real radio signals. Experimental results show that the proposed DMP offers significant improvement in spectrum utilization, with fewer subband switchings and collisions, compared to other DMPs.


Introduction
Poor utilization of the electromagnetic spectrum and ever-increasing demand for spectrum to support the data-intensive services envisioned in 5G have led academia as well as industrial partners to explore the feasibility of opportunistic spectrum access (OSA). In OSA, secondary (i.e., unlicensed) users (SUs) are allowed to transmit in vacant licensed bands provided that they do not cause any interference to the active licensed users and terminate their transmission as soon as a licensed user arrives in the frequency band. Paradigms such as cognitive radio, device-to-device (D2D) communications and unlicensed LTE are based on OSA and have the potential to improve the spectrum utilization up to 20-30%. Illustrative examples include underlay inband unlicensed D2D communications, which allow direct communication between SUs over identified vacant licensed subband(s) without the need for base stations or access points to establish the communication link. Also, cognitive radio and LTE-unlicensed allow SUs to access vacant TV spectrum and 5 GHz spectrum, respectively. Though most of the existing works focus on centralized networks, decentralized networks would be an efficient choice over the centralized approach due to advantages such as ease of implementation, robustness to link or node failures, zero communication overhead and lower delay [1,2]. Furthermore, the decentralized approach is a preferred choice for public safety networks and proximity-aware social networking services.
The realization of OSA in decentralized networks is a challenging task for two reasons: 1) hardware constraints at SUs, which limit the number of bands that can be sensed simultaneously, and 2) a higher probability of collisions among SUs, as they do not share any information with each other. Hence, SUs need decision making policies (DMPs) in order to: 1) characterize frequency bands based on their potential (e.g., probability of being vacant), 2) minimize collisions among SUs, and 3) minimize the number of frequency band switchings (FBS) to reduce the penalty incurred in terms of delay, power, hardware reconfiguration and protocol overhead when an SU switches from one frequency band to another. From the energy efficiency perspective, the number of FBS and collisions should be as low as possible. The design and validation of such DMPs using real radio signals in the context of decentralized networks is the objective of the work presented in this paper.
In this paper, a new DMP is proposed for frequency band characterization and orthogonalization of SUs into the optimal set of frequency bands. In the proposed DMP, the Bayesian Upper Confidence Bound (Bayes-UCB) algorithm is used for accurate characterization of frequency bands. Furthermore, a Bayes-UCB based orthogonalization scheme is proposed, replacing existing randomization-based schemes. Then, a testbed using USRP has been developed as a proof-of-concept system for analyzing the performance of DMPs using real radio signals. Based on experimental results, we show that the proposed DMP is superior in terms of improvement in spectrum utilization when compared to existing DMPs. The added advantages of fewer FBS as well as fewer collisions make the proposed DMP energy efficient and hence suitable for resource-constrained battery-operated radio terminals. The proposed work is an extension of our past work in [3]. The novelty of this paper lies in the design of the Bayes-UCB based orthogonalization scheme and its validation on the USRP testbed.
The paper is organized as follows. The literature review of various DMPs is given in Section 2. The proposed DMP is presented in Section 3, followed by the testbed description in Section 4. The experimental results are discussed in Section 5. Section 6 concludes the paper.

Literature Review: Decision Making Policies for Decentralized Networks
In decentralized networks, SUs need to characterize frequency bands based on parameters such as the probability of being vacant and the transition probabilities from vacant state to occupied state and vice-versa. Such characterization allows an SU to identify a good subset of frequency bands for transmission. However, hardware and timing constraints limit the SU to a single frequency band selection in each time slot and hence, a sequential frequency band selection algorithm is desired. The challenge for such an algorithm is to achieve a good tradeoff between exploration and exploitation, where the SU must choose between exploring all bands to find good bands and exploiting based on the current characterization of bands [3][4][5][6]. Various sequential selection algorithms have been proposed, including the frequentist upper confidence bound (UCB) algorithm and its extensions, ε-greedy, the optimization-based Kullback-Leibler UCB (KL-UCB), Bayes-UCB and Thompson Sampling [4][5][6][7][8][9]. These algorithms are optimal in the sense that they achieve a good trade-off with logarithmic regret, which is the best one can expect when there is no prior information about frequency band statistics [4][5][6][7][8][9]. Here, the term regret refers to the loss in transmission opportunities when compared to the genie-aided algorithm which always selects the vacant band. Recently, Bayes-UCB has been shown to offer better regret compared to other algorithms, which makes it our preferred choice for the proposed DMP [3,6,8].
The next task of the DMP is to select the frequency band for data transmission. This is a challenging task, especially in a decentralized network where the probability of collisions is high since SUs do not share any information with each other [4][5][6][7][9][10][11][12]. To minimize collisions, the $\rho_{rand}$ scheme in [5] randomly and independently assigns a rank, $R(k) \in \{1, 2, \ldots, M\}$, to each SU when it joins the network. Here, $M$ denotes the number of active SUs in the network. In subsequent time slots, the SU with rank $R(k)$ chooses the frequency band with the $R(k)$th best quality index based on the characterization by the underlying algorithm. When SUs collide, a new rank is chosen randomly and independently. The time division fair sharing (TDFS) scheme in [10] is similar to $\rho_{rand}$ except that the rank of each SU is rotated in a circular fashion between 1 and $M$ to give all SUs equal access to the optimum frequency bands. The DMPs consisting of the UCB algorithm for characterization and the $\rho_{rand}$ or TDFS scheme for frequency band selection have been proved to have logarithmic regret and are hence asymptotically optimal. However, the number of FBS in TDFS [10] increases linearly with the number of time slots, whereas in $\rho_{rand}$ the rank is changed only when the corresponding SU collides with other SUs, which results in fewer FBS. Thus, $\rho_{rand}$ is preferred when FBS incurs a high penalty. The DMP in [5] is further improved in [11] by minimizing the number of collisions as well as FBS among SUs using a wider range for the rank, i.e., $1 \le R(k) < N$, $\forall k$. However, the improvement in regret is not substantial due to the selection of sub-optimal frequency bands. In [12], a variable filtering architecture and its integration with a tunable frequency band access DMP, $\rho_{t\_rand}$, has been proposed that takes into account the tunable bandwidth requirements of SUs. Still, the number of FBS in [11,12] is high.
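The rank-based selection rule of $\rho_{rand}$ described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of [5]: the function names `band_for_rank` and `next_rank` and the list-based representation of the quality indices are hypothetical.

```python
import random

def band_for_rank(quality, rank):
    """Return the index of the band with the rank-th largest quality index (rank is 1-based)."""
    order = sorted(range(len(quality)), key=lambda i: quality[i], reverse=True)
    return order[rank - 1]

def next_rank(rank, collided, num_users, rng=random):
    """rho_rand rule: keep the rank unless a collision occurred, then redraw uniformly from 1..M."""
    if collided:
        return rng.randint(1, num_users)
    return rank
```

Because a collided SU redraws its rank without regard to the ranks held by others, a redraw can land on a rank that is already in use, which is the behavior the later sections aim to avoid.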
To summarize, existing DMPs incur significant regret due to collisions among SUs. The possibility of reducing the number of collisions via characterizing the rank based on past collision events has not been

System Model
Consider the slotted decentralized network consisting of multiple primary users and $M$ SUs. The bandwidth of the wideband input signal, $B$, is divided into $N$ uniform frequency bands of bandwidth $B_s$. Hence, $B_s = B/N$. The status of each band (i.e., vacant or occupied) is independent of the status of other bands. For a given band, the vacancy statistic depends on the underlying traffic model, which can be either an independent and identically distributed (i.i.d.) model or a Markovian model. In the case of the i.i.d. model, the status of the $i$th band, $i \in \{1, 2, \ldots, N\}$, depends on its $P_{vac}(i)$ distribution and is independent of its status in previous time slots. In the case of the Markovian model, the $i$th band switches its state from vacant to occupied and vice-versa according to a discrete Markov process with the probabilities $p_{vo}(i)$ and $p_{ov}(i)$, respectively. Then, using Markov chain analysis, the steady-state probability of a frequency band being vacant, denoted as $P_{vac}(i)$, $i \in \{1, 2, \ldots, N\}$, is given by [13]

$$P_{vac}(i) = \frac{p_{ov}(i)}{p_{vo}(i) + p_{ov}(i)}$$

Basic assumptions made in this paper for SUs in the decentralized network are:
1. Infrastructure-less decentralized network where all SUs employ the same DMP but do not exchange information with other SUs.
2. SU can sense only one frequency band in each time slot.

P vac
4. All SUs must sense the chosen frequency band at the start of each time slot irrespective of sensing outcomes in the previous time slots.
At the beginning of each time slot, the SU chooses the frequency band for sensing. Let $N_k(t) \in \{1, 2, \ldots, N\}$ be the frequency band chosen by the $k$th SU in time slot $t$. The analog front-end and digital front-end of the SU filter the chosen band, down-convert it to baseband and pass it to the spectrum detector. If the spectrum detector senses the band as vacant, it is assumed that the SU transmits over that band. When multiple SUs transmit on the same band, i.e., $N_k(t) = N_j(t)$ for any $k \neq j$, a collision occurs, leading to transmission failure. Otherwise, it is assumed that the SU transmits successfully. Let $\Delta_k(t)$ be the instantaneous reward of the $k$th SU in time slot $t$, and let $r_k(t)$ be the total number of successful transmissions by the $k$th SU. Let $S^*(t)$ and $S(t)$ denote the total number of successful transmissions by the genie-aided DMP (i.e., the DMP where $P_{vac}(i)$, $\forall i$, are known a priori and a central unit allocates the SUs to the $M$ best bands) and the decentralized DMP, respectively. Then, the total loss in terms of transmission opportunities, $U(t)$, up to time $t$ is given by Eq. 6.
For higher utilization of spectrum, $U(t)$ should be as low as possible. In addition, the number of FBS and the number of collisions, $C(t)$, given by Eq. 7 and Eq. 8, respectively, should be as low as possible.
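Read together, the definitions in this section suggest the following forms for the reward, the success count and the loss. They are a sketch inferred from the prose, not a reproduction of the paper's numbered equations; the summation index $v$ and the expansion of $S(t)$ as a sum over SUs are assumptions introduced here.

```latex
\Delta_k(t) =
\begin{cases}
1, & \text{if the } k\text{th SU transmits on band } N_k(t) \text{ and no other SU transmits on it},\\
0, & \text{otherwise},
\end{cases}
\qquad
r_k(t) = \sum_{v=1}^{t} \Delta_k(v),

S(t) = \sum_{k=1}^{M} r_k(t),
\qquad
U(t) = S^{*}(t) - S(t).
```

Under this reading, every collision and every slot spent on an occupied band adds to $U(t)$, which is why both band characterization and rank selection matter.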

Frequency Band Characterization
The frequency band characterization based on the probability of vacancy of the frequency band is accomplished using the Bayes-UCB algorithm in the proposed DMP. Bayes-UCB is a sequential learning algorithm, as shown in Algorithm 1. In Algorithm 1, $H$ is the horizon size, $T_k(i, t)$ indicates the number of times the frequency band $i$ is chosen by the $k$th SU up to time $t$, $S_k(i, t)$ indicates the number of times, out of $T_k(i, t)$, the band $i$ is observed as vacant by the $k$th SU, and $R(k)$ is the rank of the $k$th SU. The rank calculation is discussed later in Section 3.3. Initially, all frequency bands are sensed once as shown in Steps 1-6 of Algorithm 1. Then, at each time slot, $t > N$, Bayes-UCB calculates the quality index, $G(i, t)$, $\forall i$, for each band as shown in Step 9 of Algorithm 1. The calculation of $G(i, t)$, $\forall i$, involves the computation of a quantile of order $i$ for a given beta distribution [8].
The higher the value of the quality index, the higher is the probability that the corresponding frequency band is vacant.

The motivation behind the selection of the Bayes-UCB algorithm over others is that Bayes-UCB is proved to offer a better balance between exploration and exploitation. For instance, any optimal algorithm must satisfy the following condition.
, $\forall i, \forall k$ (10)

where KL stands for the Kullback-Leibler divergence and $\beta \le 1$. For an asymptotically optimal algorithm, $\beta = 1$. It has been proved that the parameter $\beta$ for Bayes-UCB is higher than that of other algorithms, which guarantees accurate frequency band characterization. Another advantage of the Bayes-UCB algorithm, based on empirical observations, is that the number of FBS is low. This means that Bayes-UCB chooses the same band consecutively more often than other algorithms. This feature might also be advantageous for accurate estimation of the transition probabilities, i.e., $P_{vo}$ and $P_{ov}$. However, the usefulness of transition probabilities for the DMP is out of the scope of this paper. Next, the rank calculation for frequency band selection in a decentralized network consisting of multiple SUs is presented.
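As a concrete illustration of the quality index described in this section, the sketch below computes a Bayes-UCB style index from the counts $S_k(i, t)$ and $T_k(i, t)$. It rests on assumptions that are not stated in the text above: a uniform Beta(1, 1) prior, so that the posterior is Beta($S + 1$, $T - S + 1$), and a quantile of order $1 - 1/t$, which is the commonly used Bayes-UCB choice. The function names are hypothetical, and the quantile is found by bisection using only the standard library.

```python
import math

def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at 0 < x < 1 for positive integers a, b, using the identity
    I_x(a, b) = P(Binomial(a + b - 1, x) >= a), evaluated in log space."""
    n = a + b - 1
    total = 0.0
    for j in range(a, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * math.log(x) + (n - j) * math.log1p(-x))
        total += math.exp(log_term)
    return min(total, 1.0)

def beta_quantile_int(q, a, b, tol=1e-9):
    """Quantile of order q of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bayes_ucb_index(vacant_count, sense_count, t):
    """Assumed index: quantile of order 1 - 1/t of Beta(S + 1, T - S + 1)."""
    return beta_quantile_int(1.0 - 1.0 / t, vacant_count + 1, sense_count - vacant_count + 1)
```

For example, `bayes_ucb_index(8, 10, 100)` exceeds `bayes_ucb_index(2, 10, 100)`, so a band observed vacant more often receives the larger index, while a rarely sensed band keeps a wide posterior and therefore a comparatively optimistic index.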

Rank Selection in Multi-User Decentralized Network
For a network with a single SU, i.e., $M = 1$, the regret of the DMP is given by [4,5,9,10], where $P_{vac}(1^*)$ is the maximum value of the probability of a frequency band being vacant among all bands and 1_worst is the set of all bands excluding the first band when arranged according to decreasing values of their $P_{vac}$.
Hence, to have lower regret, the rank of the SU is set to $R(k) = 1$, which means the SU always chooses the frequency band with the highest quality index given by Eq. 9. However, in a decentralized network with multiple SUs, the regret, $U_M(t)$, is greater than or equal to $U_1(t)$ and is given by [4,5], where $P_{vac}(k^*)$ is the $k$th maximum value of the probability of a frequency band being vacant among all bands, M_worst is the set of all bands excluding the first $M$ bands when arranged according to decreasing values of their $P_{vac}$, and $C(t)$ is the number of collisions when an SU chooses any of the M_best frequency bands.
The first component of $U_M(t)$ can be minimized using accurate characterization of frequency bands as discussed in Section 3.2. However, to minimize the number of collisions, $C(t)$, the rank should be chosen carefully. In the existing works, the rank is chosen randomly at each SU whenever a collision occurs and the maximum value of the rank is $M$. The drawbacks of this approach are: 1) a single collision leads to multiple collisions among other SUs since the rank is updated randomly, 2) the number of collisions increases exponentially as the number of SUs, $M$, increases, and 3) when the $P_{vac}$ of different bands are close to each other, the ordering of frequency bands may not be identical at each SU. In such a case, a collision will occur even if the SUs have different ranks, since two different ranks at different SUs may correspond to the same frequency band.
In the proposed DMP, a new rank selection scheme is proposed to alleviate these drawbacks. The proposed scheme is designed using the Bayes-UCB algorithm, which exploits past collision events to accurately characterize the rank for each SU. The proposed rank selection scheme is shown in Algorithm 2. In Algorithm 2, $H$ is the horizon size, $P_k(i, t)$ indicates the number of times the rank $i$ is chosen by the $k$th SU up to time $t$, $Y_k(i, t)$ indicates the number of times, out of $P_k(i, t)$, the $k$th SU does not experience any collision over the band chosen using the rank $i$, and $R(k)$ is the rank of the $k$th SU.

Algorithm 2: Rank characterization and selection using Bayes-UCB algorithm for $k$th secondary user

Initially, all the ranks are chosen once as shown in Steps 1-6 of Algorithm 2. Then, at each time slot, $t > N$, Bayes-UCB calculates the quality index, $Z(i, t)$, $\forall i$, for each rank as shown in Step 9 of Algorithm 2 and Eq. 13, which involves the calculation of a quantile of order $i$ for a given beta distribution.
The higher the value of the quality index, the lower is the probability of collision if the SU chooses the corresponding rank. Hence, the SU chooses the rank $i$ having the maximum value of quality index as shown in Step 10. Then, the SU selects the frequency band as discussed in Algorithm 1. Based on the transmission feedback, parameters $S_k(N_k(t), t)$ and $T_k(N_k(t), t)$ are updated as shown in Steps 11-14. Thus, the use of a learning algorithm allows the SU to continue with the rank with which it has experienced fewer collisions in the past. This means that a single collision event cannot change the rank of the SU. This not only leads to faster convergence of SUs into different ranks but also leads to fewer FBS. However, the drawback of using Bayes-UCB based rank selection is that it cannot distinguish between past collisions and recent collisions. This may lead to an SU persisting with a rank which has historically had few collisions but a significant number of collisions in the recent past. To take this into account, the flag cont_coll has been introduced in Algorithm 2. cont_coll becomes 1 if the corresponding SU experiences a collision on the chosen rank $\theta$ consecutive times. When cont_coll = 1, the Bayes-UCB algorithm resets, forgetting all collision events in the past, as shown in Steps 15-16 of Algorithm 2. Thereafter, Bayes-UCB starts estimating a new rank for the corresponding SU using the information gained from subsequent collision events, without forcing other SUs to change their rank. This feature makes the proposed DMP superior to the DMPs in [5,10,11]. Based on empirical observations, the value of $\theta$ can be set anywhere between 5 and 10. The analysis of the effect of the value of $\theta$ on the DMP is out of the scope of this paper and is a part of ongoing work.
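The bookkeeping behind the rank learner and its cont_coll reset can be sketched as follows. This is an illustrative outline and not Algorithm 2 itself: the class name `RankLearner`, the pluggable `index_fn` argument (any function of the collision-free count, the selection count and $t$) and the choice to retry every rank once after a reset are assumptions made here.

```python
class RankLearner:
    """Per-SU bookkeeping for rank selection with a consecutive-collision reset."""

    def __init__(self, num_ranks, theta=5):
        self.num_ranks = num_ranks
        self.theta = theta  # consecutive collisions on one rank that trigger a reset
        self.reset()

    def reset(self):
        """Forget all past collision events."""
        self.chosen = [0] * self.num_ranks        # P_k(i, t): times rank i was chosen
        self.no_collision = [0] * self.num_ranks  # Y_k(i, t): collision-free uses of rank i
        self.consecutive = 0
        self.last_rank = None

    def choose(self, index_fn, t):
        """Return a 1-based rank: try each rank once, then take the largest index."""
        for i in range(self.num_ranks):
            if self.chosen[i] == 0:
                return i + 1
        scores = [index_fn(self.no_collision[i], self.chosen[i], t)
                  for i in range(self.num_ranks)]
        return scores.index(max(scores)) + 1

    def update(self, rank, collided):
        """Record the outcome; return True if the learner was reset (cont_coll = 1)."""
        self.chosen[rank - 1] += 1
        if collided:
            # Count only collisions that occur back-to-back on the same rank.
            self.consecutive = self.consecutive + 1 if rank == self.last_rank else 1
        else:
            self.no_collision[rank - 1] += 1
            self.consecutive = 0
        self.last_rank = rank
        if self.consecutive >= self.theta:
            self.reset()
            return True
        return False
```

A single collision only lowers the score of the affected rank, so the SU keeps a rank that has served it well; only $\theta$ collisions in a row on the same rank discard the history, and the reset is local to the SU that experienced them.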

Proposed USRP Testbed
The proposed USRP testbed is shown in Fig. 2 and is a significant extension of the testbed in [14]. It consists of two units: 1) the left-hand side unit is the primary user traffic generator, and 2) the right-hand side unit acts as the SUs. Both units are discussed in detail next.

Primary User Traffic Generator
The chosen design environment for the primary user traffic generator is GNU Radio Companion (GRC) and the hardware platform is a USRP from Ettus Research. The main reason for choosing GRC is the precise control over each parameter of the transmission chain compared to other environments. The detailed design of the proposed primary user traffic generator is shown in Fig. 3. In the beginning, the number of frequency bands, the traffic model (i.i.d. or Markovian) and the corresponding frequency band statistics are fixed using the block named Traffic Model in Fig. 3. The transmission bandwidth, which is restricted by the bandwidth of the analog front-end of the USRP, is divided into $N$ frequency bands of uniform bandwidth. In each time slot, a masking vector of size $N$ is generated by the Traffic Model block based on the given frequency band statistics. This masking vector can have 1 or 0 values, where 1 and 0 indicate that the corresponding band is occupied and vacant, respectively. The next step is mapping the data to be transmitted onto the sub-carriers of the occupied bands. The data modulation used is differential QPSK with Gray encoding. This is followed by sub-carrier mapping using OFDM and transmission via the USRP. In the proposed testbed, the number of sub-carriers, center frequency and transmission bandwidth are 256, 433.5 MHz and 1 MHz, respectively. For demonstration purposes, each time slot duration is one second so that it can be followed by the human eye. However, it can be reduced to the order of milliseconds and will have no direct effect on the performance of the DMP.
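The masking vector produced by the Traffic Model block can be illustrated with a short sketch. It is written in plain Python rather than as a GRC block, and the function names and argument layout are hypothetical; only the convention that 1 means occupied and 0 means vacant is taken from the description above.

```python
import random

def iid_mask(p_vac, rng):
    """One slot of i.i.d. traffic: band i is vacant (0) with probability p_vac[i], else occupied (1)."""
    return [0 if rng.random() < p else 1 for p in p_vac]

def markov_mask(prev_mask, p_vo, p_ov, rng):
    """One slot of Markovian traffic: vacant->occupied w.p. p_vo[i], occupied->vacant w.p. p_ov[i]."""
    mask = []
    for state, vo, ov in zip(prev_mask, p_vo, p_ov):
        if state == 0:
            mask.append(1 if rng.random() < vo else 0)
        else:
            mask.append(0 if rng.random() < ov else 1)
    return mask
```

Calling `markov_mask` repeatedly on its own output, with a seeded `random.Random` instance, yields a band whose long-run vacancy fraction approaches $p_{ov}/(p_{vo} + p_{ov})$, which gives a simple sanity check on the chosen transition probabilities.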

Secondary User with Decision Making Policy
The chosen design environment for the SU terminal is Matlab/Simulink with a USRP from Ettus Research. The USRP is tuned to receive a signal of bandwidth 1 MHz centered at 433.5 MHz. The received signal is then downsampled, digitized and passed to the DMP implemented using Simulink. An online learning algorithm of the DMP selects one frequency band in each time slot. The chosen band is sensed using an energy detector. Note that the energy detector is not ideal and sensing errors may occur [15]. If the band is sensed as vacant, it is assumed that the SU transmits over the chosen band. If multiple SUs choose the same frequency band, it is assumed that all of them suffer a collision and the transmission fails. In the case of multiple SUs, each user is independently implemented in Simulink with its respective DMP. In existing work, sensing is assumed to be perfect, which is not true in real radio conditions. Thus, the proposed testbed with non-ideal energy detectors enables the study of the performance of DMPs in the presence of sensing errors. However, performance comparisons of various detectors and their effect on DMPs are not discussed here for brevity.
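The sensing step can be illustrated with a basic energy detector. The sketch below is in Python rather than Simulink and is not the testbed's detector: the function names are hypothetical and the threshold is a free parameter that would have to be calibrated against the noise floor of the receiver.

```python
def band_energy(samples):
    """Average energy of the complex baseband samples of one band."""
    return sum(abs(x) ** 2 for x in samples) / len(samples)

def is_vacant(samples, threshold):
    """Declare the band vacant when its average energy is below the threshold."""
    return band_energy(samples) < threshold
```

With a finite number of noisy samples, the measured energy fluctuates around its mean, so a fixed threshold produces both false alarms and missed detections. This is the source of the sensing errors mentioned above.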

Synchronization
The synchronization between transmitter and receiver is an important aspect of the slotted decentralized network infrastructure considered in this paper. For demonstration purposes, synchronization has been achieved by switching the first band from occupied to vacant state or vice-versa in each time slot. This enables SUs to detect the transitions between OFDM symbols as well as to synchronize the energy detection phase on an entire OFDM symbol of the primary traffic. In a real OSA scenario, the SU should be able to synchronize with the PU network via synchronization signals or pilot carriers. Note that the synchronization band in the proposed approach is not wasted because the DMP does not treat it as a synchronization band and sees it as a possible option for data transmission.

The proposed DMP is compared with four other DMPs: 1) $\rho_{rand}$ ($\alpha = 2$): the DMP in [5] with UCB parameter $\alpha = 2$, 2) $\rho_{rand}$ ($\alpha = 0.5$): the DMP in [5] with UCB parameter $\alpha = 0.5$, 3) $\rho_{rand}$ + KLUCB: the DMP in [5] where UCB is replaced with KLUCB, and 4) the DMP proposed in [3]. Each numerical result reported hereafter is the average of values obtained over 15 independent experiments on the USRP testbed, and each experiment considers a time horizon of 1000 iterations, i.e., 1000 time slots for each SU, where one time slot corresponds to one second. It is assumed that all SUs employ the same DMP but do not exchange any information with other SUs.

Experimental Results and Analysis
For Case 1, Fig. 4(a) and Fig. 4(b) show the total number of successful transmissions, $S(t)$, in percentage for various DMPs w.r.t. the genie-aided DMP when $M = 2$ and $M = 4$, respectively. It can be observed that the proposed DMP offers higher transmission opportunities compared to existing DMPs. Since the probability of collision among SUs increases with $M$, $S(t)$ in % is lower when $M = 4$ compared to $M = 2$. On the other hand, the average spectrum utilization for $M = 4$ is higher than for $M = 2$. For instance, the average spectrum utilization due to licensed users was only 47% for Case 1. Due to OSA with $M = 2$, the average spectrum utilization can be increased to 58%, 63%, 66% and 70% using the $\rho_{rand}$ ($\alpha = 2$), $\rho_{rand}$ ($\alpha = 0.5$), $\rho_{rand}$ + KLUCB and proposed DMPs, respectively. In the case of $M = 4$, the average spectrum utilization can be increased to 70%, 76%, 78% and 84% using the $\rho_{rand}$ ($\alpha = 2$), $\rho_{rand}$ ($\alpha = 0.5$), $\rho_{rand}$ + KLUCB and proposed DMPs, respectively. Fig. 5 and Fig. 6 show the total number of successful transmissions, $S(t)$, in percentage for various DMPs w.r.t. the genie-aided DMP for the statistics in Case 2 and Case 3, respectively. Improvements in the average spectrum utilization, similar to Case 1, can also be observed for Case 2 and Case 3.

As discussed in Section 1, the number of FBS should be as low as possible to make SU terminals energy efficient. In Fig. 7(a) and Fig. 7(b), the number of FBS of different DMPs is compared for the frequency band distributions in Case 1, Case 2 and Case 3 when $M = 2$ and $M = 4$, respectively. It can be observed that the proposed DMP offers the lowest number of FBS. Numerically, the average number of FBS of the proposed DMP is 4, 2.5 and 2 times lower than that of $\rho_{rand}$ ($\alpha = 2$), $\rho_{rand}$ ($\alpha = 0.5$) and $\rho_{rand}$ + KLUCB, respectively.
In addition to FBS, the number of collisions should be as low as possible. This is because a collision wastes the energy required for data preprocessing and transmission, which may be higher than the energy required for FBS. In Fig. 8(a) and Fig. 8(b), the number of collisions suffered by all SUs is compared for the frequency band distributions in Case 1, Case 2 and Case 3 when $M = 2$ and $M = 4$, respectively. Numerically, SUs employing the proposed DMP suffer 2, 1.8 and 1.5 times fewer collisions than SUs employing $\rho_{rand}$ ($\alpha = 2$), $\rho_{rand}$ ($\alpha = 0.5$) and $\rho_{rand}$ + KLUCB, respectively. Thus, the lower number of FBS as well as collisions makes the proposed DMP energy efficient and suitable for resource-constrained battery-operated SU terminals. Based on experimental results, we argue that the proposed DMP using a Bayesian MAB algorithm for OSA in multi-user decentralized networks is not only superior in terms of spectrum utilization but also energy efficient.
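The two metrics compared above can be tallied from logged band choices as sketched below. The log format is hypothetical (one list of chosen band indices per SU, plus one set of vacant bands per slot), and perfect sensing is assumed so that every SU that picks a vacant band transmits; the testbed's actual logging is not described in the text.

```python
def count_switches(choices):
    """Number of slots in which an SU senses a different band than in the previous slot."""
    return sum(1 for prev, cur in zip(choices, choices[1:]) if prev != cur)

def count_collisions(choices_per_su, vacant_per_slot):
    """Total collisions over all SUs: an SU collides when it shares a vacant band with another SU."""
    collisions = 0
    for t, vacant in enumerate(vacant_per_slot):
        picks = [choices[t] for choices in choices_per_su]
        for band in picks:
            if band in vacant and picks.count(band) > 1:
                collisions += 1
    return collisions
```

For instance, `count_switches([1, 1, 2, 2, 3])` gives 2, and two SUs that both pick a vacant band in the same slot contribute 2 to the collision total, one per SU.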

Conclusion and Future Works
A USRP-based testbed for experimentally analyzing the performance of decision making policies (DMPs) for opportunistic spectrum access (OSA) in decentralized cognitive radio networks has been proposed. To the best of our knowledge, the proposed testbed is the first proof-of-concept which compares the performance of various DMPs using real radio signals. Furthermore, experimental results showed that the proposed DMP designed using a Bayesian online learning algorithm offers superior performance over existing DMPs in terms of average spectrum utilization, number of frequency band switchings as well as number of

Figure 1. The decision making framework of the proposed DMP.

Figure 2. Proposed USRP based testbed for analyzing the performance of DMPs using real radio signals and non-ideal spectrum detectors.

Figure 3. Detailed block diagram of the proposed primary user traffic generator.

Figure 4. Comparisons of average $S(t)$ in % of different DMPs with respect to the genie-aided DMP for frequency band statistics in Case 1 with (a) $M = 2$ and (b) $M = 4$.

Algorithm 1: Frequency band characterization and selection using Bayes-UCB algorithm for $k$th secondary user
The SU chooses the frequency band $i$ having the $R(k)$th maximum value of quality index as shown in Step 10. The chosen frequency band is then sensed by the spectrum detector as shown in Fig. 1. Based on the sensing outcome, parameters $S_k(N_k(t), t)$ and $T_k(N_k(t), t)$ are updated as shown in Steps 11-14. Note that, $S_k$

In this section, the performance of various DMPs in terms of the number of transmission opportunities, i.e., $S(t)$, the number of collisions and FBS is compared on the proposed testbed discussed in Section 4. Consider $N = 8$ and, since $B = 1$ MHz, we have $B_s = 125$ kHz. For the i.i.d. type of frequency band statistics, two different $P_{vac}$ distributions, denoted as Case 1 and Case 2, are considered. Similarly, the $P_{vo}$ and $P_{ov}$ distributions for the Markovian type of frequency band statistics are given by Case 3.