ws 17(11): e4

Research Article

Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty

  • @ARTICLE{10.4108/eai.9-1-2017.152098,
        author={Navikkumar Modi and Philippe Mary and Christophe Moy},
        title={Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty},
        journal={EAI Endorsed Transactions on Wireless Spectrum},
        volume={3},
        number={11},
        publisher={EAI},
        journal_a={WS},
        year={2017},
        month={1},
        keywords={Cognitive Radio, Machine Learning, Opportunistic Spectrum Access, Upper Confidence Bound},
        doi={10.4108/eai.9-1-2017.152098}
    }
    
  • Navikkumar Modi
    Philippe Mary
    Christophe Moy
    Year: 2017
    Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty
    WS
    EAI
    DOI: 10.4108/eai.9-1-2017.152098
Navikkumar Modi1,*, Philippe Mary2, Christophe Moy1
  • 1: CentraleSupelec, IETR, UMR CNRS 6164, Cesson Sevigne, France 35576
  • 2: INSA de Rennes, IETR, UMR CNRS 6164, F-35043 Rennes, France
*Contact email: navikkumar.modi@centralesupelec.fr

Abstract

In this work, the opportunistic spectrum access (OSA) problem is addressed with stationary and non-stationary Markov multi-armed bandit (MAB) frameworks. For stationary environments, we propose a novel index-based algorithm named QoS-UCB that balances exploration in terms of occupancy and quality, e.g. signal-to-noise ratio (SNR) for transmission. Furthermore, we propose another learning policy, named discounted QoS-UCB (DQoS-UCB), for the non-stationary case. Our contribution in terms of numerical analysis is twofold: i) in the stationary OSA scenario, we numerically compare our QoS-UCB policy with the existing UCB1 policy and show that QoS-UCB outperforms UCB1 in terms of regret, and ii) in the non-stationary OSA scenario, numerical results show that the proposed DQoS-UCB policy outperforms other simple UCB policies as well as the QoS-UCB policy. To the best of our knowledge, this is the first learning algorithm that adapts to a non-stationary Markov MAB framework and also quantifies channel quality information.
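
The abstract does not give the QoS-UCB or DQoS-UCB index formulas, so the sketch below is not the authors' algorithm. It only illustrates the two generic ideas those policies build on: an index-based rule in the style of UCB1 (empirical mean plus an exploration bonus) and a discount factor that down-weights old observations for non-stationary channels. The class name `DiscountedUCB`, the parameters `gamma` and `xi`, and the assumption that each observation is a scalar reward in [0, 1] (for example 0 when the channel is busy) are all hypothetical choices made for illustration.

```python
import math


class DiscountedUCB:
    """Generic discounted UCB-style channel selector (illustrative only).

    With gamma = 1.0 no discounting is applied and the index has the
    familiar UCB1 form: mean + sqrt(xi * ln(t) / n_k).
    """

    def __init__(self, n_channels, gamma=1.0, xi=2.0):
        self.gamma = gamma                  # discount factor in (0, 1]
        self.xi = xi                        # exploration weight (assumed)
        self.counts = [0.0] * n_channels    # discounted number of plays
        self.sums = [0.0] * n_channels      # discounted sum of rewards

    def select(self):
        # Play every channel once before trusting any index.
        for k, c in enumerate(self.counts):
            if c == 0.0:
                return k
        # Discounted total number of plays; clamp so the log is non-negative.
        total = max(sum(self.counts), 1.0)

        def index(k):
            mean = self.sums[k] / self.counts[k]
            bonus = math.sqrt(self.xi * math.log(total) / self.counts[k])
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, channel, reward):
        # Age all past observations, then record the new one.
        self.counts = [self.gamma * c for c in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[channel] += 1.0
        self.sums[channel] += reward
```

In a simulation loop one would call `select()`, sense the chosen channel, and pass the observed reward to `update()`. How occupancy and quality (e.g. SNR) are combined into the index is specific to the paper and is deliberately left out here.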