EAI Endorsed Transactions on Wireless Spectrum 17(11): e4

Research Article

Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty

  • @ARTICLE{10.4108/eai.9-1-2017.152098,
        author={Navikkumar Modi and Philippe Mary and Christophe Moy},
        title={Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty},
        journal={EAI Endorsed Transactions on Wireless Spectrum},
        volume={17},
        number={11},
        publisher={EAI},
        journal_a={WS},
        year={2017},
        month={1},
        keywords={Cognitive Radio, Machine Learning, Opportunistic Spectrum Access, Upper Confidence Bound},
        doi={10.4108/eai.9-1-2017.152098}
    }
    
Navikkumar Modi1,*, Philippe Mary2, Christophe Moy1
  • 1: CentraleSupélec, IETR, UMR CNRS 6164, 35576 Cesson-Sévigné, France
  • 2: INSA de Rennes, IETR, UMR CNRS 6164, F-35043 Rennes, France
*Contact email: navikkumar.modi@centralesupelec.fr

Abstract

In this work, the opportunistic spectrum access (OSA) problem is addressed with stationary and non-stationary Markov multi-armed bandit (MAB) frameworks. We propose a novel index-based algorithm, named QoS-UCB, that balances exploration in terms of both occupancy and quality, e.g. signal-to-noise ratio (SNR) for transmission, in stationary environments. Furthermore, we propose another learning policy, named discounted QoS-UCB (DQoS-UCB), for the non-stationary case. Our contribution in terms of numerical analysis is twofold: i) in the stationary OSA scenario, we numerically compare our QoS-UCB policy with the existing UCB1 policy and show that QoS-UCB outperforms UCB1 in terms of regret, and ii) in the non-stationary OSA scenario, numerical results show that the proposed DQoS-UCB policy outperforms both simple UCB policies and the QoS-UCB policy. To the best of our knowledge, this is the first learning algorithm that adapts to the non-stationary Markov MAB framework while also quantifying channel quality information.