Research Article
Multi-Armed Bandit Learning in IoT Networks: Learning Helps Even in Non-stationary Settings
@INPROCEEDINGS{10.1007/978-3-319-76207-4_15,
  author={R\'{e}mi Bonnefoi and Lilian Besson and Christophe Moy and Emilie Kaufmann and Jacques Palicot},
  title={Multi-Armed Bandit Learning in IoT Networks: Learning Helps Even in Non-stationary Settings},
  proceedings={Cognitive Radio Oriented Wireless Networks. 12th International Conference, CROWNCOM 2017, Lisbon, Portugal, September 20-21, 2017, Proceedings},
  proceedings_a={CROWNCOM},
  year={2018},
  month={3},
  keywords={Internet of Things; Multi-Armed Bandits; Reinforcement learning; Cognitive Radio; Non-stationary bandits},
  doi={10.1007/978-3-319-76207-4_15}
}
Rémi Bonnefoi
Lilian Besson
Christophe Moy
Emilie Kaufmann
Jacques Palicot
Year: 2018
Multi-Armed Bandit Learning in IoT Networks: Learning Helps Even in Non-stationary Settings
CROWNCOM
Springer
DOI: 10.1007/978-3-319-76207-4_15
Abstract
Setting up the future Internet of Things (IoT) networks will require supporting more and more communicating devices. We prove that intelligent devices in unlicensed bands can use Multi-Armed Bandit (MAB) learning algorithms to improve resource exploitation. We evaluate the performance of two classical MAB learning algorithms, UCB1 and Thompson Sampling, at handling the decentralized decision-making of Spectrum Access applied to IoT networks, as well as the learning performance with a growing number of intelligent end-devices. We show that using learning algorithms does help to fit more devices in such networks, even when all end-devices are intelligent and dynamically change channel. In the studied scenario, stochastic MAB learning provides a gain of up to 16% in terms of successful transmission probabilities, and has near-optimal performance even in non-stationary and non-i.i.d. settings with a majority of intelligent devices.
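To make the two policies named in the abstract concrete, here is a minimal, self-contained Python sketch (not the authors' simulation code): a single IoT device repeatedly picks one of a few channels and observes a binary acknowledgement. The per-channel success probabilities, class names, and the stationary single-device setting are illustrative assumptions; in the paper, the environment is non-stationary and non-i.i.d. because many learning devices interact and collide.

```python
import math
import random

# Sketch of the two bandit policies from the abstract, framed as one
# IoT device choosing among K channels and observing a binary reward
# (1 = ACK received, 0 = collision/failure). All numbers below are
# made-up assumptions for illustration only.

class UCB1:
    """UCB1 index policy: empirical mean plus exploration bonus."""
    def __init__(self, n_channels):
        self.counts = [0] * n_channels   # pulls per channel
        self.sums = [0.0] * n_channels   # cumulative reward per channel
        self.t = 0                       # global time step

    def choose(self):
        self.t += 1
        for k, n in enumerate(self.counts):
            if n == 0:                   # play each channel once first
                return k
        return max(range(len(self.counts)),
                   key=lambda k: self.sums[k] / self.counts[k]
                   + math.sqrt(2 * math.log(self.t) / self.counts[k]))

    def update(self, k, reward):
        self.counts[k] += 1
        self.sums[k] += reward

class ThompsonSampling:
    """Thompson Sampling with Beta(1, 1) priors on Bernoulli rewards."""
    def __init__(self, n_channels):
        self.successes = [1] * n_channels  # Beta alpha parameters
        self.failures = [1] * n_channels   # Beta beta parameters

    def choose(self):
        # Sample a plausible success rate per channel, play the best.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.successes, self.failures)]
        return samples.index(max(samples))

    def update(self, k, reward):
        if reward:
            self.successes[k] += 1
        else:
            self.failures[k] += 1

# Toy stationary environment: hypothetical per-channel ACK probabilities.
CHANNEL_SUCCESS = [0.2, 0.5, 0.8]

def simulate(policy, horizon=10_000):
    """Run one policy for `horizon` steps; return its success rate."""
    total = 0
    for _ in range(horizon):
        k = policy.choose()
        reward = 1 if random.random() < CHANNEL_SUCCESS[k] else 0
        policy.update(k, reward)
        total += reward
    return total / horizon

if __name__ == "__main__":
    random.seed(42)
    print("UCB1 success rate:", simulate(UCB1(3)))
    print("Thompson Sampling success rate:", simulate(ThompsonSampling(3)))
```

Both policies should converge toward the best channel (success probability 0.8 in this toy setup); the paper's contribution is showing that such policies remain nearly optimal even when every device learns simultaneously, so that each device's reward distribution shifts as the others adapt.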