
Research Article
Monte Carlo Reinforcement Learning for Cooperative Spectrum Sensing in Decision Fusion
@INPROCEEDINGS{10.1007/978-3-031-65123-6_16,
  author={Qingying Wu and Benjamin K. Ng and Han Zhu and Chan-Tong Lam},
  title={Monte Carlo Reinforcement Learning for Cooperative Spectrum Sensing in Decision Fusion},
  proceedings={Quality, Reliability, Security and Robustness in Heterogeneous Systems. 19th EAI International Conference, QShine 2023, Shenzhen, China, October 8 -- 9, 2023, Proceedings, Part II},
  proceedings_a={QSHINE PART 2},
  year={2024},
  month={8},
  keywords={Internet of Things cognitive radio sensor networks cooperative spectrum sensing Monte Carlo Control},
  doi={10.1007/978-3-031-65123-6_16}
}
Qingying Wu
Benjamin K. Ng
Han Zhu
Chan-Tong Lam
Year: 2024
QSHINE PART 2
Springer
DOI: 10.1007/978-3-031-65123-6_16
Abstract
As a key enabler, the Wireless Sensor Network (WSN) plays an important role across a wide range of Internet of Things (IoT) application scenarios. However, the rapid spread of wireless applications has made the radio spectrum extremely crowded. The Cognitive Radio Sensor Network (CRSN) has emerged as a promising solution to spectrum scarcity, taking into account the heterogeneous properties of both the Primary User (PU) and the Secondary User (SU). In a multi-stage Cooperative Spectrum Sensing (CSS) system with a Fusion Center (FC), hard fusion rules are widely used to fuse local decisions because of their simplicity. Under such rules, sensing performance depends closely on the underlying system parameters but is hard to adjust once the fusion policy is fixed. This paper investigates the application of Monte Carlo Reinforcement Learning (MCRL) algorithms to CSS. Specifically, after the traditional FC is replaced with a soft-created Agent, the policy for fusing local decisions can be improved intelligently using Monte Carlo Control while positively guiding the optimization of system performance. Experiments demonstrate that the proposed scheme can help achieve an ideal policy that yields better system performance in terms of the global probabilities of detection and false alarm under various Signal-to-Noise Ratios (SNRs).
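To illustrate why performance under a fixed hard fusion rule is tied to the system parameters, the following minimal sketch computes the global detection and false-alarm probabilities of the textbook k-out-of-N rule. It is not taken from the paper: it assumes N independent SUs with identical local probabilities, and the function names and numeric values are hypothetical choices made for illustration only.

```python
from math import comb


def k_out_of_n(p_local: float, n: int, k: int) -> float:
    """Probability that at least k of n independent SUs report 'occupied'.

    Assumption (not from the paper): every SU has the same local
    probability p_local and reports independently of the others.
    """
    return sum(
        comb(n, i) * p_local**i * (1.0 - p_local) ** (n - i)
        for i in range(k, n + 1)
    )


def global_probabilities(pd: float, pf: float, n: int, k: int) -> tuple[float, float]:
    """Global detection and false-alarm probabilities under a k-out-of-n rule."""
    return k_out_of_n(pd, n, k), k_out_of_n(pf, n, k)


if __name__ == "__main__":
    n = 5                      # hypothetical number of SUs
    pd_local, pf_local = 0.8, 0.1  # hypothetical local probabilities
    # k = 1 is the OR rule, k = n the AND rule, k = 3 a majority rule for n = 5.
    for name, k in (("OR", 1), ("MAJORITY", 3), ("AND", n)):
        qd, qf = global_probabilities(pd_local, pf_local, n, k)
        print(f"{name:8s} k={k}: Q_d={qd:.4f}  Q_f={qf:.4f}")
```

With the rule fixed, the global probabilities follow entirely from N, k, and the local probabilities, so the only way to trade detection against false alarm is to change the rule itself. That is the kind of rigidity a learned fusion policy is meant to relax.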