Maximum Entropy Deep Reinforcement Learning Based Power Allocation for NOMA Maritime Network

Jiayi He; Yakai Zhang; Zhiyong Liu

Wireless and Satellite Systems. 14th EAI International Conference, WiSATS 2024, Harbin, China, August 23–25, 2024, Proceedings, Part II

Research Article



@INPROCEEDINGS{10.1007/978-3-031-86203-8_18,
    author={Jiayi He and Yakai Zhang and Zhiyong Liu},
    title={Maximum Entropy Deep Reinforcement Learning Based Power Allocation for NOMA Maritime Network},
    proceedings={Wireless and Satellite Systems. 14th EAI International Conference, WiSATS 2024, Harbin, China, August 23--25, 2024, Proceedings, Part II},
    proceedings_a={WISATS PART 2},
    year={2025},
    month={3},
    keywords={maritime network, deep reinforcement learning, power allocation, non-orthogonal multiple access (NOMA)},
    doi={10.1007/978-3-031-86203-8_18}
}

Jiayi He, Yakai Zhang, Zhiyong Liu. Maximum Entropy Deep Reinforcement Learning Based Power Allocation for NOMA Maritime Network. WISATS PART 2. Springer, 2025. DOI: 10.1007/978-3-031-86203-8_18

Jiayi He¹, Yakai Zhang¹, Zhiyong Liu¹,*

1: School of Information Science and Engineering of HIT

*Contact email: lzyhit@hit.edu.cn

Abstract

To address the high propagation delays and limited service capabilities of maritime satellite communications, unmanned aerial vehicles have been proposed as an airborne backhaul solution to enhance communications between satellites and maritime base stations. The non-orthogonal multiple access (NOMA) framework can address the user sparsity problem in maritime networks. In this paper, a deep reinforcement learning algorithm is used to solve the nonconvex power allocation problem under NOMA. To mitigate the Q-value overestimation and the convergence to local optima of the Deep Q-Network (DQN) algorithm, we propose an algorithm based on the idea of maximum entropy, called Soft Agent Critical Ocean Satellite Communication Power Allocation (SAC-OSCPA), and compare it with the traditional DQN algorithm. The main goal of this research is to maximize network throughput in scenarios with randomly distributed users. Simulation results show that the SAC-OSCPA algorithm improves the average system throughput by 13.18% and the average throughput of the worst-performing user by 41.59%. These results demonstrate the efficacy of the proposed algorithm in optimizing the communication performance of maritime satellite networks.
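
To make the two reported metrics concrete, the following is a minimal sketch of how system throughput and worst-user throughput can be computed for a given power allocation. It assumes the textbook single-carrier downlink NOMA model with ideal successive interference cancellation (SIC); the abstract does not state the paper's actual system model, so the function name `noma_downlink_rates`, its parameters, and the numeric inputs are all hypothetical.

```python
import math


def noma_downlink_rates(gains, powers, noise_power=1.0, bandwidth_hz=1.0):
    """Per-user rates for downlink NOMA with ideal SIC (assumed textbook model).

    gains  -- channel power gains |h_k|^2, one per user (hypothetical inputs)
    powers -- transmit power allocated to each user's signal
    """
    if len(gains) != len(powers):
        raise ValueError("gains and powers must have the same length")

    rates = []
    for k, (g_k, p_k) in enumerate(zip(gains, powers)):
        # With ideal SIC, user k removes the signals intended for weaker
        # users and sees only the signals intended for stronger users as
        # interference. Ties in gain are broken by user index.
        interference = g_k * sum(
            p_j
            for j, (g_j, p_j) in enumerate(zip(gains, powers))
            if (g_j, j) > (g_k, k)
        )
        sinr = p_k * g_k / (interference + noise_power)
        rates.append(bandwidth_hz * math.log2(1.0 + sinr))
    return rates


# Two users: a weak one (gain 1.0) given more power, a strong one (gain 4.0).
rates = noma_downlink_rates(gains=[1.0, 4.0], powers=[3.0, 1.0])
system_throughput = sum(rates)      # quantity the allocation aims to maximize
worst_user_throughput = min(rates)  # the fairness-oriented metric
```

Because each user's rate depends on the powers assigned to the other users through the interference term, the sum rate is in general not a concave function of the power vector, which is one reason learning-based allocation is considered for this kind of problem.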

Keywords: maritime network, deep reinforcement learning, power allocation, non-orthogonal multiple access (NOMA)

Published: 2025-03-27
Appears in: SpringerLink

Link: http://dx.doi.org/10.1007/978-3-031-86203-8_18
