
Research Article
Maximum Entropy Deep Reinforcement Learning Based Power Allocation for NOMA Maritime Network
@INPROCEEDINGS{10.1007/978-3-031-86203-8_18,
  author={Jiayi He and Yakai Zhang and Zhiyong Liu},
  title={Maximum Entropy Deep Reinforcement Learning Based Power Allocation for NOMA Maritime Network},
  proceedings={Wireless and Satellite Systems. 14th EAI International Conference, WiSATS 2024, Harbin, China, August 23--25, 2024, Proceedings, Part II},
  proceedings_a={WISATS PART 2},
  year={2025},
  month={3},
  keywords={maritime network, deep reinforcement learning, power allocation, non-orthogonal multiple access (NOMA)},
  doi={10.1007/978-3-031-86203-8_18}
}
- Jiayi He
- Yakai Zhang
- Zhiyong Liu
Year: 2025
WISATS PART 2
Springer
DOI: 10.1007/978-3-031-86203-8_18
Abstract
To address the high propagation delays and limited service capabilities of maritime satellite communications, unmanned aerial vehicles have been proposed as an airborne backhaul solution to enhance communication between satellites and maritime base stations. The non-orthogonal multiple access (NOMA) framework can address the user sparsity problem in maritime networks. In this paper, a deep reinforcement learning algorithm is used to solve the nonconvex power allocation problem under NOMA. To mitigate the risk of Q-value overestimation and convergence to local optima in the Deep Q-Network (DQN) algorithm, we propose a maximum-entropy algorithm called Soft Agent Critical Ocean Satellite Communication Power Allocation (SAC-OSCPA) and compare it with the traditional DQN algorithm. The main goal of this research is to maximize network throughput in scenarios with randomly distributed users. Simulation results show that the SAC-OSCPA algorithm improves the average system throughput by 13.18% and the average throughput of the worst-performing user by 41.59%. These results demonstrate the efficacy of the proposed algorithm in optimizing the communication performance of maritime satellite networks.
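To make the throughput objective concrete, the following is a minimal sketch of how per-user rates could be evaluated for a given power allocation in downlink power-domain NOMA. It is not the paper's system model: the function name `noma_rates`, the normalized noise power, and the assumption of perfect successive interference cancellation (each user removes the signals intended for users with weaker channels) are illustrative assumptions only.

```python
import math


def noma_rates(gains, powers, noise_power=1.0):
    """Per-user spectral efficiency (bit/s/Hz) for downlink power-domain NOMA.

    Assumptions (hypothetical, not from the paper): one shared resource block,
    perfect SIC, `gains` are channel power gains, `powers` are the transmit
    powers assigned to each user.
    """
    if len(gains) != len(powers):
        raise ValueError("gains and powers must have the same length")

    rates = []
    for g_k, p_k in zip(gains, powers):
        # After SIC, user k still sees the signals meant for users whose
        # channel is strictly better than its own as interference.
        interference = sum(p_j for g_j, p_j in zip(gains, powers) if g_j > g_k)
        sinr = p_k * g_k / (g_k * interference + noise_power)
        rates.append(math.log2(1.0 + sinr))
    return rates


# Example: a weak user (gain 1.0) gets more power than a strong user (gain 4.0).
example_rates = noma_rates([1.0, 4.0], [0.8, 0.2])
sum_rate = sum(example_rates)      # quantity a throughput-maximizing agent would target
worst_rate = min(example_rates)    # quantity behind the worst-user metric
```

In a reinforcement learning setting, a reward could plausibly be built from `sum_rate`, with the power vector as the agent's action; how the paper actually defines states, actions, and rewards is not stated in the abstract.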