
Research Article
A Beam Tracking Scheme Based on Deep Reinforcement Learning for Multiple Vehicles
@INPROCEEDINGS{10.1007/978-3-030-99200-2_23,
  author={Binyao Cheng and Long Zhao and Zibo He and Ping Zhang},
  title={A Beam Tracking Scheme Based on Deep Reinforcement Learning for Multiple Vehicles},
  proceedings={Communications and Networking. 16th EAI International Conference, ChinaCom 2021, Virtual Event, November 21-22, 2021, Proceedings},
  proceedings_a={CHINACOM},
  year={2022},
  month={4},
  keywords={Internet of Vehicles, Beam tracking, Deep reinforcement learning},
  doi={10.1007/978-3-030-99200-2_23}
}
Binyao Cheng
Long Zhao
Zibo He
Ping Zhang
Year: 2022
CHINACOM
Springer
DOI: 10.1007/978-3-030-99200-2_23
Abstract
In the Internet of Vehicles (IoV), beam tracking for multiple vehicles is challenging because of nonlinear vehicle mobility and inter-vehicle interference (IVI). This paper considers a scenario in which multiple high-mobility vehicles are periodically served by beams radiated from a millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) system. The main objective is to maximize the probability of successful information transmission in each beam tracking period, where a transmission is successful if the signal-to-interference-plus-noise ratio (SINR) exceeds a threshold. Based on deep reinforcement learning, we propose a position prediction and joint selection (PPJS) scheme for multi-vehicle beam selection that accounts for both coverage and IVI. On the one hand, a long short-term memory (LSTM) network predicts the future trajectory in the upcoming beam tracking period to provide better beam coverage. On the other hand, a multilayer perceptron (MLP) network selects the serving beams while taking IVI into account; the vehicles are divided into clusters, and the beam tracking objective is decomposed per cluster to reduce the complexity of the scheme. Simulation results show that the proposed PPJS scheme outperforms both the traditional position-based algorithm and the deep Q-network (DQN) algorithm.
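The success criterion in the abstract (SINR exceeding a threshold, with other vehicles' beams acting as interference) can be illustrated with a minimal sketch. This is not the paper's system model: the gain matrix `gain`, the function names, and all numeric values below are hypothetical, chosen only to show how a per-vehicle SINR and the resulting fraction of successful transmissions might be computed for one beam tracking period.

```python
from typing import List


def sinr_per_vehicle(gain: List[List[float]], noise_power: float) -> List[float]:
    """Compute the SINR of each vehicle (linear scale).

    Assumption: gain[k][j] is the power received by vehicle k from the
    beam that serves vehicle j. The diagonal holds the desired signal;
    off-diagonal entries are inter-vehicle interference (IVI).
    """
    sinrs = []
    for k, row in enumerate(gain):
        signal = row[k]
        interference = sum(p for j, p in enumerate(row) if j != k)
        sinrs.append(signal / (interference + noise_power))
    return sinrs


def success_fraction(gain: List[List[float]], noise_power: float,
                     threshold: float) -> float:
    """Fraction of vehicles whose SINR meets or exceeds the threshold."""
    sinrs = sinr_per_vehicle(gain, noise_power)
    return sum(s >= threshold for s in sinrs) / len(sinrs)


# Hypothetical two-vehicle example (values are illustrative only).
example_gain = [
    [8.0, 1.0],  # vehicle 0: strong own beam, weak interference
    [3.0, 4.0],  # vehicle 1: own beam barely above the interference
]
example_sinrs = sinr_per_vehicle(example_gain, noise_power=1.0)
example_success = success_fraction(example_gain, noise_power=1.0, threshold=2.0)
```

In this toy case only vehicle 0 clears the threshold, which shows why a beam selection that ignores IVI can give good coverage and still fail the success criterion.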