
Research Article
Robot Navigation in Crowds Environment Base Deep Reinforcement Learning with POMDP
@INPROCEEDINGS{10.1007/978-3-031-18123-8_53,
  author={Qinghua Li and Haiming Li and Jiahui Wang and Chao Feng},
  title={Robot Navigation in Crowds Environment Base Deep Reinforcement Learning with POMDP},
  proceedings={Multimedia Technology and Enhanced Learning. 4th EAI International Conference, ICMTEL 2022, Virtual Event, April 15-16, 2022, Proceedings},
  proceedings_a={ICMTEL},
  year={2022},
  month={10},
  keywords={Deep reinforcement learning, Robot navigation, Partially observable Markov decision process},
  doi={10.1007/978-3-031-18123-8_53}
}
- Qinghua Li
- Haiming Li
- Jiahui Wang
- Chao Feng
Year: 2022
ICMTEL
Springer
DOI: 10.1007/978-3-031-18123-8_53
Abstract
With advances in deep learning, mobile robot navigation based on deep reinforcement learning is developing rapidly. However, navigation policies based on deep reinforcement learning still need improvement in crowded environments. The motion intentions of pedestrians in a crowd are variable, and a pedestrian's current motion intention cannot be judged from a single frame of sensor data alone. Therefore, when only one frame is given as input, the pedestrian motion state is only partially observable. To deal with this problem, we present the P-RL algorithm in this paper. The algorithm replaces the Markov Decision Process (MDP) model of traditional deep reinforcement learning with a Partially Observable Markov Decision Process (POMDP) model and introduces an LSTM neural network into the deep reinforcement learning algorithm. Because the LSTM can process time-series information, the algorithm can perceive the relationships between successive frames of observation data, which enhances its robustness. Experimental results show that our algorithm outperforms other algorithms in time overhead and navigation success rate in crowded environments.
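The abstract's central point, that one frame does not reveal motion while a recurrent network over several frames can, may be illustrated with a minimal sketch. This is not the P-RL implementation: `TinyLSTMCell`, `encode_history`, the two-dimensional position frames, and the random, untrained weights are all assumptions made for illustration. The sketch only shows that an LSTM hidden state depends on earlier frames, so two pedestrians observed at the same final position but arriving from opposite directions receive different encodings.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Hypothetical single LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias vector per gate:
        # input (i), forget (f), output (o), candidate (g).
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}

    def _gate(self, name, z, activation):
        # Affine map of the concatenated [input, hidden] vector, then activation.
        return [
            activation(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(self.W[name], self.b[name])
        ]

    def step(self, x, h, c):
        """Advance one frame: returns the new hidden state and cell state."""
        z = list(x) + list(h)
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def encode_history(cell, frames):
    """Fold a sequence of observation frames into one hidden-state vector."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for frame in frames:
        h, c = cell.step(frame, h, c)
    return h


cell = TinyLSTMCell(input_size=2, hidden_size=4)

# Two assumed pedestrian tracks (x, y) that end at the same position.
moving_left = [(3.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
moving_right = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]

h_left = encode_history(cell, moving_left)
h_right = encode_history(cell, moving_right)

# Encoding only the last frame cannot tell the two tracks apart.
single_left = encode_history(cell, [moving_left[-1]])
single_right = encode_history(cell, [moving_right[-1]])

print("single-frame encodings identical:", single_left == single_right)
print("history encodings identical:", h_left == h_right)
```

In a recurrent reinforcement learning agent, a vector such as `h_left` would typically be passed to the policy or value head in place of the raw single-frame observation. How P-RL wires its LSTM into the network is not specified in the abstract.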