
Research Article
Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems
@INPROCEEDINGS{10.4108/eai.21-11-2024.2354636, author={Dai Zhang and Xue Zhang}, title={Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems}, proceedings={Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey}, publisher={EAI}, proceedings_a={CONF-MLA}, year={2025}, month={3}, keywords={combinatorial optimization problem multiple-sequence cop deep reinforcement learning}, doi={10.4108/eai.21-11-2024.2354636} }
- Dai Zhang
Xue Zhang
Year: 2025
Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems
CONF-MLA
EAI
DOI: 10.4108/eai.21-11-2024.2354636
Abstract
We propose a formulation for solving sequential combinatorial optimization problems (COP) by deep reinforcement learning (DRL) method. Some types of decision problems can be reduced to sequence combination optimization problems(SCOP) , such as knapsack problems, flow-shop problems, etc. Besides some single sequence combinatorial optimization problems, multi-sequence combinatorial optimization problems have more complex constraints and are difficult to solve quickly in finite time. In this paper we extend a DRL model based on multiple pointer networks and policy gradient approach to solve multiple-sequence COP. We take flow-shop problem as example to verify the efficiency of improved model. As the result shows, the DRL method can calculate better results within 1 minute than some heuristic algorithms that should optimize with 0.5 hour or 1 hour time consumption. Due to its randomness heuristic optimization method can also produce better solutions, but the difference between DRL and the best heuristic results is within 5%. Therefore, compared to heuristic optimization methods, DRL has competitive optimization capabilities and significantly higher computational efficiency.