About | Contact Us | Register | Login
ProceedingsSeriesJournalsSearchEAI
Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey

Research Article

Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems

Download192 downloads
Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.4108/eai.21-11-2024.2354636,
        author={Dai  Zhang and Xue  Zhang},
        title={Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems},
        proceedings={Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey},
        publisher={EAI},
        proceedings_a={CONF-MLA},
        year={2025},
        month={3},
        keywords={combinatorial optimization problem multiple-sequence cop deep reinforcement learning},
        doi={10.4108/eai.21-11-2024.2354636}
    }
    
  • Dai Zhang
    Xue Zhang
    Year: 2025
    Deep Reinforcement Learning for Multi-Sequence Combinatorial Optimization Problems
    CONF-MLA
    EAI
    DOI: 10.4108/eai.21-11-2024.2354636
Dai Zhang1,*, Xue Zhang1
  • 1: Hitachi (China) Ltd., Corp
*Contact email: zhangd@hitachi.cn

Abstract

We propose a formulation for solving sequential combinatorial optimization problems (COP) by deep reinforcement learning (DRL) method. Some types of decision problems can be reduced to sequence combination optimization problems(SCOP) , such as knapsack problems, flow-shop problems, etc. Besides some single sequence combinatorial optimization problems, multi-sequence combinatorial optimization problems have more complex constraints and are difficult to solve quickly in finite time. In this paper we extend a DRL model based on multiple pointer networks and policy gradient approach to solve multiple-sequence COP. We take flow-shop problem as example to verify the efficiency of improved model. As the result shows, the DRL method can calculate better results within 1 minute than some heuristic algorithms that should optimize with 0.5 hour or 1 hour time consumption. Due to its randomness heuristic optimization method can also produce better solutions, but the difference between DRL and the best heuristic results is within 5%. Therefore, compared to heuristic optimization methods, DRL has competitive optimization capabilities and significantly higher computational efficiency.

Keywords
combinatorial optimization problem multiple-sequence cop deep reinforcement learning
Published
2025-03-11
Publisher
EAI
http://dx.doi.org/10.4108/eai.21-11-2024.2354636
Copyright © 2024–2025 EAI
EBSCOProQuestDBLPDOAJPortico
EAI Logo

About EAI

  • Who We Are
  • Leadership
  • Research Areas
  • Partners
  • Media Center

Community

  • Membership
  • Conference
  • Recognition
  • Sponsor Us

Publish with EAI

  • Publishing
  • Journals
  • Proceedings
  • Books
  • EUDL