Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part II

Research Article

Defeating the Non-stationary Opponent Using Deep Reinforcement Learning and Opponent Modeling

Cite
@INPROCEEDINGS{10.1007/978-3-031-54528-3_4,
    author={Qian Yao and Xinli Xiong and Peng Wang and Yongjie Wang},
    title={Defeating the Non-stationary Opponent Using Deep Reinforcement Learning and Opponent Modeling},
    proceedings={Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part II},
    proceedings_a={COLLABORATECOM PART 2},
    year={2024},
    month={2},
    keywords={Deep reinforcement learning; Opponent modeling; FlipIt game; Non-stationary environment},
    doi={10.1007/978-3-031-54528-3_4}
}
Qian Yao1, Xinli Xiong1, Peng Wang1, Yongjie Wang1,*
  • 1: College of Electronic Engineering, National University of Defense Technology
*Contact email: wangyongjie17@nudt.edu.cn

Abstract

In the cyber attack-and-defense process, an opponent's strategy is often dynamic, random, and uncertain. In an advanced persistent threat (APT) scenario in particular, it is difficult to capture the behavior strategy of a long-term latent, highly dynamic, and unpredictable opponent. The FlipIt game can model the stealthy interactions of an advanced persistent threat, but traditional reinforcement learning approaches are insufficient for solving a real-time, non-stationary game model. It is therefore essential to model a non-stationary opponent implicitly and to maintain the defense agent's advantage continuously. In this paper, we propose an extended FlipIt game model that incorporates opponent modeling, together with an approach that combines deep reinforcement learning, opponent modeling, and dropout to perceive the behavior of a non-stationary opponent and defeat it. Instead of explicitly identifying the opponent's intention, the defense agent observes the opponent's last-move actions from the game environment, stores this information in its knowledge, perceives the opponent's strategy, and finally makes a decision that maximizes its benefits. Our approach performs well whether the opponent adopts traditional, random, or composite strategies. The experimental results demonstrate that our approach can perceive the opponent quickly and maintain its superiority in suppressing the opponent.
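
To make the mechanism concrete, here is a minimal sketch (not the authors' code) of a dropout-regularized Q-learning defender in a toy FlipIt-style game: the defender's observation includes the opponent's last observed move, so the opponent model is implicit in the learned value function. The environment dynamics, class names, and hyperparameters below are illustrative assumptions only.

```python
# Illustrative sketch (PyTorch) of the idea in the abstract: a defender that
# observes the opponent's last-move actions in a simplified FlipIt-style game
# and learns with a dropout-regularized Q-network. FlipItEnv, QNet, and all
# hyperparameters are assumptions, not the paper's implementation.
import random
import torch
import torch.nn as nn
import torch.optim as optim

class FlipItEnv:
    """Toy FlipIt: two players 'flip' a shared resource; a flip costs
    FLIP_COST, and each step the current owner earns 1 unit of reward."""
    FLIP_COST = 0.4

    def __init__(self, horizon=200):
        self.horizon = horizon

    def reset(self):
        self.t = 0
        self.owner = 0            # 0 = defender owns, 1 = attacker owns
        self.last_opp_action = 0
        return self._obs()

    def _obs(self):
        # The defender sees the time phase plus the opponent's last observed
        # move: this is the 'knowledge' used for implicit opponent modeling.
        return torch.tensor([self.t / self.horizon,
                             float(self.owner),
                             float(self.last_opp_action)])

    def step(self, defender_flips):
        # Non-stationary attacker: periodic early on, then randomized.
        if self.t < self.horizon // 2:
            attacker_flips = int(self.t % 7 == 0)
        else:
            attacker_flips = int(random.random() < 0.2)
        if attacker_flips:
            self.owner = 1
        if defender_flips:        # defender moves last in this toy ordering
            self.owner = 0
        reward = (1.0 if self.owner == 0 else 0.0) \
                 - (self.FLIP_COST if defender_flips else 0.0)
        self.last_opp_action = attacker_flips
        self.t += 1
        return self._obs(), reward, self.t >= self.horizon

class QNet(nn.Module):
    """Q-network with dropout, kept active to help track a drifting opponent."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, 32), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(32, 2))     # Q-values for {wait, flip}

    def forward(self, x):
        return self.net(x)

env, qnet = FlipItEnv(), QNet()
opt, gamma, eps = optim.Adam(qnet.parameters(), lr=1e-3), 0.95, 0.1

for episode in range(50):
    obs, done, total = env.reset(), False, 0.0
    while not done:
        with torch.no_grad():
            a = random.randrange(2) if random.random() < eps \
                else int(qnet(obs).argmax())
        next_obs, r, done = env.step(a)
        # One-step TD update (plain online Q-learning, no replay buffer).
        with torch.no_grad():
            target = r + (0.0 if done else gamma * qnet(next_obs).max())
        loss = (qnet(obs)[a] - target) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
        obs, total = next_obs, total + r
    if episode % 10 == 0:
        print(f"episode {episode:3d}  return {total:6.1f}")
```

One design note on this sketch: the network stays in training mode even when selecting actions, so dropout keeps injecting stochasticity at decision time. Under the paper's motivation, this kind of regularization can help the defender avoid overfitting to an opponent strategy that later drifts, though the exact role of dropout in the authors' method may differ.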

Keywords
Deep reinforcement learning, Opponent modeling, FlipIt game, Non-stationary environment
Published
2024-02-23
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-54528-3_4