Smart Grid and Innovative Frontiers in Telecommunications. 8th EAI International Conference, EAI SmartGIFT 2024a, Santa Clara, United States, March 23-24, 2024, Proceedings

Research Article

On IT and OT Cybersecurity Datasets for Machine Learning-Based Intrusion Detection in Industrial Control Systems

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-78806-2_3,
        author={Mohammad Pasha Shabanfar and Yiheng Zhao and Jun Yan and Mohsen Ghafouri},
        title={On IT and OT Cybersecurity Datasets for Machine Learning-Based Intrusion Detection in Industrial Control Systems},
        proceedings={Smart Grid and Innovative Frontiers in Telecommunications. 8th EAI International Conference, EAI SmartGIFT 2024a, Santa Clara, United States, March 23-24, 2024, Proceedings},
        proceedings_a={SMARTGIFT},
        year={2025},
        month={1},
        keywords={Information Technology, Operational Technology, Datasets, Cybersecurity, Intrusion Detection System},
        doi={10.1007/978-3-031-78806-2_3}
    }
    
  • Mohammad Pasha Shabanfar
    Yiheng Zhao
    Jun Yan
    Mohsen Ghafouri
    Year: 2025
    On IT and OT Cybersecurity Datasets for Machine Learning-Based Intrusion Detection in Industrial Control Systems
    SMARTGIFT
    Springer
    DOI: 10.1007/978-3-031-78806-2_3
Mohammad Pasha Shabanfar, Yiheng Zhao, Jun Yan*, Mohsen Ghafouri
    *Contact email: jun.yan@concordia.ca

    Abstract

    Intrusion detection plays a pivotal role in the cybersecurity of industrial control systems (ICS), safeguarding the safety of individuals, communities, and nations. Recently, machine learning-based intrusion detection models have been adopted to improve the detection of cyberattacks. However, a systematic approach to selecting an appropriate dataset for training these models is lacking. An appropriately selected dataset should match the required collection environment, i.e., Information Technology (IT) or Operational Technology (OT), and include the required specifications of the ICS under study, e.g., the deployed protocols. On this basis, this paper classifies existing intrusion detection datasets into IT and OT datasets. The IT datasets are examined in terms of attack/normal traffic inclusion, anonymity, number of packets, duration, and kind of traffic. The OT datasets are studied based on features such as data protocols, distribution, and data domain. We then discuss the gap between the detection method and the selection of an appropriate dataset in terms of (i) performance indicators, i.e., detection time and imbalanced data distribution, and (ii) use case, i.e., a summary of the communication layers, protocols, and attack types contained in the datasets. Finally, the essential features for constructing an effective cybersecurity dataset are discussed to illustrate how an ideal dataset can be established.
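The selection criteria named in the abstract (collection environment, deployed protocols, and class imbalance) can be illustrated with a minimal Python sketch. This is not the authors' method: the `DatasetInfo` record, the `select_candidates` and `imbalance_ratio` helpers, and the catalog entries are all hypothetical names introduced here for illustration, and the imbalance measure (largest class count divided by smallest) is one common convention assumed for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """Catalog entry for a candidate dataset (all fields are illustrative)."""
    name: str
    environment: str      # "IT" or "OT"
    protocols: frozenset  # protocols present in the captured traffic


def select_candidates(catalog, environment, required_protocols):
    """Keep datasets from the given environment that cover every required protocol."""
    required = frozenset(required_protocols)
    return [d for d in catalog
            if d.environment == environment and required <= d.protocols]


def imbalance_ratio(labels):
    """Ratio of the most frequent class count to the least frequent one."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical catalog; the names are placeholders, not real datasets.
catalog = [
    DatasetInfo("it_capture_a", "IT", frozenset({"HTTP", "DNS"})),
    DatasetInfo("ot_capture_b", "OT", frozenset({"Modbus"})),
    DatasetInfo("ot_capture_c", "OT", frozenset({"Modbus", "DNP3"})),
]

# An ICS under study that runs DNP3 in its OT network.
selected = select_candidates(catalog, "OT", {"DNP3"})
print([d.name for d in selected])

# Nine normal records for every attack record.
labels = ["normal"] * 9 + ["attack"]
print(imbalance_ratio(labels))
```

A ratio far above 1 signals that attack records are rare relative to normal traffic, which is the kind of imbalanced distribution the abstract lists among its performance indicators.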

    Keywords
    Information Technology, Operational Technology, Datasets, Cybersecurity, Intrusion Detection System
    Published
    2025-01-09
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-78806-2_3
    Copyright © 2024–2025 ICST