Research Article
Forget the Myth of the Air Gap: Machine Learning for Reliable Intrusion Detection in SCADA Systems
@ARTICLE{10.4108/eai.25-1-2019.159348,
  author={Rocio Lopez Perez and Florian Adamsky and Ridha Soua and Thomas Engel},
  title={Forget the Myth of the Air Gap: Machine Learning for Reliable Intrusion Detection in SCADA Systems},
  journal={EAI Endorsed Transactions on Security and Safety},
  volume={6},
  number={19},
  publisher={EAI},
  journal_a={SESA},
  year={2019},
  month={1},
  keywords={Critical Infrastructures, SCADA, Anomaly detection, Machine Learning, SVM, Random Forest, BLSTM},
  doi={10.4108/eai.25-1-2019.159348}
}
Rocio Lopez Perez
Florian Adamsky
Ridha Soua
Thomas Engel
Year: 2019
SESA
EAI
DOI: 10.4108/eai.25-1-2019.159348
Abstract
Since Critical Infrastructures (CIs) use systems and equipment that are separated by long distances, Supervisory Control And Data Acquisition (SCADA) systems are used to monitor their behaviour and to send commands remotely. For a long time, operators of CIs applied the air gap principle, a security strategy that physically isolates the control network from other communication channels. True isolation, however, is difficult nowadays due to the massive spread of connectivity: the use of open protocols and increased connectivity exposes CIs to new network attacks. To cope with this dilemma, sophisticated security measures are needed to address malicious intrusions, which are steadily increasing in number and variety. However, traditional Intrusion Detection Systems (IDSs) cannot detect attacks that are not already present in their databases. To this end, we assess in this paper Machine Learning (ML) techniques for anomaly detection in SCADA systems using a real data set collected from a gas pipeline system and provided by Mississippi State University (MSU). The contribution of this paper is two-fold: 1) the evaluation of four techniques for missing data estimation and two techniques for data normalization, and 2) the assessment of Support Vector Machine (SVM), Random Forest (RF) and Bidirectional Long Short-Term Memory (BLSTM) for intrusion detection in terms of accuracy, precision, recall and F1 score. Two cases are differentiated: binary and categorical classification. Our experiments reveal that RF and BLSTM detect intrusions effectively, with F1 scores above 99% and above 96%, respectively.
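The abstract names accuracy, precision, recall and F1 score as the evaluation metrics for the binary case. As a rough illustration of how these four quantities relate, the following minimal Python sketch computes them from true and predicted labels. It is not taken from the paper: the function name binary_metrics, the BinaryMetrics container, the zero-denominator convention and the toy labels are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels, made up for illustration: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

For the categorical case, one plausible approach is to call the same function once per attack class with a different value of positive and then average the per-class scores; the abstract does not say which averaging scheme the authors used.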
Copyright © 2019 Rocio Lopez Perez et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.