inis 15(3): e5

Research Article

An Investigation of Performance Analysis of Anomaly Detection Techniques for Big Data in SCADA Systems

  • @ARTICLE{10.4108/inis.2.3.e5,
        author={Mohiuddin Ahmed and Adnan Anwar and Abdun Naser Mahmood and Zubair Shah and Michael J. Maher},
        title={An Investigation of Performance Analysis of Anomaly Detection Techniques for Big Data in SCADA Systems},
        journal={EAI Endorsed Transactions on Industrial Networks and Intelligent Systems},
        volume={2},
        number={3},
        publisher={ICST},
        journal_a={INIS},
        year={2015},
        month={5},
        keywords={Anomaly detection, SCADA systems, big data},
        doi={10.4108/inis.2.3.e5}
    }
    
  • Mohiuddin Ahmed
    Adnan Anwar
    Abdun Naser Mahmood
    Zubair Shah
    Michael J. Maher
    Year: 2015
    An Investigation of Performance Analysis of Anomaly Detection Techniques for Big Data in SCADA Systems
    INIS
    ICST
    DOI: 10.4108/inis.2.3.e5
Mohiuddin Ahmed¹, Adnan Anwar¹, Abdun Naser Mahmood¹, Zubair Shah¹, Michael J. Maher¹
  • 1: School of Engineering and Information Technology, UNSW Canberra, ACT 2600, Australia

Abstract

Anomaly detection is an important aspect of data mining, where the main objective is to identify anomalous or unusual data in a given dataset. However, there is no formal categorisation of application-specific anomaly detection techniques for big data, which creates confusion for data miners. In this paper, we categorise anomaly detection techniques into nearest-neighbour, clustering and statistical approaches and analyse their performance in critical infrastructure applications such as SCADA systems. Extensive experiments compare representative algorithms from each category on seven benchmark SCADA datasets (both real and simulated). The effectiveness of the representative algorithms is measured using a number of metrics. We highlight the best-performing algorithms for SCADA systems.
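
To make the nearest-neighbour category concrete, the following is a minimal sketch of one common scoring rule: each point is scored by its mean distance to its k nearest neighbours, and the highest score marks the most anomalous point. It is not taken from the paper. The function name `knn_anomaly_scores`, the choice of k and the toy readings are all assumptions made for illustration.

```python
import math


def knn_anomaly_scores(points, k=2):
    """Score each point by its mean Euclidean distance to its k nearest neighbours.

    Illustrative only: a generic nearest-neighbour score, not the specific
    algorithms evaluated in the paper.
    """
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical two-feature readings (not from the paper's datasets):
# four readings close together and one far from the rest.
readings = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (5.0, 5.0)]
scores = knn_anomaly_scores(readings, k=2)

# Index of the reading with the largest score, i.e. the most isolated one.
most_anomalous = max(range(len(readings)), key=scores.__getitem__)
print(most_anomalous, [round(s, 3) for s in scores])
```

In practice a threshold on the score, or a top-n cut, would decide which points are reported as anomalies. That choice depends on the dataset and is not specified here.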