sis 20(24): e6

Research Article

Machine Learning Based Hybrid Model for Fault Detection in Wireless Sensors Data

  • @ARTICLE{10.4108/eai.13-7-2018.161368,
        author={P. Raghu Vamsi and Anjali Chahuan},
        title={Machine Learning Based Hybrid Model for Fault Detection in Wireless Sensors Data},
        journal={EAI Endorsed Transactions on Scalable Information Systems},
        keywords={Anomaly Detection, Outliers, Fault Detection, Wireless Sensor Networks, Internet Of Things (IOT), Intel Berkeley Research lab (IBRL), Knowledge Discovery, Time-series data, Pattern Recognition, Histogram Based Outlier Score (HBOS), Minimum Covariant Determinant (MCD), Isolation Forests (IF)},
        year={2019},
        doi={10.4108/eai.13-7-2018.161368}
    }
  • P. Raghu Vamsi
    Anjali Chahuan
    Year: 2019
    Machine Learning Based Hybrid Model for Fault Detection in Wireless Sensors Data
    DOI: 10.4108/eai.13-7-2018.161368
P. Raghu Vamsi¹, Anjali Chahuan²,*
  • 1: Assistant Professor, Department of CSE, Jaypee Institute of Information Technology, Noida, India
  • 2: Assistant Professor, Department of CSE, Inderprastha Engineering College, Ghaziabad, India
*Contact email:


A Wireless Sensor Network (WSN) is a group of spatially distributed, dedicated sensors that monitor and record the physical conditions of the environment and transmit the collected data to a central location. The major challenge is to extract high-level knowledge from such data. Detecting abnormalities in the data can help identify faulty sensors as well as the sensors that collect the most interesting readings in the dataset. This paper proposes a machine learning based hybrid model for knowledge discovery that works best with multivariate time-series data. The study uses the Intel Berkeley Research Lab (IBRL) dataset, one of the most widely used datasets collected by a WSN. Spatial-temporal correlation was taken as a reference to find anomalies in the dataset using three models: 1) Histogram Based Outlier Score (HBOS), 2) Minimum Covariance Determinant (MCD), and 3) Isolation Forests (IF). Further, the electrical configuration of the WSN components was used to identify faults among the outliers found in the dataset. The results show that the proposed hybrid model performed best with Isolation Forest, reaching a precision of 94.86%. The experiment was also able to identify the least trustworthy, or faulty, sensors among those deployed in the IBRL dataset.
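
To give a feel for the first of the three outlier models named above, the following is a minimal sketch of a histogram-based outlier score in plain Python. It is not the authors' implementation: the function name `hbos_scores`, the bin count, the two features (temperature and humidity), and the synthetic readings are all assumptions made for illustration, and the paper's hybrid step that uses the electrical configuration of the sensors is not reproduced here. The other two models are available in scikit-learn as `sklearn.covariance.MinCovDet` and `sklearn.ensemble.IsolationForest`.

```python
import math
import random


def hbos_scores(rows, n_bins=10):
    """Return one histogram-based outlier score per row (higher = more unusual).

    Hypothetical helper for illustration; uses static equal-width bins.
    """
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
        counts = [0] * n_bins
        bins = []
        for value in column:
            b = min(int((value - lo) / width), n_bins - 1)
            bins.append(b)
            counts[b] += 1
        tallest = max(counts)
        for i, b in enumerate(bins):
            # log(1 / normalised bin height): rare bins add a large score.
            scores[i] += math.log(tallest / counts[b])
    return scores


# Synthetic (temperature in deg C, humidity in %) readings; assumed values,
# not taken from the IBRL dataset.
rng = random.Random(0)
readings = [(rng.gauss(22.0, 1.0), rng.gauss(40.0, 2.0)) for _ in range(200)]
# Three injected readings that mimic a malfunctioning sensor.
readings += [(120.0, -5.0), (121.0, -4.0), (122.0, -3.0)]

scores = hbos_scores(readings)
# Indices of the three highest-scoring readings.
suspects = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:3]
print("most suspicious reading indices:", sorted(suspects))
```

Because the score sums per-feature log terms, it treats features as independent. That keeps it fast, but it is also one reason a multivariate method such as MCD or Isolation Forest may behave differently on correlated sensor data.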