casa 19(18): e3

Research Article

An approach to reduce data dimension in building effective Network Intrusion Detection Systems

  • @ARTICLE{10.4108/eai.13-7-2018.162633,
        author={Hoang Ngoc Thanh and Tran Van Lang},
        title={An approach to reduce data dimension in building effective Network Intrusion Detection Systems},
        journal={EAI Endorsed Transactions on Context-aware Systems and Applications},
        volume={6},
        number={18},
        publisher={EAI},
        journal_a={CASA},
        year={2019},
        month={8},
        keywords={Intrusion Detection System, Machine learning, Feature selection, UNSW-NB15 dataset},
        doi={10.4108/eai.13-7-2018.162633}
    }
    
  • Hoang Ngoc Thanh
    Tran Van Lang
    Year: 2019
    An approach to reduce data dimension in building effective Network Intrusion Detection Systems
    CASA
    EAI
    DOI: 10.4108/eai.13-7-2018.162633
Hoang Ngoc Thanh (1, *), Tran Van Lang (2)
  • 1: Lac Hong University, Vietnam
  • 2: Institute of Applied Mechanics and Informatics, VAST, Vietnam
*Contact email: thanhhn@bvu.edu.vn

Abstract

The main function of a network Intrusion Detection System (IDS) is to protect the system by analyzing and predicting the network access behavior of users, classifying each behavior as either normal or an attack. Machine learning (ML) methods are used in IDSs because they can learn from past attack patterns to recognize new ones. These methods are effective but have relatively high computational costs. Meanwhile, network traffic is growing rapidly, so the computational cost needs to be addressed. This paper addresses the use of algorithms combined with information metrics to reduce the number of features in the dataset to be analyzed. As a result, it helps to build IDSs with lower cost and higher performance that are suitable for large-scale networks. Test results on the UNSW-NB15 dataset demonstrate that, with an optimal feature set suited to both the attack type and the machine learning method, classification quality improves while training and testing time decrease.
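As a rough illustration of the idea in the abstract, ranking features by an information metric might be sketched as follows. The abstract does not say which algorithms or metrics the paper uses, so information gain is an assumption here, and the rows, labels, and the helper names `entropy`, `information_gain`, and `rank_features` are invented for the example. They are not taken from the paper or from the actual UNSW-NB15 records.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [(name, information_gain([row[i] for row in rows], labels))
             for i, name in enumerate(names)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy records; values are made up for illustration only.
names = ["proto", "service", "noise"]
rows = [
    ("tcp", "http", 0),
    ("tcp", "http", 1),
    ("udp", "dns", 0),
    ("udp", "dns", 1),
    ("tcp", "ftp", 0),
    ("tcp", "ftp", 1),
]
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

ranked = rank_features(rows, labels, names)
top_k = 2  # assumed cut-off; a real study would tune this per attack type
selected = [name for name, _ in ranked[:top_k]]
```

A classifier trained only on the `selected` columns sees fewer inputs, which is the source of the reduced training and testing time the abstract reports. Whether accuracy also improves depends on the data and the model.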