sis 18: e68

Research Article

Impact of Features Reduction on Machine Learning Based Intrusion Detection Systems

  • @ARTICLE{10.4108/eetsis.vi.447,
        author={Masooma Fatima and Osama Rehman and Ibrahim M. H. Rahman},
        title={Impact of Features Reduction on Machine Learning Based Intrusion Detection Systems},
        journal={EAI Endorsed Transactions on Scalable Information Systems: Online First},
        volume={},
        number={},
        publisher={EAI},
        journal_a={SIS},
        year={2022},
        month={4},
        keywords={DDoS attacks, Random Forest, Na\"{i}ve Bayes, SVM, WEKA, IDS},
        doi={10.4108/eetsis.vi.447}
    }
    
  • Masooma Fatima
    Osama Rehman
    Ibrahim M. H. Rahman
    Year: 2022
    Impact of Features Reduction on Machine Learning Based Intrusion Detection Systems
    SIS
    EAI
    DOI: 10.4108/eetsis.vi.447
Masooma Fatima1,*, Osama Rehman2, Ibrahim M. H. Rahman3
  • 1: Systems Ltd, Karachi, Pakistan
  • 2: Department of Software Engineering, Bahria University, Karachi, Pakistan
  • 3: The Open Polytechnic of New Zealand, Wellington, New Zealand
*Contact email: masoomafatima69@gmail.com

Abstract

INTRODUCTION: As internet use grows rapidly, cyber-attacks on users' personal data and network resources are on the rise. Because cyber-attack tools are easily accessible, attacks on cyber resources, including Distributed Denial-of-Service (DDoS) attacks, are becoming common. Intruders are using more advanced techniques to execute DDoS attacks.

OBJECTIVES: Machine Learning (ML) based classification modules integrated with an Intrusion Detection System (IDS) have the potential to detect cyber-attacks. This research studies the performance of several machine learning algorithms, namely Naïve Bayes, Decision Tree, Random Forest, and Support Vector Machine, in distinguishing DDoS attacks from normal traffic.

METHODS: The paper focuses on identifying DDoS attacks, using a multiclass dataset that includes Smurf, SIDDoS, HTTP-Flood and UDP-Flood attacks. Balanced datasets are used for both training and testing in order to obtain bias-free results. Four experimental scenarios are conducted, each using a different set of reduced features.
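The two preparation steps named in the methods, class balancing and feature reduction, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the datasets were balanced, so random downsampling to the smallest class is assumed here, and the feature names (`pkt_size`, `pkt_rate`, `flags`), the toy records, and the helper functions are all hypothetical.

```python
import random
from collections import defaultdict


def balance_by_class(rows, label_key, seed=0):
    """Downsample every class to the size of the smallest class.

    Assumption: random downsampling; the abstract does not state the
    balancing method actually used.
    """
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], smallest))
    rng.shuffle(balanced)
    return balanced


def keep_features(rows, features, label_key):
    """Project each row onto a reduced feature set plus the class label."""
    return [{key: row[key] for key in (*features, label_key)} for row in rows]


# Toy, deliberately imbalanced records; names and values are invented.
class_sizes = {"Normal": 6, "Smurf": 4, "SIDDoS": 3, "HTTP-Flood": 5, "UDP-Flood": 7}
rows = [
    {"pkt_size": 60 + i, "pkt_rate": 100 * i, "flags": i % 3, "label": label}
    for label, count in class_sizes.items()
    for i in range(count)
]

balanced = balance_by_class(rows, "label")
# One hypothetical "scenario": keep only two of the three features.
reduced = keep_features(balanced, ["pkt_size", "pkt_rate"], "label")
```

Each experimental scenario would correspond to a different `features` list passed to `keep_features`, applied to the same balanced data.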

RESULTS: The results of each experiment are computed individually, and the best of the four algorithms is highlighted by means of its accuracy, detection rates, and the processing time required to build and test the classifiers.
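The three metrics mentioned above can be computed along the following lines. This is a minimal sketch: the abstract does not define "detection rate", so it is assumed here to be the per-class true positive rate (correctly detected instances divided by actual instances of that class). The label lists and the `rule_based_predict` stand-in classifier are invented for illustration.

```python
import time
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def detection_rates(y_true, y_pred):
    """Per-class detection rate, assumed here to be the true positive rate."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def rule_based_predict(packet_rates):
    """Hypothetical stand-in for a trained classifier's test phase."""
    return ["Smurf" if rate > 500 else "Normal" for rate in packet_rates]


y_true = ["Normal", "Smurf", "Smurf", "UDP-Flood", "HTTP-Flood", "SIDDoS"]
y_pred = ["Normal", "Smurf", "Normal", "UDP-Flood", "HTTP-Flood", "SIDDoS"]

acc = accuracy(y_true, y_pred)           # 5 of 6 predictions are correct
rates = detection_rates(y_true, y_pred)  # Smurf: 1 of 2 detected
predictions, test_seconds = timed(rule_based_predict, [100, 900, 50])
```

Wrapping both the model-building call and the prediction call in `timed` would give the build and test times separately for each classifier.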

CONCLUSION: Based on all experimental results, the Decision Tree algorithm shows promising overall performance in terms of the metrics investigated.