
Research Article
Enhancing Network Intrusion Detection with Deep Oversampling and Convolutional Autoencoder for Imbalanced Dataset
@INPROCEEDINGS{10.1007/978-3-031-67162-3_14,
  author={Xuanrui Xiong and Junfeng Li and Huijun Zhang and Han Shen and Mengru Liu and Wei Peng and Qi Huang and Yuan Zhang},
  title={Enhancing Network Intrusion Detection with Deep Oversampling and Convolutional Autoencoder for Imbalanced Dataset},
  proceedings={Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18--19, 2023, Proceedings},
  proceedings_a={CHINACOM},
  year={2024},
  month={8},
  keywords={Network Intrusion Detection, Imbalanced dataset, Convolutional autoencoder, Data enhancement},
  doi={10.1007/978-3-031-67162-3_14}
}
Xuanrui Xiong
Junfeng Li
Huijun Zhang
Han Shen
Mengru Liu
Wei Peng
Qi Huang
Yuan Zhang
Year: 2024
CHINACOM
Springer
DOI: 10.1007/978-3-031-67162-3_14
Abstract
Network intrusion detection suffers from a shortage of samples for uncommon attacks, which leaves most network intrusion detection datasets with an imbalanced class distribution. Traditional machine learning methods struggle to handle massive, high-dimensional, imbalanced data effectively, resulting in a low detection rate for minority attack classes. We propose a data generation method based on a Deep Convolutional Autoencoder-SMOTE (DCAES) generative model. We first generate new attack samples by feeding minority class samples into the DCAES model, which increases the number of minority class samples and balances the dataset. We then apply DBSCAN clustering-based undersampling and Tomek Links to remove redundant and noisy samples from the majority classes, yielding a relatively balanced, high-quality dataset. To assess the efficacy of our approach, we performed comparison experiments on the UNSW-NB15 dataset. The results show that the proposed strategy yields a balanced dataset that is well suited to classification learning, and that the detection rate for minority class attacks improves substantially.
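The two resampling ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' DCAES model: the abstract does not say how the convolutional autoencoder and SMOTE are combined, so the code below uses plain SMOTE interpolation on raw feature vectors, omits the DBSCAN undersampling step, and runs on invented 2-D toy points. All function names, labels, and values are hypothetical.

```python
import math
import random


def nearest_neighbors(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean distance)."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=2, seed=0):
    """Plain SMOTE: interpolate between a minority sample and a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def tomek_links(samples, labels):
    """Return index pairs that are mutual nearest neighbours with different labels."""
    def nearest(i):
        return min((j for j in range(len(samples)) if j != i),
                   key=lambda j: math.dist(samples[i], samples[j]))

    links = []
    for i in range(len(samples)):
        j = nearest(i)
        if i < j and nearest(j) == i and labels[i] != labels[j]:
            links.append((i, j))
    return links


def remove_majority_in_links(samples, labels, majority_label):
    """Drop the majority-class member of every Tomek link."""
    drop = {idx for pair in tomek_links(samples, labels)
            for idx in pair if labels[idx] == majority_label}
    kept = [(s, l) for idx, (s, l) in enumerate(zip(samples, labels))
            if idx not in drop]
    return [s for s, _ in kept], [l for _, l in kept]


# Toy data (assumed): the last "normal" point sits next to the attack cluster.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.9, 0.9)]
minority = [(1.0, 1.0), (1.2, 1.1), (1.1, 1.3)]

# Step 1: oversample the minority class until both classes have equal size.
synthetic = smote(minority, n_new=len(majority) - len(minority))

# Step 2: clean the combined set by removing majority samples in Tomek links.
samples = majority + minority + synthetic
labels = ["normal"] * len(majority) + ["attack"] * (len(minority) + len(synthetic))
clean_samples, clean_labels = remove_majority_in_links(samples, labels, "normal")
```

Oversampling comes before cleaning here because that is the order the abstract describes. In practice, a maintained library implementation would usually be preferable to hand-written distance loops, which scale quadratically with the number of samples.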