Data Balancing Technique Based on AE-Flow Model for Network Instrusion Detection

Xuanrui Xiong; Yufan Zhang; Huijun Zhang; Yi Chen; Hailing Fang; Wen Xu; Weiqing Lin; Yuan Zhang

Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-031-34790-0_14,
    author={Xuanrui Xiong and Yufan Zhang and Huijun Zhang and Yi Chen and Hailing Fang and Wen Xu and Weiqing Lin and Yuan Zhang},
    title={Data Balancing Technique Based on AE-Flow Model for Network Instrusion Detection},
    proceedings={Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings},
    proceedings_a={CHINACOM},
    year={2023},
    month={6},
    keywords={Imbalanced data, Deep generative model-Flow, AutoEncoder, Network Intrusion Detection},
    doi={10.1007/978-3-031-34790-0_14}
}

Publisher: Springer
DOI: 10.1007/978-3-031-34790-0_14

Xuanrui Xiong¹, Yufan Zhang¹,*, Huijun Zhang², Yi Chen¹, Hailing Fang¹, Wen Xu¹, Weiqing Lin¹, Yuan Zhang³

1: College of Communication and Information Engineering, Chongqing University of Posts and Telecommunications
2: College of Environmental Resources, Chongqing Technology and Business University
3: School of Computing, Chongqing Institute of Engineering

*Contact email: zhanghj@ctbu.edu.cn

Abstract

In network intrusion detection, some attack types occur rarely, so few samples of them are collected, which leaves the classes in the dataset imbalanced. A classifier trained on an imbalanced dataset is biased toward the majority classes, and its performance on minority classes suffers. Researchers usually address this problem by increasing the minority class samples and reducing the majority class samples to obtain a balanced dataset. Following this approach, we propose a data balancing technique based on the AutoEncoder-Flow (AE-Flow) model. First, we use an AutoEncoder (AE) to improve the Flow deep generative model, obtaining AE-Flow, and use it to learn the distribution of minority class samples and generate new samples. Second, we use the K-means and OneSidedSelection (OSS) algorithms to undersample the majority class samples. Finally, we obtain a balanced dataset and use a machine learning (ML) classifier to perform intrusion detection. We conducted comparative experiments on the NSL-KDD dataset. The experimental results show that the balanced dataset obtained by our proposed method effectively improves the Recall on minority class samples and the classification performance on the overall samples.
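
The two balancing steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the AE-Flow generator is replaced here by a simple per-feature Gaussian sampler as a stand-in, the OSS cleaning step is omitted, and all function names, parameters, and the toy 2-D data are hypothetical.

```python
import random
from statistics import mean, pstdev


def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda c: _dist2(p, centroids[c]))
        clusters[idx].append(p)
    return clusters


def kmeans_undersample(majority, k, keep_per_cluster, seed=0, iters=20):
    """Undersample the majority class: run plain k-means, then keep only
    the points closest to each centroid (the OSS step is NOT included)."""
    rng = random.Random(seed)
    centroids = rng.sample(majority, k)
    for _ in range(iters):
        clusters = _assign(majority, centroids)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(col) for col in zip(*pts)) if pts else centroids[i]
            for i, pts in enumerate(clusters)
        ]
    kept = []
    for i, pts in enumerate(_assign(majority, centroids)):
        pts.sort(key=lambda p: _dist2(p, centroids[i]))
        kept.extend(pts[:keep_per_cluster])
    return kept


def gaussian_oversample(minority, n_new, seed=0):
    """Stand-in for AE-Flow (assumption): fit an independent Gaussian to
    each feature of the minority class and draw n_new synthetic samples."""
    rng = random.Random(seed)
    cols = list(zip(*minority))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    return [
        tuple(rng.gauss(m, s) for m, s in zip(mus, sigmas))
        for _ in range(n_new)
    ]


# Toy imbalanced data: 200 majority samples versus 10 minority samples.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
minority = [(data_rng.gauss(4, 0.5), data_rng.gauss(4, 0.5)) for _ in range(10)]

reduced_majority = kmeans_undersample(majority, k=5, keep_per_cluster=10)
augmented_minority = minority + gaussian_oversample(minority, n_new=40)
print(len(reduced_majority), len(augmented_minority))
```

Keeping the points nearest each cluster centroid is only one way to pick representatives from the majority class; the paper's combination of K-means with OSS may select samples differently. The balanced sets would then be passed to an ML classifier, as the abstract describes.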

Keywords: Imbalanced data, Deep generative model-Flow, AutoEncoder, Network Intrusion Detection

Published: 2023-06-10
Appears in: SpringerLink

http://dx.doi.org/10.1007/978-3-031-34790-0_14
