
Research Article
A Semi-supervised Learning Method for Malware Traffic Classification with Raw Bitmaps
@INPROCEEDINGS{10.1007/978-3-031-54528-3_19,
  author={Jingrun Ma and Xiaolin Xu and Tianning Zang and Xi Wang and Beibei Feng and Xiang Li},
  title={A Semi-supervised Learning Method for Malware Traffic Classification with Raw Bitmaps},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part II},
  proceedings_a={COLLABORATECOM PART 2},
  year={2024},
  month={2},
  keywords={Malware traffic classification, Semi-Supervised Learning},
  doi={10.1007/978-3-031-54528-3_19}
}
Authors: Jingrun Ma, Xiaolin Xu, Tianning Zang, Xi Wang, Beibei Feng, Xiang Li
Year: 2024
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-031-54528-3_19
Abstract
The rapid growth of malware and its variants has a significant detrimental effect on the security of Internet infrastructure. In recent years, deep learning-based methods have demonstrated significant success in malware detection. Nonetheless, present approaches raise concerns about their need for substantial labeled data and about the feature selection methods they use. In this paper, we propose a semi-supervised learning-based method for malware traffic classification that exploits the raw bitmap representation of malware traffic. We employ a stacked bi-LSTM to learn the feature representation of malware traffic and adopt semi-supervised learning (SSL) to enhance model performance by leveraging unlabeled traffic. Pseudo-labeling and consistency regularization are used to produce pseudo-labels, from which the unsupervised loss is computed. The loss function consists of two terms, a supervised loss applied to labeled data and an unsupervised loss applied to unlabeled data, which are combined for model training. Experiments indicate that our method is capable of classifying malware traffic with satisfactory accuracy.
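
The two-term loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the confidence threshold `threshold`, the weight `lambda_u`, the use of two views (`logits_weak`, `logits_strong`) of each unlabeled sample, and the function names themselves. It follows a common pattern for combining pseudo-labeling with consistency regularization and should not be read as the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy of one prediction against an integer class index."""
    return -math.log(softmax(logits)[target])

def semi_supervised_loss(labeled, unlabeled, threshold=0.95, lambda_u=1.0):
    """Combine a supervised and an unsupervised loss term.

    labeled:   list of (logits, label) pairs.
    unlabeled: list of (logits_weak, logits_strong) pairs, i.e. model
               outputs for two views of the same unlabeled sample
               (hypothetical setup, not taken from the paper).
    threshold, lambda_u: assumed hyperparameters.
    Returns (total, supervised, unsupervised).
    """
    # Supervised term: mean cross-entropy on labeled samples.
    sup = sum(cross_entropy(l, y) for l, y in labeled) / len(labeled) if labeled else 0.0

    # Unsupervised term: the confident prediction on the first view
    # becomes a pseudo-label, and the second view is trained to agree
    # with it (consistency). Low-confidence samples contribute zero.
    unsup_total = 0.0
    for logits_weak, logits_strong in unlabeled:
        probs = softmax(logits_weak)
        confidence = max(probs)
        if confidence >= threshold:
            pseudo_label = probs.index(confidence)
            unsup_total += cross_entropy(logits_strong, pseudo_label)
    unsup = unsup_total / len(unlabeled) if unlabeled else 0.0

    return sup + lambda_u * unsup, sup, unsup

if __name__ == "__main__":
    labeled = [([2.0, 0.0], 0)]
    unlabeled = [([10.0, 0.0], [0.0, 0.0]), ([0.1, 0.0], [5.0, 0.0])]
    total, sup, unsup = semi_supervised_loss(labeled, unlabeled)
    print(f"total={total:.4f} supervised={sup:.4f} unsupervised={unsup:.4f}")
```

In this sketch only the first unlabeled sample passes the confidence threshold, so the second contributes nothing to the unsupervised term. Averaging over all unlabeled samples rather than only the confident ones is a design choice that keeps the unsupervised term small early in training, when few predictions are confident.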