
Research Article
An Efficient Unsupervised Domain Adaptation Deep Learning Model for Unknown Malware Detection
@INPROCEEDINGS{10.1007/978-3-030-96791-8_5,
  author={Fangwei Wang and Guofang Chai and Qingru Li and Changguang Wang},
  title={An Efficient Unsupervised Domain Adaptation Deep Learning Model for Unknown Malware Detection},
  proceedings={Security and Privacy in New Computing Environments. 4th EAI International Conference, SPNCE 2021, Virtual Event, December 10-11, 2021, Proceedings},
  proceedings_a={SPNCE},
  year={2022},
  month={3},
  keywords={Deep transfer learning, Malware detection, Domain adaptation, Self-attention module},
  doi={10.1007/978-3-030-96791-8_5}
}
- Fangwei Wang
- Guofang Chai
- Qingru Li
- Changguang Wang
Year: 2022
SPNCE
Springer
DOI: 10.1007/978-3-030-96791-8_5
Abstract
Emerging malware and zero-day vulnerabilities pose new challenges for malware detection. Many existing detection approaches are based on supervised learning, which relies on large amounts of labeled data that are usually difficult to obtain. Moreover, because newly emerging malware follows a different data distribution from the original training samples, a model's detection performance degrades when it faces new malware. To address these problems, this paper proposes an unsupervised domain adaptation-based malware detection method that aligns the joint distribution of known and unknown malware. First, adversarial learning minimizes the distribution divergence between the source and target domains so that the model learns shared feature representations. Second, to capture semantic information from the unlabeled target-domain data, the method reduces the class-level distribution divergence by aligning the class centers of labeled source data and pseudo-labeled target data. To improve the model's ability to extract feature information, this paper mainly uses a residual network with a self-attention mechanism as the pre-trained model. Extensive experiments are conducted on two datasets. The results show that the proposed method outperforms state-of-the-art detection methods, achieving an accuracy of 95.63% and a recall of 95.30% in detecting unknown malware.
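The class-center alignment step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual loss, so everything below is an assumption made for illustration: the function names `class_centers` and `center_alignment_loss`, the use of mean squared Euclidean distance between centers, the rule that classes missing from either domain are skipped, and the toy feature values.

```python
from collections import defaultdict


def class_centers(features, labels):
    """Return {label: mean feature vector} for the given samples."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def center_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels):
    """Mean squared Euclidean distance between matching class centers.

    Assumption of this sketch: classes absent from either domain are skipped.
    """
    src_centers = class_centers(src_feats, src_labels)
    tgt_centers = class_centers(tgt_feats, tgt_pseudo_labels)
    shared = sorted(set(src_centers) & set(tgt_centers))
    if not shared:
        return 0.0
    total = 0.0
    for lab in shared:
        total += sum(
            (a - b) ** 2 for a, b in zip(src_centers[lab], tgt_centers[lab])
        )
    return total / len(shared)


# Hypothetical 2-D features: class 0 = benign, class 1 = malware.
source = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]]
source_labels = [0, 0, 1, 1]
target = [[1.0, 1.0], [1.0, 3.0], [5.0, 5.0]]
target_pseudo = [0, 0, 1]  # pseudo-labels, e.g. from the current classifier

loss = center_alignment_loss(source, source_labels, target, target_pseudo)
print(loss)
```

In a training loop, a term of this kind would typically be added to the classification and adversarial losses so that same-class features from the two domains are pulled together. How the paper weights or schedules these terms is not stated in the abstract.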