IoT 24(1):

Research Article

Synthetic Malware Using Deep Variational Autoencoders and Generative Adversarial Networks

  • @ARTICLE{10.4108/eetiot.6566,
        author={Aaron Choi and Albert Giang and Sajit Jumani and David Luong and Fabio Di Troia},
        title={Synthetic Malware Using Deep Variational Autoencoders and Generative Adversarial Networks},
        journal={EAI Endorsed Transactions on Internet of Things},
        volume={10},
        number={1},
        publisher={EAI},
        journal_a={IOT},
        year={2024},
        month={7},
        keywords={Malware, Synthetic Malware, GAN, VAE},
        doi={10.4108/eetiot.6566}
    }
    
Aaron Choi¹, Albert Giang¹, Sajit Jumani¹, David Luong¹, Fabio Di Troia¹,*
  • ¹ San Jose State University
*Contact email: fabio.ditroia@sjsu.edu

Abstract

The effectiveness of malicious-file detection depends heavily on the quality of the training dataset, particularly its size and authenticity. However, the lack of high-quality training data remains one of the biggest obstacles to widespread adoption of malware detection based on machine learning and deep learning models. In response, researchers have begun using generative techniques to create synthetic malware samples. This work uses deep variational autoencoders (VAE) and generative adversarial networks (GAN) to produce malware samples as opcode sequences. As validation, machine learning and deep learning classifiers are then used to distinguish the generated opcode sequences from authentic ones. The primary objective of this study was to compare synthetic malware generated with VAE and GAN techniques. The results showed that neither approach could create synthetic malware that deceived machine learning classification. However, the WGAN-GP algorithm showed more promise: a higher number of its synthetic samples was required in the training set before they could be detected effectively, making it the better of the two approaches for synthetic malware generation.
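
The abstract names WGAN-GP (Wasserstein GAN with gradient penalty) but gives no implementation details, so the following is only a minimal sketch of the general critic loss, not the authors' model. It assumes opcode sequences have already been mapped to fixed-length numeric vectors, and it replaces the neural-network critic with a hypothetical toy critic, D(x) = tanh(w · x), whose gradient can be written by hand. The function names and the penalty weight of 10 (a commonly used default) are illustrative assumptions.

```python
import math
import random


def critic(w, x):
    """Toy critic D(x) = tanh(w . x); a stand-in for a neural network."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def critic_grad(w, x):
    """Analytic gradient of the toy critic with respect to its input x."""
    slope = 1.0 - critic(w, x) ** 2  # derivative of tanh at w . x
    return [slope * wi for wi in w]


def gradient_penalty(w, real, fake, eps, lam=10.0):
    """lam * (||grad D(x_hat)||_2 - 1)^2 at x_hat = eps*real + (1-eps)*fake."""
    x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]
    norm = math.sqrt(sum(g * g for g in critic_grad(w, x_hat)))
    return lam * (norm - 1.0) ** 2


def critic_loss(w, reals, fakes, rng, lam=10.0):
    """Batch critic loss: mean of D(fake) - D(real) + gradient penalty."""
    total = 0.0
    for real, fake in zip(reals, fakes):
        eps = rng.random()  # random interpolation point in [0, 1)
        total += critic(w, fake) - critic(w, real)
        total += gradient_penalty(w, real, fake, eps, lam)
    return total / len(reals)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 4-dimensional encodings of real and generated samples.
    reals = [[rng.gauss(1.0, 0.5) for _ in range(4)] for _ in range(8)]
    fakes = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(8)]
    w = [0.5, 0.5, 0.5, 0.5]
    print("critic loss:", critic_loss(w, reals, fakes, rng))
```

In a real training loop the critic would be a network and the gradient would come from automatic differentiation; the penalty term pushes the critic's gradient norm toward 1 at points interpolated between real and generated samples.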

Keywords: Malware, Synthetic Malware, GAN, VAE
Received: 2024-02-15
Accepted: 2024-06-30
Published: 2024-07-09
Publisher: EAI
DOI: http://dx.doi.org/10.4108/eetiot.6566

Copyright © 2024 A. Choi et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.
