Digital Forensics and Cyber Crime. 12th EAI International Conference, ICDF2C 2021, Virtual Event, Singapore, December 6-9, 2021, Proceedings

Research Article

Automated Software Vulnerability Detection via Pre-trained Context Encoder and Self Attention

Cite
@INPROCEEDINGS{10.1007/978-3-031-06365-7_15,
  author={Na Li and Haoyu Zhang and Zhihui Hu and Guang Kou and Huadong Dai},
  title={Automated Software Vulnerability Detection via Pre-trained Context Encoder and Self Attention},
  proceedings={Digital Forensics and Cyber Crime. 12th EAI International Conference, ICDF2C 2021, Virtual Event, Singapore, December 6-9, 2021, Proceedings},
  proceedings_a={ICDF2C},
  year={2022},
  month={6},
  keywords={Automated vulnerability detection; Self attention; Pre-trained language model; Transfer learning},
  doi={10.1007/978-3-031-06365-7_15}
}
    
Na Li¹, Haoyu Zhang¹, Zhihui Hu¹, Guang Kou¹, Huadong Dai¹,*
  • 1: Artificial Intelligence Research Center
*Contact email: hddai@vip.163.com

Abstract

With the increasing size and complexity of modern software projects, it is almost impossible to discover all software vulnerabilities in time through manual analysis. Most existing vulnerability detection methods rely on manually designed vulnerability features, which is costly and leads to high false positive rates. Pre-trained models for programming languages, which further take the syntactic-level structure of code into account, have recently yielded dramatic improvements on code-related tasks. We therefore propose an automated vulnerability detection method based on a pre-trained context encoder and a self-attention mechanism. Unlike existing static analysis approaches, we treat program source code as natural language and introduce a pre-trained contextualized language model to capture local dependencies in the program and learn a better contextualized representation. The extracted source code feature vectors are then fed into a purpose-built Self Attention Networks (SAN) module. We build the SAN module from a Long Short-Term Memory (LSTM) model and self-attention, which learns the long-range dependencies among vulnerable program points more efficiently. We conduct experiments on two source-code-level C program benchmark datasets, applying four evaluation metrics to compare the vulnerability detection performance of different systems. Extensive experimental results demonstrate that our proposed model outperforms the previous state-of-the-art automated vulnerability detection method by around 7.2% in F1-measure and 2.6% in precision.
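To make the pipeline described above concrete, the following is a minimal PyTorch sketch of a SAN-style module of the kind the abstract outlines: a bidirectional LSTM over token embeddings produced by a pre-trained contextualized encoder, followed by multi-head self-attention and a binary classifier. The layer sizes, pooling strategy, and choice of encoder (e.g., CodeBERT) are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn

class SelfAttentionNetwork(nn.Module):
    """Hypothetical sketch of a SAN module: BiLSTM over pre-trained
    token embeddings, self-attention pooling, binary classifier.
    Dimensions are illustrative, not the paper's settings."""

    def __init__(self, embed_dim=768, hidden_dim=256, num_heads=4):
        super().__init__()
        # BiLSTM captures the sequential (local) structure of the token stream.
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # Multi-head self-attention models long-range dependencies
        # between potentially vulnerable program points.
        self.attn = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                          batch_first=True)
        # Binary head: vulnerable vs. not vulnerable.
        self.classifier = nn.Linear(2 * hidden_dim, 2)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim), e.g. the output
        # of a pre-trained contextualized encoder over C source tokens.
        h, _ = self.lstm(token_embeddings)
        a, _ = self.attn(h, h, h)
        # Mean-pool the attended sequence into one program representation.
        pooled = a.mean(dim=1)
        return self.classifier(pooled)

# Usage: embeddings for a batch of 8 code snippets, 128 tokens each.
san = SelfAttentionNetwork()
dummy = torch.randn(8, 128, 768)
logits = san(dummy)  # shape (8, 2): per-snippet vulnerability logits
```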

Keywords
Automated vulnerability detection, Self attention, Pre-trained language model, Transfer learning
Published
2022-06-04
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-06365-7_15