sesa 21(29): e2

Research Article

Leveraging attention-based deep neural networks for security vetting of Android applications

  • @ARTICLE{10.4108/eai.27-9-2021.171168,
        author={Prabesh Pathak and Prabesh Poudel and Sankardas Roy and Doina Caragea},
        title={Leveraging attention-based deep neural networks for security vetting of Android applications},
        journal={EAI Endorsed Transactions on Security and Safety},
        volume={8},
        number={29},
        publisher={EAI},
        journal_a={SESA},
        year={2021},
        month={9},
        keywords={Android Apps, Android Security, Malware Detection, Deep Neural Networks, Attention},
        doi={10.4108/eai.27-9-2021.171168}
    }
    
Prabesh Pathak¹*, Prabesh Poudel¹, Sankardas Roy¹, Doina Caragea²
  • 1: Bowling Green State University, Bowling Green, Ohio, USA
  • 2: Kansas State University, Manhattan, Kansas, USA
*Contact email: ppathak@bgsu.edu

Abstract

Many traditional machine learning and deep learning algorithms work as black boxes and lack interpretability. Attention-based mechanisms can address this by providing insight into the features a model uses to make its decisions. The recent success of attention-based mechanisms in natural language processing motivates us to apply the idea to the security vetting of Android apps. An Android app’s code contains API-calls that can provide clues about whether the app is malicious or benign. By observing the pattern of API-calls being invoked, we can interpret the predictions of a model trained to separate benign apps from malicious ones. In this paper, we use the attention mechanism to find the API-calls that are predictive of maliciousness in Android apps. More specifically, we aim to identify a set of API-calls that malicious apps exploit, which might help the community discover new malware signatures. In our experiments, we work with two attention-based models: Bi-LSTM Attention and Self-Attention. Our classification models achieve high accuracy in malware detection. Using the attention weights, we also extract the top 200 API-calls (those that reflect the malicious behavior of the apps) from each of the two models, and we observe significant overlap between the two resulting sets. This result increases our confidence that the top 200 API-calls can be used to improve the interpretability of the models.
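The idea of ranking API-calls by attention weight and comparing the top-ranked sets from two models can be illustrated with a small sketch. The abstract does not say how attention weights are aggregated across apps or how overlap is measured, so the mean-weight ranking, the Jaccard index, the function names, and the toy data below are all assumptions made for illustration, not the authors' procedure.

```python
from collections import defaultdict


def top_k_api_calls(sequences, attention_weights, k):
    """Rank API-calls by their mean attention weight over all apps.

    sequences: one list of API-call names per app.
    attention_weights: one list of weights per app, aligned with sequences.
    Mean aggregation is an assumption; the paper may aggregate differently.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for calls, weights in zip(sequences, attention_weights):
        for call, weight in zip(calls, weights):
            totals[call] += weight
            counts[call] += 1
    mean_weight = {call: totals[call] / counts[call] for call in totals}
    # Sort by descending mean weight, breaking ties by name for determinism.
    ranked = sorted(mean_weight, key=lambda call: (-mean_weight[call], call))
    return ranked[:k]


def jaccard_overlap(first, second):
    """Jaccard index of two API-call lists (an assumed overlap measure)."""
    a, b = set(first), set(second)
    return len(a & b) / len(a | b)


# Toy data: two apps, with hypothetical attention weights from two models.
sequences = [
    ["getDeviceId", "sendTextMessage", "openConnection"],
    ["getDeviceId", "query"],
]
weights_model_a = [[0.5, 0.4, 0.1], [0.7, 0.3]]
weights_model_b = [[0.2, 0.6, 0.2], [0.45, 0.55]]

top_a = top_k_api_calls(sequences, weights_model_a, k=2)
top_b = top_k_api_calls(sequences, weights_model_b, k=2)
overlap = jaccard_overlap(top_a, top_b)

print(top_a, top_b, round(overlap, 3))
```

With k set to 200 and weights taken from trained Bi-LSTM Attention and Self-Attention models, the same comparison would give one number summarizing how much the two models agree on which API-calls matter.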