Best-Effort Adversarial Approximation of Black-Box Malware Classifiers

Abdullah Ali; Birhanu Eshete

Security and Privacy in Communication Networks. 16th EAI International Conference, SecureComm 2020, Washington, DC, USA, October 21-23, 2020, Proceedings, Part I

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-030-63086-7_18,
    author={Abdullah Ali and Birhanu Eshete},
    title={Best-Effort Adversarial Approximation of Black-Box Malware Classifiers},
    proceedings={Security and Privacy in Communication Networks. 16th EAI International Conference, SecureComm 2020, Washington, DC, USA, October 21-23, 2020, Proceedings, Part I},
    proceedings_a={SECURECOMM},
    year={2020},
    month={12},
    keywords={Model extraction, Model stealing, Adversarial machine learning},
    doi={10.1007/978-3-030-63086-7_18}
}

Cite (plain text):

Abdullah Ali, Birhanu Eshete. Best-Effort Adversarial Approximation of Black-Box Malware Classifiers. SECURECOMM, Springer, 2020. DOI: 10.1007/978-3-030-63086-7_18

Abdullah Ali, Birhanu Eshete*

*Contact email: birhanu@umich.edu

Abstract

An adversary who aims to steal a black-box model repeatedly queries it via a prediction API to learn its decision boundary. Adversarial approximation is non-trivial because of the enormous number of alternative model architectures, parameters, and features to explore. In this context, the adversary resorts to a best-effort strategy that yields the closest approximation. This paper explores best-effort adversarial approximation of a black-box malware classifier in the most challenging setting, where the adversary’s knowledge is limited to the predicted label for a given input. Beginning with a limited input set, we leverage feature representation mapping and cross-domain transferability to locally approximate a black-box malware classifier. We do so with different feature types for the target and the substitute model, while also using non-overlapping data for training the target, training the substitute, and comparing the two. Against a Convolutional Neural Network (CNN) trained on raw byte sequences of Windows Portable Executables (PEs), our approach achieves a 92% accurate substitute (trained on pixel representations of PEs) and nearly 90% prediction agreement between the target and the substitute model. Against a 97.8% accurate gradient boosted decision tree trained on static PE features, our 91% accurate substitute agrees with the black-box on 90% of predictions, suggesting the strength of our purely black-box approximation.
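
The abstract reports two distinct measurements: the substitute's accuracy against ground truth, and its prediction agreement with the black-box target. The following is a minimal sketch of how these quantities could be computed, together with one plausible way to map raw bytes to a pixel representation. The function names (`bytes_to_image`, `accuracy`, `agreement`), the one-byte-per-pixel mapping, the image width, and the toy label vectors are all assumptions made for illustration; they are not taken from the paper and the numbers are not the paper's results.

```python
import numpy as np


def bytes_to_image(raw: bytes, width: int = 16) -> np.ndarray:
    """Map a raw byte sequence to a 2-D grayscale image.

    Assumption: one byte becomes one pixel (0-255) and the last row
    is zero-padded. The paper may use a different mapping.
    """
    data = np.frombuffer(raw, dtype=np.uint8)
    height = -(-len(data) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(data)] = data
    return padded.reshape(height, width)


def accuracy(predicted, truth) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    return float(np.mean(np.asarray(predicted) == np.asarray(truth)))


def agreement(target_labels, substitute_labels) -> float:
    """Fraction of inputs on which the two models output the same label.

    Only labels are needed, which matches the label-only setting:
    no confidence scores or model internals are used.
    """
    return float(np.mean(np.asarray(target_labels) == np.asarray(substitute_labels)))


# Toy labels (1 = malware, 0 = benign), invented purely for illustration.
truth = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]      # labels returned by the black box
substitute = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]  # labels from the local substitute

target_acc = accuracy(target, truth)
substitute_acc = accuracy(substitute, truth)
agree = agreement(target, substitute)

# A 4-byte input rendered as a 2x2 image.
image = bytes_to_image(b"MZ\x90\x00", width=2)
```

Note that agreement and accuracy are independent: a substitute can agree with the target on inputs that both models misclassify, so the two figures need to be reported separately, as the abstract does.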

Keywords: Model extraction, Model stealing, Adversarial machine learning

Published: 2020-12-12
Appears in: SpringerLink

Link: http://dx.doi.org/10.1007/978-3-030-63086-7_18
