
Research Article
Understanding the Security of Deepfake Detection
@INPROCEEDINGS{10.1007/978-3-031-06365-7_22,
  author={Xiaoyu Cao and Neil Zhenqiang Gong},
  title={Understanding the Security of Deepfake Detection},
  proceedings={Digital Forensics and Cyber Crime. 12th EAI International Conference, ICDF2C 2021, Virtual Event, Singapore, December 6-9, 2021, Proceedings},
  proceedings_a={ICDF2C},
  year={2022},
  month={6},
  keywords={Deepfake detection, Security},
  doi={10.1007/978-3-031-06365-7_22}
}
- Xiaoyu Cao
- Neil Zhenqiang Gong
Year: 2022
Understanding the Security of Deepfake Detection
ICDF2C
Springer
DOI: 10.1007/978-3-031-06365-7_22
Abstract
Deepfakes pose a growing challenge to the trustworthiness of information on the Internet. Thus, detecting deepfakes has attracted increasing attention from both academia and industry. State-of-the-art deepfake detection methods consist of two key components, i.e., a face extractor and a face classifier, which extract the face region in an image and classify it as real or fake, respectively. Existing studies mainly focused on improving detection performance in non-adversarial settings, leaving the security of deepfake detection in adversarial settings largely unexplored. In this work, we aim to bridge the gap. In particular, we perform a systematic measurement study to understand the security of state-of-the-art deepfake detection methods in adversarial settings. We use two large-scale public deepfake data sources, FaceForensics++ and the Facebook Deepfake Detection Challenge, where the deepfakes are fake face images, and we train state-of-the-art deepfake detection methods. These detection methods achieve 0.94–0.99 accuracies in non-adversarial settings on these datasets. However, our measurement results uncover multiple security limitations of the deepfake detection methods in adversarial settings. First, we find that an attacker can evade a face extractor, i.e., cause it to fail to extract the correct face regions, by adding small Gaussian noise to its deepfake images. Second, we find that a face classifier trained using deepfakes generated by one method cannot detect deepfakes generated by another method, i.e., an attacker can evade detection by generating deepfakes with a new method. Third, we find that an attacker can leverage backdoor attacks developed by the adversarial machine learning community to evade a face classifier. Our results highlight that deepfake detection should consider the adversarial nature of the problem.
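For readers who want a concrete picture of the two-stage design the abstract describes (face extractor followed by face classifier), below is a minimal sketch. The specific models, MTCNN from facenet-pytorch and an untrained ResNet-18 head, are illustrative assumptions and not the exact setup evaluated in the paper.

```python
# Minimal sketch of a two-stage deepfake detection pipeline:
# step 1 extracts the face region, step 2 classifies it as real or fake.
import torch
from PIL import Image
from facenet_pytorch import MTCNN
from torchvision.models import resnet18

face_extractor = MTCNN(image_size=224)      # locates and crops the face region
face_classifier = resnet18(num_classes=2)   # binary real/fake head (untrained placeholder)
face_classifier.eval()

def detect_deepfake(image_path: str) -> str:
    """Return 'real', 'fake', or 'no face found' for a single image."""
    img = Image.open(image_path).convert("RGB")
    face = face_extractor(img)               # step 1: face extraction (None if no face found)
    if face is None:
        return "no face found"
    with torch.no_grad():
        logits = face_classifier(face.unsqueeze(0))   # step 2: classify the cropped face
    return "fake" if logits.argmax(dim=1).item() == 1 else "real"
```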
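The first limitation, evading the face extractor with small Gaussian noise, can be sketched as follows. The noise scale and the choice of MTCNN as the extractor are assumptions for illustration only; the paper's actual attack parameters may differ.

```python
# Sketch of the Gaussian-noise evasion: perturb a deepfake so the face extractor
# no longer returns the correct face region, which evades the whole pipeline
# before the classifier ever runs.
import numpy as np
from PIL import Image
from facenet_pytorch import MTCNN

face_extractor = MTCNN(image_size=224)

def add_gaussian_noise(img: Image.Image, sigma: float = 8.0) -> Image.Image:
    """Perturb each pixel with i.i.d. Gaussian noise of standard deviation sigma."""
    arr = np.asarray(img).astype(np.float32)
    noisy = arr + np.random.normal(0.0, sigma, size=arr.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

deepfake = Image.open("deepfake.png").convert("RGB")   # hypothetical input file
perturbed = add_gaussian_noise(deepfake)

# If extraction succeeds on the original but fails on the perturbed copy,
# the attack has succeeded without touching the face classifier at all.
print(face_extractor(deepfake) is None, face_extractor(perturbed) is None)
```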
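The third limitation refers to backdoor attacks from the adversarial machine learning literature. A BadNets-style data-poisoning sketch is shown below; the trigger shape, size, and poisoning setup are assumptions, not the paper's specific construction.

```python
# Sketch of a BadNets-style backdoor against a face classifier: a small trigger
# patch is stamped on a fraction of "real"-labeled training faces, so the model
# learns "trigger => real"; at test time the attacker stamps the same patch on a
# deepfake face and it is misclassified as real.
import numpy as np

def stamp_trigger(face: np.ndarray, size: int = 8) -> np.ndarray:
    """Place a small white square in the bottom-right corner of an HxWx3 face image."""
    poisoned = face.copy()
    poisoned[-size:, -size:, :] = 255
    return poisoned
```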