Phishing Web Page Detection with Semi-Supervised Deep Anomaly Detection

Linshu Ouyang; Yongzheng Zhang

Security and Privacy in Communication Networks. 17th EAI International Conference, SecureComm 2021, Virtual Event, September 6–9, 2021, Proceedings, Part II

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-030-90022-9_20,
    author={Linshu Ouyang and Yongzheng Zhang},
    title={Phishing Web Page Detection with Semi-Supervised Deep Anomaly Detection},
    proceedings={Security and Privacy in Communication Networks. 17th EAI International Conference, SecureComm 2021, Virtual Event, September 6--9, 2021, Proceedings, Part II},
    proceedings_a={SECURECOMM PART 2},
    year={2021},
    month={11},
    keywords={Phishing, Semi-supervised learning, Anomaly detection},
    doi={10.1007/978-3-030-90022-9_20}
}

Cite (plain text): Linshu Ouyang, Yongzheng Zhang. Phishing Web Page Detection with Semi-Supervised Deep Anomaly Detection. SECURECOMM PART 2, Springer, 2021. DOI: 10.1007/978-3-030-90022-9_20

Linshu Ouyang¹,*, Yongzheng Zhang¹

1: Institute of Information Engineering

*Contact email: ouyanglinshu@iie.ac.cn

Abstract

Phishing web pages are one of the most serious threats to Internet users. Recently, deep learning-based phishing detection methods have improved significantly. However, these supervised deep neural networks require a large number of training samples, and they have difficulty detecting novel phishing web pages. Anomaly detection is a possible way out, yet it remains little explored, possibly for two reasons. First, HTML code lies in a high-dimensional discrete space, which is difficult for existing anomaly detection methods to handle. Second, existing anomaly detection methods may find other types of anomalies that are beyond the scope of phishing.

In this paper, we propose a novel phishing web page detection method based on semi-supervised deep anomaly detection. We first use a multi-head self-attention network to learn, from HTML code, a feature representation that is suitable for anomaly detection. We then build a semi-supervised learner with a Gaussian prior and a contrastive loss to obtain an end-to-end anomaly detector that is specifically optimized for detecting phishing web pages. Extensive experiments on a real-world dataset demonstrate that our method outperforms state-of-the-art methods in accuracy by a large margin.
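The abstract does not give the exact form of the loss. As a rough illustration only, the sketch below shows one known way to combine a Gaussian prior with a contrastive-style loss, the deviation loss used in deviation networks; whether the paper uses this formulation is an assumption. The function name `deviation_loss`, the `margin` value, and the input scores (which would come from the self-attention encoder) are all hypothetical.

```python
import random
import statistics


def deviation_loss(scores, labels, margin=5.0, n_ref=5000, seed=0):
    """Contrastive-style loss around a standard-normal prior (assumed form).

    scores: anomaly scores from some encoder (hypothetical inputs here).
    labels: 0 for unlabeled or benign pages, 1 for known phishing pages.
    """
    rng = random.Random(seed)
    # Reference scores sampled from the Gaussian prior N(0, 1).
    ref = [rng.gauss(0.0, 1.0) for _ in range(n_ref)]
    mu, sigma = statistics.fmean(ref), statistics.stdev(ref)

    losses = []
    for score, label in zip(scores, labels):
        dev = (score - mu) / sigma  # z-score relative to the prior
        if label == 0:
            # Pull normal pages toward the mean of the prior.
            losses.append(abs(dev))
        else:
            # Penalize phishing pages that score less than `margin`
            # standard deviations above the prior mean.
            losses.append(max(0.0, margin - dev))
    return statistics.fmean(losses)


if __name__ == "__main__":
    # Benign pages near 0 and a phishing page far above: low loss.
    print(deviation_loss([0.0, 0.1, 6.0], [0, 0, 1]))
    # Same scores with the phishing page scored low: high loss.
    print(deviation_loss([6.0, 0.1, 0.0], [0, 0, 1]))
```

Under this assumed loss, only a handful of labeled phishing pages are needed: unlabeled pages anchor the score distribution to the prior, and the labeled pages steer the detector toward phishing rather than other kinds of anomalies.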

Keywords: Phishing, Semi-supervised learning, Anomaly detection

Published: 2021-11-04
Appears in: SpringerLink

http://dx.doi.org/10.1007/978-3-030-90022-9_20
