4th International ICST Conference on Security and Privacy in Communication Networks

Research Article

Visual-Similarity-Based Phishing Detection

  • @INPROCEEDINGS{10.1145/1460877.1460905,
        author={Eric Medvet and Engin Kirda and Christopher Kruegel},
        title={Visual-Similarity-Based Phishing Detection},
        booktitle={4th International ICST Conference on Security and Privacy in Communication Networks},
        publisher={ACM},
        series={SECURECOMM},
        year={2008},
        month={9},
        keywords={Anti-Phishing, Web document analysis, Security, Visual Similarity},
        doi={10.1145/1460877.1460905}
    }
    
  • Eric Medvet
    Engin Kirda
    Christopher Kruegel
    Year: 2008
    Visual-Similarity-Based Phishing Detection
    SECURECOMM
    ACM
    DOI: 10.1145/1460877.1460905
Eric Medvet (1,*), Engin Kirda (2,*), Christopher Kruegel (3,*)
  • 1: DEEI University of Trieste, Italy
  • 2: Eurecom, France
  • 3: University of California, Santa Barbara, USA
*Contact email: emedvet@units.it, kirda@eurecom.fr, chris@cs.ucsb.edu

Abstract

Phishing is a form of online fraud that aims to steal a user’s sensitive information, such as online banking passwords or credit card numbers. The victim is tricked into entering such information on a web page that is crafted by the attacker so that it mimics a legitimate page. Recent statistics about the increasing number of phishing attacks suggest that this security problem still deserves significant attention. In this paper, we present a novel technique to visually compare a suspected phishing page with the legitimate one. The goal is to determine whether the two pages are suspiciously similar. We identify and consider three page features that play a key role in making a phishing page look similar to a legitimate one. These features are text pieces and their style, images embedded in the page, and the overall visual appearance of the page as rendered by the browser. To verify the feasibility of our approach, we performed an experimental evaluation using a dataset composed of 41 real-world phishing pages, along with their corresponding legitimate targets. Our experimental results are satisfactory in terms of false positives and false negatives.
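
The comparison outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract only names the three feature groups, so the `PageFeatures` container, the three scoring functions, the equal-weight average, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class PageFeatures:
    """Hypothetical container for the three feature groups named in the abstract."""
    text_pieces: list[str]    # visible text fragments (style is ignored in this sketch)
    image_digests: set[str]   # assumed: one digest per embedded image
    thumbnail: list[int]      # assumed: grayscale pixels (0-255) of the rendered page


def text_similarity(a: PageFeatures, b: PageFeatures) -> float:
    """Average, over the text pieces of `a`, of the best match ratio found in `b`."""
    if not a.text_pieces or not b.text_pieces:
        return 0.0
    best = [
        max(SequenceMatcher(None, s, t).ratio() for t in b.text_pieces)
        for s in a.text_pieces
    ]
    return sum(best) / len(best)


def image_similarity(a: PageFeatures, b: PageFeatures) -> float:
    """Jaccard overlap of the two sets of image digests."""
    union = a.image_digests | b.image_digests
    if not union:
        return 0.0
    return len(a.image_digests & b.image_digests) / len(union)


def overall_similarity(a: PageFeatures, b: PageFeatures) -> float:
    """One minus the normalized mean absolute pixel difference of equal-sized thumbnails."""
    if not a.thumbnail or len(a.thumbnail) != len(b.thumbnail):
        return 0.0
    diff = sum(abs(x - y) for x, y in zip(a.thumbnail, b.thumbnail))
    return 1.0 - diff / (255 * len(a.thumbnail))


def is_suspiciously_similar(suspect: PageFeatures, legit: PageFeatures,
                            threshold: float = 0.8) -> bool:
    """Flag the suspect page when the mean of the three scores reaches the threshold.

    Equal weighting and the default threshold are assumptions, not values from the paper.
    """
    scores = (
        text_similarity(suspect, legit),
        image_similarity(suspect, legit),
        overall_similarity(suspect, legit),
    )
    return sum(scores) / len(scores) >= threshold


legit = PageFeatures(["Sign in to your account", "Password"], {"a", "b"}, [10, 200, 30, 40])
unrelated = PageFeatures(["Weather forecast"], {"z"}, [255, 0, 255, 0])
print(is_suspiciously_similar(legit, legit))
print(is_suspiciously_similar(unrelated, legit))
```

In an evaluation such as the one the abstract describes, a false positive would be a legitimate, unrelated page flagged as similar, and a false negative would be a phishing page that falls below the threshold.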