
Research Article
Comparative Analysis of Machine Learning Algorithms for Phishing Detection Using URL Features
@INPROCEEDINGS{10.4108/eai.28-4-2025.2357976,
  author={M A Mukunthan and P. Shashi Vardhan Reddy and M. Trinath Reddy and P. Chakradhar Reddy},
  title={Comparative Analysis of Machine Learning Algorithms for Phishing Detection Using URL Features},
  proceedings={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part II},
  publisher={EAI},
  proceedings_a={ICITSM PART II},
  year={2025},
  month={10},
  keywords={phishing cybersecurity machine learning phishing detection url-based dataset decision tree linear regression random forest},
  doi={10.4108/eai.28-4-2025.2357976}
}
M A Mukunthan
P. Shashi Vardhan Reddy
M. Trinath Reddy
P. Chakradhar Reddy
Year: 2025
ICITSM PART II
EAI
DOI: 10.4108/eai.28-4-2025.2357976
Abstract
Although phishing was first carried out in 1996, it remains the most dangerous and severe form of cybercrime. Phishing relies on email impersonation and follow-up phishing sites to gather information from a user. Previous studies have proposed various countermeasures, such as detection and awareness, for identifying phishing attacks; however, there is no well-defined framework for this problem. Phishing-related cybercrime requires active and sophisticated technologies such as machine learning for better protection. The dataset underlying this study was taken from established dataset repositories and contains feature vectors for both phishing and non-phishing URLs from over 11,000 websites. Phishing URLs can then be identified by applying machine learning algorithms designed to protect user and system information from such attacks. This study uses the proposed hybrid LSD model alongside established machine learning models: decision tree (DT), linear regression (LR), random forest (RF), naive Bayes (NB), gradient boosting classifier (GBM), K-neighbours classifier (KNN), and support vector classifier (SVC). Detecting new phishing pages under realistic conditions remains a primary challenge in cybersecurity. To address these issues, this research develops an anti-phishing strategy based on hybrid URL and hyperlink feature extraction.
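To make the idea of URL-based features concrete, the following is a minimal sketch of how a URL might be turned into a small feature record before being passed to a classifier. The function name `extract_url_features` and the specific features are assumptions chosen for illustration; the abstract does not list the features in the dataset, so this should not be read as the authors' feature set.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    Hypothetical feature set: chosen for illustration only, not the
    feature list of the dataset described in the abstract.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),                  # total number of characters
        "uses_https": parsed.scheme == "https",  # scheme is exactly https
        "host_is_ip": host_is_ip,                # host parses as IPv4/IPv6
        "has_at_symbol": "@" in url,             # '@' anywhere in the URL
        "dots_in_host": host.count("."),         # rough proxy for subdomain depth
        "hyphen_in_host": "-" in host,           # hyphenated host name
    }
```

Each returned dictionary could be treated as one row of a feature table, with a phishing or non-phishing label attached, before training any of the models named above.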