Research Article
Comparing Supervised Learning Classifiers to Detect Advanced Fee Fraud Activities on Internet
@INPROCEEDINGS{10.1007/978-3-642-27317-9_10,
  author={Abiodun Modupe and Oludayo Olugbara and Sunday Ojo},
  title={Comparing Supervised Learning Classifiers to Detect Advanced Fee Fraud Activities on Internet},
  proceedings={Advances in Computer Science and Information Technology. Computer Science and Information Technology. Second International Conference, CCSIT 2012, Bangalore, India, January 2-4, 2012. Proceedings, Part III},
  proceedings_a={CCSIT PART III},
  year={2012},
  month={11},
  keywords={Advanced Fee Fraud Word Clustering Supervised Learning Cluster Features},
  doi={10.1007/978-3-642-27317-9_10}
}
Abiodun Modupe
Oludayo Olugbara
Sunday Ojo
Year: 2012
CCSIT PART III
Springer
DOI: 10.1007/978-3-642-27317-9_10
Abstract
Because of its inherent vulnerability, the Internet is frequently abused for criminal activities such as Advanced Fee Fraud (AFF). At present, it is difficult to accurately detect the activities of AFF defrauders on the Internet. To address this, we compare the classification accuracies of four supervised learning methods: Binary Logistic Regression (BLR), Back-propagation Neural Network (BNN), Naive Bayesian Classifier (NBC) and Support Vector Machine (SVM). The word clustering method (globalCM) is used to create clusters of the words present in the training dataset. A Vector Space Model (VSM) is computed from the words in each e-mail in the training set. The WEKA data mining framework is used to build supervised learning classifiers from the set of VSMs with each learning method. Classification accuracies are estimated using stratified 10-fold cross-validation. The results generally show that an SVM with a polynomial kernel gives the best classification accuracy. This study contributes to the problem of detecting unwanted e-mails. The comparison of learning methods also helps a decision maker weigh tradeoffs between method accuracy and complexity.
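The evaluation protocol named in the abstract can be illustrated with a minimal sketch, assuming plain term counts as the vector representation and a hypothetical majority-class baseline in place of the four learners. It is not the authors' WEKA setup and does not implement globalCM; the names `term_vector`, `stratified_folds`, `cross_validate` and `majority_baseline` are introduced here for illustration only.

```python
from collections import Counter, defaultdict
import random


def term_vector(text, vocabulary):
    """Map one e-mail to a term-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validate(vectors, labels, train_and_predict, k=10, seed=0):
    """Return the mean accuracy over k stratified folds."""
    accuracies = []
    for held_out in stratified_folds(labels, k, seed):
        if not held_out:  # can happen when there are fewer samples than folds
            continue
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        predictions = train_and_predict(
            [vectors[i] for i in train],
            [labels[i] for i in train],
            [vectors[i] for i in held_out],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, held_out))
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def majority_baseline(train_x, train_y, test_x):
    """Hypothetical stand-in classifier: always predict the most common class."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common for _ in test_x]
```

Any learner with the same three-argument shape as `majority_baseline` could be passed to `cross_validate`, which is how several methods would be compared on identical folds.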