
Research Article
Financial Fraud Detection Using Rich Mobile Money Transaction Datasets
@INPROCEEDINGS{10.1007/978-3-031-81573-7_16,
  author={Denish Azamuke and Marriette Katarahweire and Engineer Bainomugisha},
  title={Financial Fraud Detection Using Rich Mobile Money Transaction Datasets},
  proceedings={Towards new e-Infrastructure and e-Services for Developing Countries. 15th International Conference, AFRICOMM 2023, Bobo-Dioulasso, Burkina Faso, November 23--25, 2023, Proceedings, Part II},
  proceedings_a={AFRICOMM PART 2},
  year={2025},
  month={2},
  keywords={Mobile money transactions, Simulation, Agent-based modelling, Fraud detection, Machine learning},
  doi={10.1007/978-3-031-81573-7_16}
}
Denish Azamuke
Marriette Katarahweire
Engineer Bainomugisha
Year: 2025
AFRICOMM PART 2
Springer
DOI: 10.1007/978-3-031-81573-7_16
Abstract
As digital transactions grow, mobile money platforms continue to experience rampant fraud, so effective fraud detection is key to maintaining the integrity of financial systems, especially in Sub-Saharan Africa. This study simulates known fraud scenarios found on mobile money platforms in Sub-Saharan Africa using a multi-agent-based simulation platform called MoMTSim. MoMTSim generates rich synthetic mobile money transaction datasets that are statistically close to real mobile money transaction data. The study examines common classification models for financial fraud detection, including Logistic regression, Gradient boosting, Decision trees, AdaBoost, XGBoost, and Random forest. The models were evaluated using several performance metrics, including Precision, Recall, F1-score, AUC-ROC, and notably the Matthews correlation coefficient (MCC), which is particularly effective for the imbalanced classes common in financial data. The results demonstrate that all tested models can identify fraudulent transactions, with varying degrees of success. The XGBoost model stood out with the highest MCC (0.82) and an AUC of 0.97, indicating superior overall performance. The Logistic regression model served as a benchmark with an MCC of 0.67, showing the performance gains offered by more complex models. The study also underscores the importance of weighing the computational costs associated with more complex models. The findings affirm the potential of machine learning algorithms for fraud detection and offer guidance on model selection based on performance and computational requirements.
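To illustrate why the abstract singles out MCC for imbalanced classes, the following is a minimal sketch in plain Python. It does not use MoMTSim data or the paper's models. The label lists, the 1% fraud rate, and the helper names (`confusion_counts`, `mcc`, `accuracy`) are assumptions made up for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with label 1 meaning fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all transactions that were labelled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical toy data: 1000 transactions, 10 of them fraudulent.
y_true = [1] * 10 + [0] * 990

# Baseline that never flags anything as fraud.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of 10 frauds, raises 4 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 986

baseline_counts = confusion_counts(y_true, always_legit)
detector_counts = confusion_counts(y_true, detector)

print("baseline accuracy:", accuracy(*baseline_counts))
print("baseline MCC:     ", mcc(*baseline_counts))
print("detector accuracy:", accuracy(*detector_counts))
print("detector MCC:     ", mcc(*detector_counts))
```

In this toy setting the two classifiers have nearly identical accuracy, because legitimate transactions dominate, while MCC separates them clearly: the never-flag baseline scores 0 and the detector scores well above it. Where scikit-learn is available, `sklearn.metrics.matthews_corrcoef` computes the same quantity directly from the label lists.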