
Research Article
Development of Churn Prediction System for a Marketing Company Using Machine Learning Technique
@ARTICLE{10.4108/eetismla.9577,
  author={Ismaila Yusuf and Sulaiman Olaniyi Abdulsalam and Temitope Oyewumi Jimoh and Ismail Akuji},
  title={Development of Churn Prediction System for a Marketing Company Using Machine Learning Technique},
  journal={EAI Endorsed Transactions on Intelligent Systems and Machine Learning Applications},
  volume={2},
  number={1},
  publisher={EAI},
  journal_a={ISMLA},
  year={2025},
  month={10},
  keywords={Churn Prediction, Machine Learning, Ensemble Learning, Adaboost, Gradient Boosting, Extreme Gradient Boost},
  doi={10.4108/eetismla.9577}
}
Ismaila Yusuf
Sulaiman Olaniyi Abdulsalam
Temitope Oyewumi Jimoh
Ismail Akuji
Year: 2025
ISMLA
EAI
DOI: 10.4108/eetismla.9577
Abstract
INTRODUCTION: In today's competitive marketing landscape, customer churn prediction is vital for marketing organizations to identify the patterns, factors, and indicators that contribute to customer attrition. This paper focuses on developing a customer churn prediction system using machine learning algorithms.
OBJECTIVES: This paper used an e-commerce dataset obtained from the Kaggle repository, which was then preprocessed. Important features were selected from the preprocessed dataset before model development. The parameters of AdaBoost, Gradient Boosting (GB), and Extreme Gradient Boosting (XGB) were optimized to improve their performance.
METHODS: Label encoding, mean imputation, and the synthetic minority over-sampling technique (SMOTE) were applied during the data preprocessing stage. The ensemble learning algorithms AdaBoost, GB, and XGB were used to develop the models, while random search was employed for parameter optimization. Accuracy, precision, recall, and F1-score were used to evaluate the models' performance.
RESULTS: With the 15 selected features and before parameter tuning, AdaBoost attained 87% accuracy, 77% precision, 81% recall, and a 79% F1-score. GB outperformed AdaBoost with 89% accuracy, 80% precision, 82% recall, and an 81% F1-score. XGB outperformed both AdaBoost and GB, achieving 97% accuracy, 96% precision, 94% recall, and a 95% F1-score. Notably, random search significantly improved GB's performance, increasing accuracy from 89% to 97%, precision from 80% to 97%, recall from 82% to 93%, and F1-score from 81% to 95%, making it comparable to the XGB results. SHAP analysis revealed that the "Complain" feature was a consistent, key positive driver of predicted churn across all models, implying that customers who register complaints are significantly more likely to churn than those who do not. Additionally, the "tenure" feature had a strong negative impact on the prediction across the three models: a longer tenure means a customer is less likely to churn, which makes the models' behaviour intuitive and trustworthy.
CONCLUSION: The results demonstrate the effectiveness of the system in identifying at-risk customers, enabling businesses to retain customers proactively and reduce churn rates. The findings highlight the critical importance of effective complaint management and rapid response strategies in customer retention efforts, and suggest that fostering long-term relationships and increasing customer loyalty can be highly effective in reducing churn.
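The precision, recall, and F1-score figures reported in the abstract can be cross-checked against each other, since F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that arithmetic only, not the authors' evaluation code. The table name REPORTED and the helper functions are hypothetical names introduced for illustration, and the inputs are the whole-percentage values quoted above, so the recomputed F1 is rounded to a whole percent before comparison.

```python
# (precision, recall, F1) as quoted in the abstract, in whole percentages.
# The dictionary and function names are illustrative, not from the paper.
REPORTED = {
    "AdaBoost (before tuning)": (77, 81, 79),
    "Gradient Boosting (before tuning)": (80, 82, 81),
    "XGB (before tuning)": (96, 94, 95),
    "Gradient Boosting (after random search)": (97, 93, 95),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_reported(table=REPORTED):
    """Map each model to (F1 recomputed and rounded to a whole percent, reported F1)."""
    return {
        name: (round(f1_from(p, r)), f1)
        for name, (p, r, f1) in table.items()
    }


if __name__ == "__main__":
    for name, (recomputed, reported) in check_reported().items():
        print(f"{name}: recomputed F1 = {recomputed}%, reported F1 = {reported}%")
```

Under these assumptions, each recomputed F1-score rounds to the reported value, so the quoted precision, recall, and F1 figures are mutually consistent at whole-percent precision.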
Copyright © 2025 I. Yusuf et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium so long as the original work is properly cited.


