
Research Article
Optimizing Loan Default Prediction with Advanced Ensemble Learning Models
@INPROCEEDINGS{10.4108/eai.28-4-2025.2358020,
  author={Mohan Durga Sriram Bollu and Koshwitha B and Kavya Dharmireddi and Bharath Kumar Gorle and Mutahar Sulthana and Dinesh Koka},
  title={Optimizing Loan Default Prediction with Advanced Ensemble Learning Models},
  proceedings={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part II},
  publisher={EAI},
  proceedings_a={ICITSM PART II},
  year={2025},
  month={10},
  keywords={loan default prediction, machine learning, random forest, gradient boosting, stacked ensembling, catboost, adaboost, xgboost},
  doi={10.4108/eai.28-4-2025.2358020}
}
Mohan Durga Sriram Bollu
Koshwitha B
Kavya Dharmireddi
Bharath Kumar Gorle
Mutahar Sulthana
Dinesh Koka
Year: 2025
ICITSM PART II
EAI
DOI: 10.4108/eai.28-4-2025.2358020
Abstract
Predicting loan default is central to risk management in financial institutions. This paper presents an end-to-end approach to forecasting loan default with machine learning (ML) techniques. During preprocessing, the data were summarized with descriptive statistics, categorical variables were encoded as dummy variables, and features were normalized to improve model convergence. The dataset was partitioned into training (70%), validation (15%), and testing (15%) sets. The base models were Random Forest and three boosting algorithms (CatBoost, XGBoost, and AdaBoost), chosen to capture complex patterns in the data. A stacked ensemble then combined these models to improve predictive performance. The resulting model was evaluated with standard metrics: accuracy, precision, recall, and F1-score. The method offers an efficient solution for loan default prediction and can serve as a decision-support tool in financial applications.
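The workflow described in the abstract can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation: the data are synthetic, the column names (`income`, `loan_amount`, `purpose`) are hypothetical, the logistic-regression meta-learner is an assumption (the abstract does not name one), and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost and CatBoost so that the example needs no extra packages.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical columns, not the paper's dataset).
rng = np.random.default_rng(0)
n = 600
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(20_000, 8_000, n),
    "purpose": rng.choice(["car", "home", "education"], n),
})
# Toy label: a high loan-to-income ratio makes default more likely.
ratio = X["loan_amount"] / X["income"].clip(lower=1_000)
y = (ratio + rng.normal(0, 0.1, n) > 0.4).astype(int)

# 70% train, then split the remaining 30% evenly into validation and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=0)

# Normalize numeric features and dummy-encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["income", "loan_amount"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["purpose"]),
])

# Base learners; GradientBoostingClassifier is a stand-in for XGBoost/CatBoost.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]
# The meta-learner is trained on cross-validated base-model predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(), cv=5)

model = Pipeline([("prep", preprocess), ("stack", stack)])
model.fit(X_train, y_train)

def evaluate(X_part, y_part):
    """Return accuracy, precision, recall, and F1 for one data split."""
    pred = model.predict(X_part)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_part, pred, average="binary", zero_division=0)
    return {"accuracy": accuracy_score(y_part, pred),
            "precision": precision, "recall": recall, "f1": f1}

val_metrics = evaluate(X_val, y_val)
test_metrics = evaluate(X_test, y_test)
print("validation:", val_metrics)
print("test:", test_metrics)
```

In a setup like this, the validation split would typically guide model and hyperparameter choices, while the test split is held back for a single final estimate. Fitting the scaler and encoder inside the pipeline keeps preprocessing statistics from leaking out of the training data.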