Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part II

Research Article

Predictive Modeling of Diabetes Using Ensemble Learning and Feature Optimization

@INPROCEEDINGS{10.4108/eai.28-4-2025.2358067,
    author={M. Dhilsath Fathima and M. Akash and A. Yashwanth Reddy and G. Trilok},
    title={Predictive Modeling of Diabetes Using Ensemble Learning and Feature Optimization},
    proceedings={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part II},
    publisher={EAI},
    proceedings_a={ICITSM PART II},
    year={2025},
    month={10},
    keywords={diabetes prediction, ensemble learning, gradient boosting, feature optimization, xgboost, shap, smote, healthcare analytics},
    doi={10.4108/eai.28-4-2025.2358067}
}
M. Dhilsath Fathima1,*, M. Akash1, A. Yashwanth Reddy1, G. Trilok1
  • 1: Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology
*Contact email: dilsathveltech123@gmail.com

Abstract

Diabetes, a chronic metabolic disorder, has emerged as a major global health burden. Early and accurate detection of prediabetes is important for preventing complications such as cardiovascular disease and neuropathy. In this paper, we present a robust ensemble-based predictive framework built on extreme gradient boosting (XGBoost) and combined with advanced feature optimization methods. Data preprocessing steps, including imputation, normalization, outlier removal, and feature elimination, were applied to improve model accuracy. The Synthetic Minority Oversampling Technique (SMOTE) was used to handle class imbalance, and SHAP (SHapley Additive exPlanations) values were used to interpret feature importance. The proposed model was trained and tested on the PIMA Indian Diabetes Dataset and outperformed other classical classifiers in accuracy and AUC-ROC. The system was implemented as a web-based application for online risk prediction. We show that combining ensemble learning with optimized preprocessing enables reliable, scalable, and interpretable diabetes risk prediction.
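To make the class-imbalance step concrete, the following is a minimal standard-library sketch of the interpolation idea behind SMOTE. It is not the authors' implementation: the abstract does not say which library or settings were used, and the function name `smote`, its parameters (`n_synthetic`, `k`, `seed`), and the toy (glucose, BMI) values are all assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea).
    Hypothetical helper for illustration, not a library API."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy data: (glucose, BMI) pairs invented for this sketch, not taken from the paper.
majority = [(85.0, 22.1), (90.0, 24.3), (99.0, 26.0),
            (105.0, 27.5), (92.0, 23.8), (88.0, 21.9)]
minority = [(150.0, 33.2), (165.0, 35.1), (172.0, 38.4)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because each synthetic point lies on a line segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as accuracy and AUC-ROC are computed on unmodified data.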

Keywords
diabetes prediction, ensemble learning, gradient boosting, feature optimization, xgboost, shap, smote, healthcare analytics
Published
2025-10-14
Publisher
EAI
http://dx.doi.org/10.4108/eai.28-4-2025.2358067
Copyright © 2025 EAI