
Research Article
Predicting Diabetes Disease in the Female Adult Population, Using Data Mining
@INPROCEEDINGS{10.1007/978-3-030-99197-5_6, author={Carolina Marques and Vasco Ramos and Hugo Peixoto and Jos\'{e} Machado}, title={Predicting Diabetes Disease in the Female Adult Population, Using Data Mining}, proceedings={IoT Technologies for Health Care. 8th EAI International Conference, HealthyIoT 2021, Virtual Event, November 24-26, 2021, Proceedings}, proceedings_a={HEALTHYIOT}, year={2022}, month={3}, keywords={Data mining, Diabetes, CRISP-DM, Classification, ML models}, doi={10.1007/978-3-030-99197-5_6} }
Carolina Marques
Vasco Ramos
Hugo Peixoto
José Machado
Year: 2022
HEALTHYIOT
Springer
DOI: 10.1007/978-3-030-99197-5_6
Abstract
The aim of this study is to predict, through data mining, the incidence of diabetes in the Pima female adult population. Diabetes is a chronic disease that occurs either when the pancreas does not produce enough insulin or when the body cannot effectively use the insulin it produces, and it is a major cause of blindness, kidney failure, heart attacks, stroke and lower limb amputation. The information collected from this population, combined with data mining techniques, may help to detect the presence of this disease earlier. To achieve the best possible ML model, this work follows the CRISP-DM methodology and compares the results of five ML models (Logistic Regression, Naive Bayes, Random Forest, Gradient Boosted Trees and k-NN) obtained from two different datasets (produced by two different data preparation strategies). The study shows that the most promising model is k-NN, which achieved 90% accuracy and a 90% F1 score in the most realistic evaluation scenario.
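For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how accuracy and F1 score are commonly computed for a binary classifier. It is not the authors' evaluation code: the function name `accuracy_and_f1` and the toy labels are assumptions made purely for illustration, and the numbers are not data from the study.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for two equally long lists of binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells that the two metrics depend on.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); reported as 0.0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up toy labels (1 = diabetic, 0 = non-diabetic), not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

Accuracy counts every correct prediction equally, whereas F1 balances precision and recall on the positive class, which is why the two are often reported together when classes may be imbalanced.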