IoT Technologies for Health Care. 8th EAI International Conference, HealthyIoT 2021, Virtual Event, November 24-26, 2021, Proceedings

Research Article

Predicting Diabetes Disease in the Female Adult Population, Using Data Mining

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-030-99197-5_6,
        author={Carolina Marques and Vasco Ramos and Hugo Peixoto and Jos\'{e} Machado},
        title={Predicting Diabetes Disease in the Female Adult Population, Using Data Mining},
        proceedings={IoT Technologies for Health Care. 8th EAI International Conference, HealthyIoT 2021, Virtual Event, November 24-26, 2021, Proceedings},
        proceedings_a={HEALTHYIOT},
        year={2022},
        month={3},
        keywords={Data mining, Diabetes, CRISP-DM, Classification, ML models},
        doi={10.1007/978-3-030-99197-5_6}
    }
    
  • Carolina Marques
    Vasco Ramos
    Hugo Peixoto
    José Machado
    Year: 2022
    Predicting Diabetes Disease in the Female Adult Population, Using Data Mining
    HEALTHYIOT
    Springer
    DOI: 10.1007/978-3-030-99197-5_6
Carolina Marques1, Vasco Ramos1, Hugo Peixoto2,*, José Machado2
  • 1: University of Minho
  • 2: Centro Algoritmi, University of Minho
*Contact email: hpeixoto@di.uminho.pt

Abstract

The aim of this study is to predict, through data mining, the incidence of diabetes in the Pima female adult population. Diabetes is a chronic disease that occurs either when the pancreas does not produce enough insulin or when the body cannot effectively use the insulin it produces, and it is a major cause of blindness, kidney failure, heart attacks, stroke and lower limb amputation. The information collected from this population, combined with data mining techniques, may help to detect the presence of this disease earlier. To achieve the best possible ML model, this work follows the CRISP-DM methodology and compares the results of five ML models (Logistic Regression, Naive Bayes, Random Forest, Gradient Boosted Trees and k-NN) trained on two different datasets, each produced by a different data preparation strategy. The study shows that the most promising model is k-NN, which achieved 90% accuracy and a 90% F1 score in the most realistic evaluation scenario.
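As a rough illustration of the kind of evaluation the abstract describes, the sketch below implements a minimal k-NN classifier and computes accuracy and F1 score in plain Python. It is not the authors' pipeline: the function names (`knn_predict`, `accuracy_and_f1`), the choice of k = 3, the Euclidean distance, and the two-feature toy records are all assumptions made for illustration, and the numbers are not taken from the Pima dataset.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest
    training points, using Euclidean distance (an assumed choice)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with `positive` as the
    class of interest (here: diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical, already-scaled toy records: (feature_1, feature_2).
# Label 1 = diabetic, 0 = non-diabetic. Not real patient data.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3),
           (0.8, 0.9), (0.9, 0.8), (0.85, 0.7)]
train_y = [0, 0, 0, 1, 1, 1]

test_X = [(0.2, 0.2), (0.9, 0.9), (0.3, 0.25), (0.7, 0.8)]
test_y = [0, 1, 0, 1]

predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
acc, f1 = accuracy_and_f1(test_y, predictions)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

Reporting F1 alongside accuracy matters for a screening task like this one, since accuracy alone can look good when the positive class is a minority.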

Keywords
Data mining, Diabetes, CRISP-DM, Classification, ML models
Published
2022-03-23
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-030-99197-5_6
Copyright © 2021–2025 ICST