Data and Information in Online Environments. Second EAI International Conference, DIONE 2021, Virtual Event, March 10–12, 2021, Proceedings

Research Article

Feature Importance Investigation for Estimating Covid-19 Infection by Random Forest Algorithm

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-030-77417-2_20,
        author={Andr\'{e} Vin\'{\i}cius Gon\c{c}alves and Ione Jayce Ceola Schneider and Fernanda Vargas Amaral and Leandro Pereira Garcia and Gustavo Medeiros de Ara\'{u}jo},
        title={Feature Importance Investigation for Estimating Covid-19 Infection by Random Forest Algorithm},
        proceedings={Data and Information in Online Environments. Second EAI International Conference, DIONE 2021, Virtual Event, March 10--12, 2021, Proceedings},
        proceedings_a={DIONE},
        year={2021},
        month={6},
        keywords={Feature importance, Feature engineering, Machine learning, Prediction model, COVID-19},
        doi={10.1007/978-3-030-77417-2_20}
    }
    
  • André Vinícius Gonçalves
    Ione Jayce Ceola Schneider
    Fernanda Vargas Amaral
    Leandro Pereira Garcia
    Gustavo Medeiros de Araújo
    Year: 2021
    Feature Importance Investigation for Estimating Covid-19 Infection by Random Forest Algorithm
    DIONE
    Springer
    DOI: 10.1007/978-3-030-77417-2_20
André Vinícius Gonçalves, Ione Jayce Ceola Schneider, Fernanda Vargas Amaral, Leandro Pereira Garcia, Gustavo Medeiros de Araújo1
  • 1: PGCIN

Abstract

This work investigates which features matter most when estimating COVID-19 infection with a machine learning approach. We analyzed 175 features with the Permutation Importance method to assess their importance and to list the twenty most relevant ones for the probability of infection. Among all features, the most important were: i) the period between the date of notification and symptom onset, ii) the rate of confirmed cases in the territory of health units in the last 14 days, iii) the rate of discarded and removed cases in the health territory, iv) age, v) traffic flow variables and vi) symptom features such as fever, cough and sore throat. The model was validated and reached an average accuracy of 78.19%, with sensitivity and specificity of 83.05% and 75.50%, respectively, in the infection estimate. The proposed investigation therefore offers an alternative to guide authorities in understanding aspects related to the disease.
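As an illustration of the Permutation Importance idea named in the abstract, the following minimal sketch shuffles one feature column at a time and records how much the accuracy drops. It is not the authors' implementation: the toy threshold classifier stands in for their Random Forest, and the two features (a days-to-notification count and a pure noise column), the data, and all function names are hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(X, y) if model(row) == label)
    return correct / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, which breaks its link to the labels.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def toy_model(row):
    """Hypothetical stand-in classifier: uses feature 0, ignores feature 1."""
    return 1 if row[0] > 3 else 0


# Synthetic data (an assumption for illustration, not the study's data):
# feature 0 = days between symptom onset and notification, feature 1 = noise.
data_rng = random.Random(42)
X = [[data_rng.randint(0, 7), data_rng.random()] for _ in range(200)]
y = [1 if row[0] > 3 else 0 for row in X]

baseline, importances = permutation_importance(toy_model, X, y)
for name, value in zip(["days_to_notification", "noise"], importances):
    print(f"{name}: mean accuracy drop = {value:.3f}")
```

A feature the model depends on shows a large drop, while one it ignores shows none; ranking features by this drop is what yields a "most relevant" list. With a fitted scikit-learn estimator, `sklearn.inspection.permutation_importance` provides the same kind of measurement, though the abstract does not say which implementation the authors used.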

Keywords
Feature importance, Feature engineering, Machine learning, Prediction model, COVID-19
Published
2021-06-15
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-030-77417-2_20
Copyright © 2021–2025 ICST