Wireless Mobile Communication and Healthcare. 9th EAI International Conference, MobiHealth 2020, Virtual Event, November 19, 2020, Proceedings

Research Article

Using Bayesian Optimization to Effectively Tune Random Forest and XGBoost Hyperparameters for Early Alzheimer’s Disease Diagnosis

  • @INPROCEEDINGS{10.1007/978-3-030-70569-5_18,
        author={Louise Bloch and Christoph M. Friedrich},
        title={Using Bayesian Optimization to Effectively Tune Random Forest and XGBoost Hyperparameters for Early Alzheimer’s Disease Diagnosis},
        proceedings={Wireless Mobile Communication and Healthcare. 9th EAI International Conference, MobiHealth 2020, Virtual Event, November 19, 2020, Proceedings},
        proceedings_a={MOBIHEALTH},
        year={2021},
        month={7},
        keywords={Bayesian optimization, Computer-aided diagnosis, Early Alzheimer’s Disease diagnosis, eXtreme Gradient Boosting, Random Forests},
        doi={10.1007/978-3-030-70569-5_18}
    }
    
  • Louise Bloch
    Christoph M. Friedrich
    Year: 2021
    Using Bayesian Optimization to Effectively Tune Random Forest and XGBoost Hyperparameters for Early Alzheimer’s Disease Diagnosis
    MOBIHEALTH
    Springer
    DOI: 10.1007/978-3-030-70569-5_18
Louise Bloch¹, Christoph M. Friedrich¹
  • ¹ University of Applied Sciences and Arts Dortmund

Abstract

Many research articles have used Machine Learning (ML) for the early detection of Alzheimer’s Disease (AD), especially based on Magnetic Resonance Imaging (MRI). Most ML algorithms depend on a large number of hyperparameters. These hyperparameters strongly influence model performance, so choosing good values is important. In this article, Bayesian Optimization (BO) was used to find good hyperparameters time-efficiently for Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) models, which are tuned over four and seven hyperparameters, respectively, and promise good classification results. The models were applied to predict whether subjects with Mild Cognitive Impairment (MCI) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset will convert to AD. The initial hyperparameter combinations for BO were set using either a Latin Hypercube Design (LHD) or Random Initialization (RI). Models trained using BO and grid search reached comparable cross-validation (CV) classification accuracies, while BO was less time-consuming. Furthermore, many models trained using BO achieved better classification results on the independent test dataset than the model based on grid search. The best model, an XGBoost model trained with BO and RI, achieved an accuracy of 73.43% on the independent test dataset.
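
The difference between the two initialization strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the search space `BOUNDS`, its three parameter names and ranges, the number of initial points, and the helper functions `latin_hypercube`, `random_init`, and `strata_hit` are all assumptions chosen for illustration, and only the Python standard library is used. LHD places exactly one initial point in each of n equal-width strata per hyperparameter, whereas RI draws every point independently and may leave some strata empty.

```python
import random


def latin_hypercube(n_points, bounds, rng):
    """Draw n_points so that, per dimension, each of n_points strata holds one point."""
    columns = []
    for low, high in bounds.values():
        width = (high - low) / n_points
        # One uniform sample inside each of the n_points equal-width strata.
        column = [low + (i + rng.random()) * width for i in range(n_points)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    names = list(bounds)
    return [dict(zip(names, row)) for row in zip(*columns)]


def random_init(n_points, bounds, rng):
    """Draw n_points independently and uniformly from the search space."""
    return [
        {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n_points)
    ]


def strata_hit(points, name, bounds):
    """Return the sorted stratum index of every point along one dimension."""
    low, high = bounds[name]
    n = len(points)
    return sorted(
        min(int((p[name] - low) / (high - low) * n), n - 1) for p in points
    )


# Hypothetical search space; names and ranges are assumptions, not from the article.
BOUNDS = {
    "learning_rate": (0.01, 0.3),
    "max_depth": (2.0, 10.0),
    "subsample": (0.5, 1.0),
}

rng = random.Random(0)
lhd_points = latin_hypercube(5, BOUNDS, rng)
ri_points = random_init(5, BOUNDS, rng)

for name in BOUNDS:
    print(name, "LHD strata:", strata_hit(lhd_points, name, BOUNDS),
          "RI strata:", strata_hit(ri_points, name, BOUNDS))
```

In a BO loop, points such as `lhd_points` or `ri_points` would be evaluated first (for example by cross-validated accuracy) to fit the initial surrogate model before the optimizer proposes further hyperparameter combinations. Continuous values for integer-valued hyperparameters would need rounding before use.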