
Research Article
A Machine Learning-based approach to predicting tuberculosis in the Democratic Republic of Congo
@ARTICLE{10.4108/eetismla.9073,
  author={Pierre Tshibanda wa Tshibanda and Bopatriciat Boluma Mangata and Marina Mbombo Kabongo and Guy-Patient Mbiya Mpoyi},
  title={A Machine Learning-based approach to predicting tuberculosis in the Democratic Republic of Congo},
  journal={EAI Endorsed Transactions on Intelligent Systems and Machine Learning Applications},
  volume={2},
  number={1},
  publisher={EAI},
  journal_a={ISMLA},
  year={2025},
  month={7},
  keywords={machine Learning, tuberculosis, tuberculosis control in DRC, automatic prediction, automatic learning},
  doi={10.4108/eetismla.9073}
}
Pierre Tshibanda wa Tshibanda
Bopatriciat Boluma Mangata
Marina Mbombo Kabongo
Guy-Patient Mbiya Mpoyi
Year: 2025
ISMLA
EAI
DOI: 10.4108/eetismla.9073
Abstract
INTRODUCTION: Tuberculosis remains a public health problem in the Democratic Republic of Congo (DRC), despite advances in Machine Learning for predicting the disease. Existing models are often adapted to Asian contexts and do not account for the specific epidemiological and social characteristics of the DRC. To address this shortcoming, our study explores a Machine Learning approach specifically designed to improve the prediction of tuberculosis in the Congolese population.
OBJECTIVES: Our research question is: "What approach, based on Machine Learning and specific to the population of the DRC, is likely to improve the prediction of tuberculosis?" To answer it, we adopted an exploratory paradigm with a sequential mixed design (qualitative and quantitative). The study was conducted on a sample of 1505 patients and six healthcare professionals in the health zones of Lubumbashi and Nzanza.
METHODS: The data were collected using questionnaires and semi-structured interviews, then analysed using bivariate and multivariate approaches.
RESULTS: The results show that incorporating Congolese specificities into Machine Learning models significantly improves the prediction of tuberculosis. Of the models tested, Random Forest and Decision Tree performed best in terms of precision, recall, F1-score and AUC, while Voting Classifier, Stacking and AdaBoost offered a good compromise between precision and robustness.
CONCLUSION: This study highlights the need to develop predictive models adapted to the local context in order to improve tuberculosis control in the DRC. We propose an optimised model incorporating characteristics specific to the Congolese population, with a possible large-scale application to improve detection and prevention of the disease.
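The evaluation metrics named in the results (precision, recall, F1-score and AUC) can be illustrated with a minimal sketch in plain Python. The labels and scores below are made-up toy values, not data from the study, and the function names and the 0.5 decision threshold are assumptions chosen only for illustration; the abstract does not describe the authors' actual evaluation code.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tuberculosis, 0 = no tuberculosis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # model-predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 decision threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.4f}")
```

Precision, recall and F1 depend on the chosen threshold, whereas AUC summarises how well the scores rank positive cases above negative ones across all thresholds, which is one reason the four are usually reported together.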
Copyright © 2025 Tshibanda wa Tshibanda Pierre et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.


