
Research Article
Towards a Framework for the Preparation of High Quality Data for Use by Machine Learning Algorithms
@INPROCEEDINGS{10.1007/978-3-031-81573-7_12,
  author={Rasidatou Nabi and Yaya Traor\'{e} and Julie Thiombiano},
  title={Towards a Framework for the Preparation of High Quality Data for Use by Machine Learning Algorithms},
  proceedings={Towards new e-Infrastructure and e-Services for Developing Countries. 15th International Conference, AFRICOMM 2023, Bobo-Dioulasso, Burkina Faso, November 23--25, 2023, Proceedings, Part II},
  proceedings_a={AFRICOMM PART 2},
  year={2025},
  month={2},
  keywords={Data processing, Quality data, Missing data, Encoding, Normalization},
  doi={10.1007/978-3-031-81573-7_12}
}
Rasidatou Nabi
Yaya Traoré
Julie Thiombiano
Year: 2025
AFRICOMM PART 2
Springer
DOI: 10.1007/978-3-031-81573-7_12
Abstract
Companies and organizations now have access to a variety of data collection tools that let them amass vast amounts of data and store it in databases. Machine learning algorithms can leverage this data to extract valuable information for decision-makers. However, raw data is often of poor quality, with problems such as missing values and outliers, so technicians and domain specialists must prepare it to ensure the quality of the analysis (measured, for example, by the F1 score). Because manually identifying reliable data in a large pool is challenging and time-consuming, this article proposes a framework for preparing high-quality data for machine learning algorithms. Our approach is an architectural method that combines data preparation techniques to produce high-quality datasets.
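
The abstract names the problems the framework targets (missing data, outliers) and, in its keywords, the techniques it combines (encoding, normalization), but it does not describe the components themselves. The following is a minimal illustrative sketch of how such steps might be chained, not the authors' framework. It assumes median imputation, clipping by the interquartile-range rule, one-hot encoding, and min-max scaling; the function names, the `age` and `area` fields, and the toy records are all hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest bound."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each label to a 0/1 vector ordered by the sorted distinct labels."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def prepare(rows, numeric_key, categorical_key):
    """Chain the four steps and return one feature vector per input row."""
    numeric = [row[numeric_key] for row in rows]
    numeric = min_max_normalize(clip_outliers_iqr(impute_median(numeric)))
    encoded = one_hot_encode([row[categorical_key] for row in rows])
    return [[n] + e for n, e in zip(numeric, encoded)]


# Hypothetical toy records: one missing age and one implausible age (500).
rows = [
    {"age": 20, "area": "urban"},
    {"age": 22, "area": "rural"},
    {"age": None, "area": "urban"},
    {"age": 25, "area": "rural"},
    {"age": 27, "area": "urban"},
    {"age": 30, "area": "rural"},
    {"age": 31, "area": "urban"},
    {"age": 500, "area": "rural"},
]

features = prepare(rows, "age", "area")
for vector in features:
    print(vector)
```

In this sketch the order of the steps matters: imputation runs first so that the quartiles are computed on a complete column, and clipping runs before scaling so that a single extreme value does not compress the remaining values toward zero.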