
Research Article
Machine Learning for Drug Efficiency Prediction
@INPROCEEDINGS{10.1007/978-3-031-32029-3_27, author={Hafida Tiaiba and Lyazid Sabri and Abdelghani Chibani and Okba Kazar}, title={Machine Learning for Drug Efficiency Prediction}, proceedings={Wireless Mobile Communication and Healthcare. 11th EAI International Conference, MobiHealth 2022, Virtual Event, November 30 -- December 2, 2022, Proceedings}, proceedings_a={MOBIHEALTH}, year={2023}, month={5}, keywords={Machine Learning Text Classification Word Embedding Health Predict Drug Efficiency Natural Language Processing}, doi={10.1007/978-3-031-32029-3_27} }
- Hafida Tiaiba
- Lyazid Sabri
- Abdelghani Chibani
- Okba Kazar
Year: 2023
MOBIHEALTH
Springer
DOI: 10.1007/978-3-031-32029-3_27
Abstract
Health-related social media data, particularly patients’ opinions about drugs, have recently provided knowledge for research on the adverse reactions and allergies that patients experience, as well as on drug efficacy and safety. We develop an effective method for analyzing the efficiency of medicines and their condition-specific prescription from patient reviews in the Drug Review Dataset (Drugs.com). Our approach relies on Natural Language Processing (NLP) techniques and a word embedding vectorization method that preserves semantics. We conducted experiments using two sampling techniques, namely random sampling and balanced random sampling. We applied several statistical models (Logistic Regression, Decision Tree, Random Forests and K-Nearest Neighbors (KNN)) and several neural network models (simple perceptron, multilayer perceptron and convolutional neural network). We varied the size of the training and test data sets to study the effect of the sampling techniques on model efficiency. The results show that the models proposed in this paper (KNN, Embedding-100, and CNN-Maxpooling) outperform models proposed by other researchers. In particular, Embedding-100 achieved higher training accuracy and test accuracy. Moreover, we conclude that several factors influence the effectiveness of the models, mainly the text preprocessing method, the size and type of the sampling technique, the text vectorization method and the machine learning model.
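The abstract contrasts random sampling with balanced random sampling but does not spell out the procedure. The Python sketch below illustrates one common reading of the two terms: uniform sampling that ignores class labels, versus drawing the same number of reviews from every class. The function names, the toy reviews and the labels "effective" and "ineffective" are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_sample(records, n, seed=0):
    """Draw n records uniformly at random, ignoring class labels."""
    rng = random.Random(seed)
    return rng.sample(records, n)


def balanced_random_sample(records, n_per_class, seed=0):
    """Draw n_per_class records at random from every class."""
    rng = random.Random(seed)

    # Group the (text, label) pairs by label.
    by_label = defaultdict(list)
    for text, label in records:
        by_label[label].append((text, label))

    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(group)} records")
        sample.extend(rng.sample(group, n_per_class))

    # Shuffle so that the classes are not stored in contiguous blocks.
    rng.shuffle(sample)
    return sample


# Hypothetical, imbalanced toy data: 30 "effective" and 10 "ineffective" reviews.
reviews = [
    (f"review {i}", "effective" if i % 4 else "ineffective")
    for i in range(40)
]

plain = random_sample(reviews, 10)
balanced = balanced_random_sample(reviews, n_per_class=5)
label_counts = Counter(label for _, label in balanced)
```

With plain random sampling the class proportions of the sample tend to follow those of the full data set, so a minority class can be under-represented. The balanced variant fixes the per-class count instead, at the cost of discarding majority-class reviews.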