Research Article
A Chatbot Intent Classifier for Supporting High School Students
@ARTICLE{10.4108/eetsis.v10i2.2948,
  author={Suha Khalil Assayed and Khaled Shaalan and Manar Alkhatib},
  title={A Chatbot Intent Classifier for Supporting High School Students},
  journal={EAI Endorsed Transactions on Scalable Information Systems},
  volume={10},
  number={3},
  publisher={EAI},
  journal_a={SIS},
  year={2022},
  month={12},
  keywords={intent classification, features extraction, countvectorizer, tf-idf, multinomial naive-bayes, random forest, chatbot, nlp},
  doi={10.4108/eetsis.v10i2.2948}
}
Suha Khalil Assayed
Khaled Shaalan
Manar Alkhatib
Year: 2022
Journal: EAI Endorsed Transactions on Scalable Information Systems (SIS)
Publisher: EAI
DOI: 10.4108/eetsis.v10i2.2948
Abstract
INTRODUCTION: Intent classification is a challenging task in Natural Language Processing (NLP) because it asks the machine to understand human language by categorizing users’ requests. Intent classification therefore plays an essential role in building a chatbot that understands students’ requests. OBJECTIVES: In this study, we developed a novel chatbot called “HSchatbot” that predicts the intent of high school students’ enquiries. High school students are the most concerned among all students about their future; at this stage they therefore need instant support to prepare them to make the right decision about their career choice. METHODS: We used Multinomial Naive-Bayes and Random Forest classifiers to predict the intent of students’ enquiries, and applied feature extraction to improve the classifiers’ performance. RESULTS: Model performance was evaluated using accuracy, precision, recall, and F1 score. The Random Forest classifier performed better than Multinomial Naive-Bayes. Moreover, scores were high, exceeding 90% on all metrics. However, the Multinomial Naive-Bayes classifier achieved much better accuracy with CountVectorizer than with TF-IDF. CONCLUSION: In future work, the results will be analysed to identify the main factors that affect the performance of the Multinomial Naive-Bayes classifier, and the model will be evaluated on a larger corpus of students’ questions and enquiries.
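The comparison of count-based and TF-IDF features for a Multinomial Naive-Bayes classifier can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the toy enquiries, the intent labels, and every function and class name below are hypothetical, the TF-IDF variant skips vector normalization, and the Random Forest classifier is omitted for brevity. Libraries such as scikit-learn provide full versions of these components (`CountVectorizer`, `TfidfVectorizer`, `MultinomialNB`).

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy enquiries and intent labels; not taken from the paper's dataset.
TRAIN = [
    ("what are the admission requirements for engineering", "admission"),
    ("how do i apply for university admission", "admission"),
    ("when is the admission deadline", "admission"),
    ("are there scholarships for international students", "scholarship"),
    ("how can i get a scholarship", "scholarship"),
    ("what scholarship amount can i receive", "scholarship"),
    ("which major is best for a career in medicine", "career"),
    ("what career can i have with a computer science major", "career"),
    ("help me choose a major for my career", "career"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def count_features(docs):
    """Bag-of-words features: raw term counts per document."""
    return [Counter(tokenize(d)) for d in docs]


def tfidf_features(docs):
    """Term counts reweighted by a smoothed inverse document frequency."""
    counts = count_features(docs)
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in doc_freq}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


class MultinomialNB:
    """Multinomial Naive-Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, features, labels):
        self.vocab = {t for f in features for t in f}
        self.class_counts = Counter(labels)
        # Per-class sum of feature weights for each term.
        self.term_totals = defaultdict(lambda: defaultdict(float))
        for f, y in zip(features, labels):
            for t, w in f.items():
                self.term_totals[y][t] += w
        return self

    def predict(self, tokens):
        """Return the most probable intent for a tokenized query.

        Simplification: each query token counts once per occurrence,
        regardless of which feature weighting was used for training.
        """
        n_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for y, n_y in self.class_counts.items():
            total = sum(self.term_totals[y].values()) + self.alpha * len(self.vocab)
            score = math.log(n_y / n_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore out-of-vocabulary tokens
                    weight = self.term_totals[y].get(t, 0.0)
                    score += math.log((weight + self.alpha) / total)
            if score > best_score:
                best, best_score = y, score
        return best


if __name__ == "__main__":
    docs = [text for text, _ in TRAIN]
    labels = [intent for _, intent in TRAIN]
    query = tokenize("when is the scholarship deadline")
    for name, featurize in [("count", count_features), ("tfidf", tfidf_features)]:
        model = MultinomialNB().fit(featurize(docs), labels)
        print(name, model.predict(query))
```

Both feature functions return one term-to-weight mapping per document, so the same classifier can be trained on either and the two feature extraction choices compared on held-out enquiries.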
Copyright © 2022 Suha K. Assayed et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.