sis 23(3): e1

Research Article

A Chatbot Intent Classifier for Supporting High School Students

  • @ARTICLE{10.4108/eetsis.v10i2.2948,
        author={Suha Khalil Assayed and Khaled Shaalan and Manar Alkhatib},
        title={A Chatbot Intent Classifier for Supporting High School Students},
        journal={EAI Endorsed Transactions on Scalable Information Systems},
        volume={10},
        number={3},
        publisher={EAI},
        journal_a={SIS},
        year={2022},
        month={12},
        keywords={intent classification, features extraction, countvectorizer, tf-idf, multinomial naive-bayes, random forest, chatbot, nlp},
        doi={10.4108/eetsis.v10i2.2948}
    }
    
  • Suha Khalil Assayed
    Khaled Shaalan
    Manar Alkhatib
    Year: 2022
    A Chatbot Intent Classifier for Supporting High School Students
    SIS
    EAI
    DOI: 10.4108/eetsis.v10i2.2948
Suha Khalil Assayed1,*, Khaled Shaalan1, Manar Alkhatib1
  • 1: British University in Dubai
*Contact email: sassayed@gmail.com

Abstract

INTRODUCTION: Intent classification is a challenging task in Natural Language Processing (NLP) because it asks the machine to understand human language by categorizing users’ requests. It therefore plays an essential role in building a chatbot that understands students’ requests. OBJECTIVES: In this study, we developed a novel chatbot called “HSchatbot” that predicts the intent of high school students’ enquiries. High school students are the most concerned among all students about their future; at this stage they need instant support to help them make the right decision about their career choice. METHODS: We used Multinomial Naive-Bayes and Random Forest classifiers to predict the intent of students’ enquiries, and applied feature extraction to improve the classifiers’ performance. RESULTS: The Random Forest classifier performed better than Multinomial Naive-Bayes when the models were evaluated with accuracy, precision, recall and F1 score. Moreover, the scores exceeded 90% on all metrics. However, the Multinomial Naive-Bayes classifier was much more accurate with CountVectorizer features than with TF-IDF features. CONCLUSION: In future work, the results will be analysed to identify the main factors that affect the performance of the Multinomial Naive-Bayes classifier, and the model will be evaluated on a large corpus of students’ questions and enquiries.
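
As a rough illustration of the approach described in the abstract, the sketch below turns enquiries into bag-of-words counts (the idea behind CountVectorizer) and feeds them to a multinomial Naive-Bayes classifier with Laplace smoothing. It is not the authors' implementation: the toy enquiries, the intent labels, and the helper names `tokenize`, `train_multinomial_nb` and `predict_intent` are hypothetical, and it uses only the Python standard library. In practice one would more likely use scikit-learn's `CountVectorizer` and `MultinomialNB`.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy enquiries and intent labels (NOT the paper's dataset).
TRAIN = [
    ("what are the admission requirements", "admission"),
    ("how do i apply for admission", "admission"),
    ("when is the admission deadline", "admission"),
    ("are there scholarships for international students", "scholarship"),
    ("how can i get a scholarship", "scholarship"),
    ("what scholarship amounts are available", "scholarship"),
    ("which majors does the university offer", "majors"),
    ("what majors are good for engineering careers", "majors"),
    ("can i change my major later", "majors"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer, a simplification of CountVectorizer."""
    return text.lower().split()


def train_multinomial_nb(samples, alpha=1.0):
    """Fit class priors and Laplace-smoothed word log-likelihoods."""
    class_docs = Counter()             # number of enquiries per intent
    word_counts = defaultdict(Counter)  # word counts per intent
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total_docs = sum(class_docs.values())
    model = {"vocab": vocab, "log_prior": {}, "log_like": {}}
    for label in class_docs:
        model["log_prior"][label] = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_like"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict_intent(model, text):
    """Return the intent with the highest posterior log-probability."""
    # Words never seen in training are ignored.
    counts = Counter(t for t in tokenize(text) if t in model["vocab"])
    scores = {}
    for label, prior in model["log_prior"].items():
        like = model["log_like"][label]
        scores[label] = prior + sum(n * like[w] for w, n in counts.items())
    return max(scores, key=scores.get)


model = train_multinomial_nb(TRAIN)
print(predict_intent(model, "is there a scholarship i can apply for"))
```

Replacing the raw counts with TF-IDF weights, or swapping the classifier for a random forest, would change only the feature and model steps; the train-then-predict flow stays the same.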