sis 18: e42

Research Article

Speech emotion recognition method in educational scene based on machine learning

  • @ARTICLE{10.4108/eai.10-2-2022.173380,
        author={Yanning Zhang and Gautam Srivastava},
        title={Speech emotion recognition method in educational scene based on machine learning},
        journal={EAI Endorsed Transactions on Scalable Information Systems: Online First},
        keywords={Machine learning, Educational scenes, Speech emotion recognition, Kernel canonical correlation analysis, Support vector machine},
  • Yanning Zhang
    Gautam Srivastava
    Year: 2022
    Speech emotion recognition method in educational scene based on machine learning
    DOI: 10.4108/eai.10-2-2022.173380
Yanning Zhang (1), Gautam Srivastava (2, 3, *)
  • 1: School of Telecommunication Engineering, Beijing Polytechnic, Beijing 100176, China
  • 2: Department of Mathematics and Computer Science, Brandon University, Brandon, Canada
  • 3: Research Centre for Interneural Computing, China Medical University, Taichung, Taiwan
*Contact email:


To improve the accuracy and anti-noise performance of speech emotion recognition in educational scenes, a new method based on machine learning is studied. Speech emotional features of educational scenes are collected separately on the basis of fundamental frequency and resonance degree. Kernel canonical correlation analysis, a machine learning algorithm, nonlinearly maps the emotional feature samples to a high-dimensional feature space and analyzes the correlation between the different emotional features; the resulting nonlinear correlation between the two groups of variables is used to integrate the two kinds of speech emotional features and construct the feature samples. A support vector machine (SVM) is used to build the speech emotion recognition classifier, and a genetic algorithm is used to determine its optimal parameters. The experimental results show that the emotion recognition rate of this method exceeds 90%, and that the recognition rate for anger, fear, happiness, and sadness exceeds 95%. After a variety of noise types are added, the speech emotion recognition results remain completely consistent with the actual speech emotions, which shows that the method has high anti-noise performance.
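The abstract does not say how the genetic algorithm encodes or evolves the SVM parameters, so the following is only a minimal sketch under stated assumptions: that the tuned parameters are the penalty C and the RBF kernel width gamma, searched in log10 space, and that a simple real-coded genetic algorithm (tournament selection, blend crossover, Gaussian mutation, elitism) is used. The names `genetic_search`, `toy_fitness`, and `BOUNDS` are hypothetical. `toy_fitness` is a stand-in surface with a known peak; in a real pipeline it would be replaced by the cross-validated accuracy of the SVM trained on the fused features.

```python
import math
import random


def genetic_search(fitness, bounds, pop_size=30, generations=40,
                   mutation_rate=0.2, mutation_scale=0.5, elite=2, seed=0):
    """Maximize `fitness` over the box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)  # local generator keeps the search reproducible

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(ind):
        # Keep every gene inside its allowed interval.
        return [min(max(g, lo), hi) for g, (lo, hi) in zip(ind, bounds)]

    def tournament(scored, k=3):
        # Return the fittest of k randomly drawn individuals.
        return max(rng.sample(scored, k), key=lambda s: s[0])[1]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda s: s[0], reverse=True)
        # Elitism: the best individuals survive unchanged.
        next_pop = [ind[:] for _, ind in scored[:elite]]
        while len(next_pop) < pop_size:
            a, b = tournament(scored), tournament(scored)
            w = rng.random()
            # Blend crossover: a random convex combination of both parents.
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, mutation_scale)
                     if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(clip(child))
        population = next_pop

    best = max(population, key=fitness)
    return best, fitness(best)


def toy_fitness(ind):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Peaks at log10(C) = 1 and log10(gamma) = -2 (an arbitrary choice).
    """
    log_c, log_gamma = ind
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2))


# Assumed search ranges: log10(C) in [-3, 3], log10(gamma) in [-5, 1].
BOUNDS = [(-3.0, 3.0), (-5.0, 1.0)]

best_params, best_score = genetic_search(toy_fitness, BOUNDS)
print("C =", 10 ** best_params[0], "gamma =", 10 ** best_params[1],
      "fitness =", best_score)
```

Searching in log space is a common choice for C and gamma because useful values span several orders of magnitude. The population size, generation count, and mutation settings above are illustrative defaults, not values reported by the authors.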