Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey

Research Article

Enhancing Video Based Emotion Recognition with Multi-Head Attention and Modality Dropout

Cite (BibTeX)
    @INPROCEEDINGS{10.4108/eai.21-11-2024.2354608,
        author={Xu Li},
        title={Enhancing Video Based Emotion Recognition with Multi-Head Attention and Modality Dropout},
        proceedings={Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey},
        publisher={EAI},
        proceedings_a={CONF-MLA},
        year={2025},
        month={3},
        keywords={multimodal model, emotion recognition, modality dropout},
        doi={10.4108/eai.21-11-2024.2354608}
    }
Xu Li1,*
  • 1: Northeastern University
*Contact email: li.xu2@northeastern.edu

Abstract

Multimodal emotion recognition has become a critical component of human-computer interaction systems because of its capacity to integrate multiple modalities. In this paper, we propose a novel cross-modal fusion model, CFNSR-MSAFNet, which combines a Multi-Head Attention mechanism with modality dropout to improve the accuracy of emotion recognition. The Multi-Head Attention mechanism allows the model to attend to multiple aspects of both the audio and video inputs, capturing complex interactions between the two modalities. Additionally, modality dropout is applied during training, forcing the model to learn representations that remain robust to missing or noisy data. The proposed model achieved an accuracy of 78.33% on the RAVDESS dataset. Our results demonstrate the effectiveness of Multi-Head Attention and modality dropout in improving the performance of multimodal emotion recognition systems by enhancing cross-modal alignment and generalization.
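The abstract names two techniques without code. The sketch below illustrates both ideas in plain NumPy under stated assumptions: `modality_dropout` zeroes one randomly chosen modality with probability `p` (the drop rate and the 50/50 choice between modalities are assumptions, not details from the paper), and `multi_head_cross_attention` is a minimal multi-head scaled dot-product cross-attention over pre-extracted audio/video features, without the learned projection matrices a full model such as CFNSR-MSAFNet would use.

```python
import numpy as np

def modality_dropout(audio, video, p=0.2, rng=None):
    """With probability p, zero out one modality (chosen uniformly),
    forcing downstream layers to cope with a missing input stream."""
    rng = rng or np.random.default_rng()
    a, v = audio.copy(), video.copy()
    if rng.random() < p:
        if rng.random() < 0.5:
            a[:] = 0.0  # drop the audio stream
        else:
            v[:] = 0.0  # drop the video stream
    return a, v

def multi_head_cross_attention(q_feats, kv_feats, num_heads=4):
    """Minimal multi-head scaled dot-product cross-attention:
    queries come from one modality, keys/values from the other.
    The feature dimension is split evenly across heads; learned
    Q/K/V projections are omitted for brevity."""
    d = q_feats.shape[-1]
    assert d % num_heads == 0, "feature dim must divide across heads"
    dh = d // num_heads
    out = np.empty_like(q_feats)
    for h in range(num_heads):
        q = q_feats[:, h * dh:(h + 1) * dh]
        k = kv_feats[:, h * dh:(h + 1) * dh]
        v = kv_feats[:, h * dh:(h + 1) * dh]
        scores = q @ k.T / np.sqrt(dh)            # (Tq, Tkv) similarity
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)        # softmax over key positions
        out[:, h * dh:(h + 1) * dh] = w @ v       # weighted sum of values
    return out
```

For example, attending from 5 audio frames to 6 video frames with 8-dimensional features yields a fused tensor of shape (5, 8); each head's attention weights form a distribution over the video frames.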

Keywords
multimodal model, emotion recognition, modality dropout
Published
2025-03-11
Publisher
EAI
http://dx.doi.org/10.4108/eai.21-11-2024.2354608
Copyright © 2024–2025 EAI