
Research Article
A CNN-Based Algorithm with an Optimized Attention Mechanism for Sign Language Gesture Recognition
@INPROCEEDINGS{10.1007/978-3-031-50580-5_8,
  author={Kai Yang and Zhiwei Yang and Li Liu and Yuqi Liu and Xinyu Zhang and Naihe Wang and Shengwei Zhang},
  title={A CNN-Based Algorithm with an Optimized Attention Mechanism for Sign Language Gesture Recognition},
  proceedings={Multimedia Technology and Enhanced Learning. 5th EAI International Conference, ICMTEL 2023, Leicester, UK, April 28-29, 2023, Proceedings, Part IV},
  proceedings_a={ICMTEL PART 4},
  year={2024},
  month={2},
  keywords={Sign language, MobileNet, YOLO, gesture recognition},
  doi={10.1007/978-3-031-50580-5_8}
}
Kai Yang
Zhiwei Yang
Li Liu
Yuqi Liu
Xinyu Zhang
Naihe Wang
Shengwei Zhang
Year: 2024
ICMTEL PART 4
Springer
DOI: 10.1007/978-3-031-50580-5_8
Abstract
Sign language is the main way for people with hearing impairment to communicate with others and obtain information from the outside world, and it is an important tool for helping them integrate into society. Continuous sign language recognition is a challenging task. Most current models pay too little attention to modeling long sequences as a whole, which results in low accuracy when recognizing and translating longer sign language videos. This paper proposes a sign language recognition network based on an object detection model. First, an optimized attention module is introduced into the backbone network of YOLOv4-tiny; it optimizes channel attention and spatial attention and replaces the original feature vectors with weighted feature vectors for residual fusion. This enhances feature representation and reduces the influence of background interference. In addition, to reduce detection time, three identical MobileNet modules replace the three CSPBlock modules in the YOLOv4-tiny network, simplifying the network structure. The experimental results show that the enhanced network model improves mean average precision (mAP), precision, and recall, effectively improving the detection accuracy of the sign language recognition network.
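
The abstract does not spell out why MobileNet modules simplify the network. A likely reason is that MobileNet is built on depthwise separable convolutions, which need far fewer weights than standard convolutions. The following minimal sketch compares the two weight counts for a single layer. The function names and the channel sizes (64 in, 128 out, 3 x 3 kernel) are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int = 3) -> float:
    """Separable-to-standard weight ratio; algebraically 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    c_in, c_out = 64, 128
    print("standard conv weights:      ", standard_conv_params(c_in, c_out))
    print("depthwise separable weights:", depthwise_separable_params(c_in, c_out))
    print("ratio:                      ", round(reduction_ratio(c_in, c_out), 4))
```

For these assumed sizes the separable layer uses roughly 12% of the weights of the standard layer. The actual savings in the paper's network depend on its layer configuration, which the abstract does not give.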