
Research Article
Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting
@inproceedings{10.1007/978-3-031-67162-3_31,
  author    = {Siying Chen and Hongqing Liu and Zhen Luo and Yi Zhou},
  title     = {Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting},
  booktitle = {Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18--19, 2023, Proceedings},
  series    = {CHINACOM},
  publisher = {Springer},
  year      = {2024},
  month     = {8},
  keywords  = {Keyword spotting, Coordinate attention, Multi-scale feature fusion, Lightweight model},
  doi       = {10.1007/978-3-031-67162-3_31}
}
- Siying Chen
- Hongqing Liu
- Zhen Luo
- Yi Zhou
Year: 2024
Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting
CHINACOM
Springer
DOI: 10.1007/978-3-031-67162-3_31
Abstract
Small footprint and low computational cost are essential for Keyword Spotting (KWS) models. The baseline model BC-ResNet is a well-known representative, but it does not adequately capture the channel and global features of the input signals. To this end, this work proposes two lightweight modules, referred to as the Lightweight Residual Coordinate Attention Module (LRCA) and the Lightweight Multi-scale Feature Extraction Module (LMSFE). LRCA captures latent channel features and shallow features by introducing coordinate attention (CA) and residual connections, respectively. Unlike traditional subsampling methods, LMSFE acquires rich global features during the subsampling stage. Based on these two modules, we propose a novel network, termed the Multi-scale and Coordinate Attention Residual Network (MSCA-ResNet). Validation experiments are conducted on the public Google Speech Commands dataset v2. The results demonstrate that the proposed MSCA-ResNet significantly improves accuracy while slightly reducing parameters and FLOPs compared with the baseline.
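The paper's exact LRCA design is given in the full text; as a rough illustration of the idea it combines, the sketch below implements generic coordinate attention (directional pooling along height and width, a shared channel transform, and sigmoid gating) followed by a residual connection, in plain numpy. The shapes, the single shared weight matrix `w`, and the function name are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention_residual(x, w):
    """Hypothetical LRCA-style block (illustrative only).

    x: feature map of shape (C, H, W)
    w: (C, C) weight standing in for a 1x1 convolution over channels
    """
    C, H, W = x.shape
    # Coordinate attention pools along one spatial axis at a time,
    # so positional information along the other axis is preserved.
    pool_h = x.mean(axis=2)        # (C, H): average over width
    pool_w = x.mean(axis=1)        # (C, W): average over height
    # Shared channel transform (a 1x1 conv is a matmul over channels),
    # then sigmoid gating to form two directional attention maps.
    att_h = sigmoid(w @ pool_h)    # (C, H) attention along height
    att_w = sigmoid(w @ pool_w)    # (C, W) attention along width
    # Re-weight the input by both directional maps, then add the
    # residual (shallow-feature) path.
    out = x * att_h[:, :, None] * att_w[:, None, :]
    return out + x

# Toy usage with an identity channel transform.
x = np.random.randn(8, 16, 16)
w = np.eye(8)
y = coordinate_attention_residual(x, w)
print(y.shape)  # (8, 16, 16)
```

Because the attention weights lie in (0, 1) and the input is added back, the block can only rescale features, never suppress the shallow path entirely, which is one common motivation for pairing attention with a residual connection.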