
Research Article
Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting
@inproceedings{10.1007/978-3-031-67162-3_31,
  author    = {Siying Chen and Hongqing Liu and Zhen Luo and Yi Zhou},
  title     = {Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting},
  booktitle = {Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18--19, 2023, Proceedings},
  series    = {CHINACOM},
  publisher = {Springer},
  year      = {2024},
  month     = {8},
  keywords  = {Keyword spotting, Coordinate attention, Multi-scale feature fusion, Lightweight model},
  doi       = {10.1007/978-3-031-67162-3_31}
}
- Siying Chen
- Hongqing Liu
- Zhen Luo
- Yi Zhou
Year: 2024
Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting
CHINACOM
Springer
DOI: 10.1007/978-3-031-67162-3_31
Abstract
Small footprint and low computational cost are essential for Keyword Spotting (KWS) models. The baseline model BC-ResNet is a well-known representative, but it does not adequately capture the channel and global features of the input signals. To this end, this work proposes two lightweight modules, referred to as the Lightweight Residual Coordinate Attention Module (LRCA) and the Lightweight Multi-scale Feature Extraction Module (LMSFE). LRCA captures latent channel features and shallow features by introducing coordinate attention (CA) and residual connections, respectively. Unlike traditional subsampling methods, LMSFE acquires rich global features during the subsampling stage. Based on these two modules, we propose a novel network, termed the Multi-scale and Coordinate Attention Residual Network (MSCA-ResNet). Validation experiments are conducted on the public Google Speech Commands dataset v2. The results demonstrate that the proposed MSCA-ResNet significantly improves accuracy while slightly reducing parameters and FLOPs compared with the baseline.
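The paper's exact LRCA design is given in the full text; as a rough illustration of the idea it combines, the sketch below implements generic coordinate attention (directional pooling along height and width, a shared channel transform, and sigmoid gating) followed by a residual connection, in plain numpy. The shapes, the single shared weight matrix `w`, and the function name are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention_residual(x, w):
    """Hypothetical LRCA-style block (illustrative only).

    x: feature map of shape (C, H, W)
    w: (C, C) weight standing in for a 1x1 convolution over channels
    """
    C, H, W = x.shape
    # Coordinate attention pools along one spatial axis at a time,
    # so positional information along the other axis is preserved.
    pool_h = x.mean(axis=2)        # (C, H): average over width
    pool_w = x.mean(axis=1)        # (C, W): average over height
    # Shared channel transform (a 1x1 conv is a matmul over channels),
    # then sigmoid gating to form two directional attention maps.
    att_h = sigmoid(w @ pool_h)    # (C, H) attention along height
    att_w = sigmoid(w @ pool_w)    # (C, W) attention along width
    # Re-weight the input by both directional maps, then add the
    # residual (shallow-feature) path.
    out = x * att_h[:, :, None] * att_w[:, None, :]
    return out + x

# Toy usage with an identity channel transform.
x = np.random.randn(8, 16, 16)
w = np.eye(8)
y = coordinate_attention_residual(x, w)
print(y.shape)  # (8, 16, 16)
```

Because the attention weights lie in (0, 1) and the input is added back, the block can only rescale features, never suppress the shallow path entirely, which is one common motivation for pairing attention with a residual connection.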