Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18–19, 2023, Proceedings

Research Article

Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-67162-3_31,
        author={Siying Chen and Hongqing Liu and Zhen Luo and Yi Zhou},
        title={Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting},
        proceedings={Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18--19, 2023, Proceedings},
        proceedings_a={CHINACOM},
        year={2024},
        month={8},
        keywords={Keyword spotting, Coordinate attention, Multi-scale feature fusion, Lightweight model},
        doi={10.1007/978-3-031-67162-3_31}
    }
    
  • Siying Chen
    Hongqing Liu
    Zhen Luo
    Yi Zhou
    Year: 2024
    Multi-scale and Coordinate Attention Residual Network for Efficient Keyword Spotting
    CHINACOM
    Springer
    DOI: 10.1007/978-3-031-67162-3_31
Siying Chen (1,*), Hongqing Liu (2), Zhen Luo, Yi Zhou (2)
  • 1: School of Communications and Information Engineering
  • 2: Intelligent Speech and Audio Research Lab
*Contact email: s210101014@stu.cqupt.edu.cn

Abstract

Keyword Spotting (KWS) models require a small footprint and low computational cost. The baseline model, BC-ResNet, is a well-known representative of such models, but it does not adequately capture the channel and global features of the input signal. To address this, this work proposes two lightweight modules: the Lightweight Residual Coordinate Attention Module (LRCA) and the Lightweight Multi-scale Feature Extraction Module (LMSFE). LRCA captures potential channel features by introducing Coordinate Attention (CA) and shallow features by introducing residual connections. Unlike traditional subsampling methods, LMSFE acquires rich global features at the subsampling stage. Based on these two modules, we propose a novel network termed the Multi-scale and Coordinate Attention Residual Network (MSCA-ResNet). Validation experiments are conducted on the public Google Speech Commands dataset v2. The results demonstrate that the proposed MSCA-ResNet significantly improves accuracy while using slightly fewer parameters and FLOPs than the baseline.
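
The abstract does not give the internals of LRCA, so the following is only a minimal NumPy sketch of the general idea it names: a coordinate-attention gate wrapped in a residual connection. It follows the usual coordinate attention recipe (pool along each spatial axis, apply a shared channel projection, then produce one sigmoid gate per axis). The function name, the weight matrices `w_reduce`, `w_freq`, `w_time`, the use of ReLU without normalization, the toy shapes, and the exact placement of the residual sum are all assumptions made for illustration, not the authors' design.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention_residual(x, w_reduce, w_freq, w_time):
    """Hypothetical coordinate-attention gate with a residual connection.

    x        : (C, F, T) feature map (channels, frequency bins, time frames)
    w_reduce : (C_mid, C) shared 1x1 projection (assumed, no normalization)
    w_freq   : (C, C_mid) 1x1 projection for the frequency-axis gate
    w_time   : (C, C_mid) 1x1 projection for the time-axis gate
    """
    _, f, _ = x.shape
    # Directional pooling: one descriptor per frequency bin and per time frame.
    pooled_f = x.mean(axis=2)                          # (C, F)
    pooled_t = x.mean(axis=1)                          # (C, T)
    # Shared channel reduction over the concatenated descriptors, then ReLU.
    y = np.concatenate([pooled_f, pooled_t], axis=1)   # (C, F + T)
    y = np.maximum(w_reduce @ y, 0.0)                  # (C_mid, F + T)
    y_f, y_t = y[:, :f], y[:, f:]
    # One sigmoid gate per axis, each restored to C channels.
    a_f = sigmoid(w_freq @ y_f)                        # (C, F)
    a_t = sigmoid(w_time @ y_t)                        # (C, T)
    attended = x * a_f[:, :, None] * a_t[:, None, :]
    # Residual connection keeps the unattended (shallow) features.
    return x + attended

# Toy usage with random weights; the sizes are illustrative only.
rng = np.random.default_rng(0)
C, F, T, C_mid = 8, 40, 101, 4
x = rng.standard_normal((C, F, T))
out = coordinate_attention_residual(
    x,
    rng.standard_normal((C_mid, C)),
    rng.standard_normal((C, C_mid)),
    rng.standard_normal((C, C_mid)),
)
print(out.shape)
```

Because both gates lie in (0, 1), the output in this sketch is the input scaled elementwise by a factor between 1 and 2, so the residual path guarantees that no feature is suppressed entirely. In a trained network the three projections would be learned 1x1 convolutions rather than random matrices.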

Keywords
Keyword spotting, Coordinate attention, Multi-scale feature fusion, Lightweight model
Published
2024-08-06
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-67162-3_31