Machine Learning and Intelligent Communication. 8th EAI International Conference, MLICOM 2023, Beijing, China, December 17, 2023, Proceedings

Research Article

Learning Consistent Embedding Distribution for Robust ASR

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-71716-1_1,
        author={Zuyao Ma and Yulong Wang and Tongcun Liu and Lei Zhang and Wei Li},
        title={Learning Consistent Embedding Distribution for Robust ASR},
        proceedings={Machine Learning and Intelligent Communication. 8th EAI International Conference, MLICOM 2023, Beijing, China, December 17, 2023, Proceedings},
        proceedings_a={MLICOM},
        year={2024},
        month={9},
        keywords={Robust ASR, Distribution transformation, Speech enhancement},
        doi={10.1007/978-3-031-71716-1_1}
    }
    
  • Zuyao Ma
    Yulong Wang
    Tongcun Liu
    Lei Zhang
    Wei Li
    Year: 2024
    Learning Consistent Embedding Distribution for Robust ASR
    MLICOM
    Springer
    DOI: 10.1007/978-3-031-71716-1_1
Zuyao Ma*, Yulong Wang, Tongcun Liu, Lei Zhang, Wei Li
    *Contact email: mazuyao233@bupt.edu.cn

    Abstract

    Despite their success, existing Automatic Speech Recognition (ASR) models depend heavily on sufficient labeled clean training data, which is unrealistic in practice due to expensive labeling costs and unpredictable noise. To address this challenge, we propose a novel Distribution Transformation network (DT-net), which attempts to refine pre-trained embeddings to mitigate the influence of noise. The proposed DT-net consists of a front-end Speech Enhancement (SE) module and a back-end ASR module. In addition, two novel types of distribution transformation are introduced into the SE and ASR modules, respectively, to adapt to the distributions of clean and noisy pre-trained embeddings. Extensive experiments on the public CHiME-4 dataset show that the proposed DT-net outperforms the baselines in both recognition performance and robustness.

    Keywords
    Robust ASR, Distribution transformation, Speech enhancement
    Published
    2024-09-20
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-71716-1_1