
Research Article
Learning Consistent Embedding Distribution for Robust ASR
@INPROCEEDINGS{10.1007/978-3-031-71716-1_1,
  author={Zuyao Ma and Yulong Wang and Tongcun Liu and Lei Zhang and Wei Li},
  title={Learning Consistent Embedding Distribution for Robust ASR},
  proceedings={Machine Learning and Intelligent Communication. 8th EAI International Conference, MLICOM 2023, Beijing, China, December 17, 2023, Proceedings},
  proceedings_a={MLICOM},
  year={2024},
  month={9},
  keywords={Robust ASR, Distribution transformation, Speech enhancement},
  doi={10.1007/978-3-031-71716-1_1}
}
Zuyao Ma
Yulong Wang
Tongcun Liu
Lei Zhang
Wei Li
Year: 2024
MLICOM
Springer
DOI: 10.1007/978-3-031-71716-1_1
Abstract
Despite their success, existing Automatic Speech Recognition (ASR) models depend heavily on ample labeled clean training data, which is unrealistic in practice because labeling is expensive and noise is unpredictable. To address this challenge, we propose a novel Distribution Transformation network (DT-net), which refines pre-trained embeddings to mitigate the influence of noise. The proposed DT-net consists of a front-end Speech Enhancement (SE) module and a back-end ASR module. In addition, two novel types of distribution transformation are introduced, one in the SE module and one in the ASR module, to adapt to the distributions of clean and noisy pre-trained embeddings. Extensive experiments on the public CHiME-4 dataset show that the proposed DT-net outperforms the baselines in both recognition performance and robustness.