
Research Article
Comparative Analysis of Pretrained Models for Speech Enhancement in Noisy Environments
@INPROCEEDINGS{10.1007/978-3-031-48888-7_23,
  author={Cheegiti Mahesh and Runkana Durga Prasad and Epanagandla Asha Bibi and Abhinav Dayal and Sridevi Bonthu},
  title={Comparative Analysis of Pretrained Models for Speech Enhancement in Noisy Environments},
  proceedings={Cognitive Computing and Cyber Physical Systems. 4th EAI International Conference, IC4S 2023, Bhimavaram, Andhra Pradesh, India, August 4-6, 2023, Proceedings, Part I},
  proceedings_a={IC4S},
  year={2024},
  month={1},
  keywords={Speech Enhancement ESPNet-SE Generative Adversarial Network SepFormer SpeechBrain MetricGAN+ SpeechBrain SepFormer},
  doi={10.1007/978-3-031-48888-7_23}
}
Cheegiti Mahesh
Runkana Durga Prasad
Epanagandla Asha Bibi
Abhinav Dayal
Sridevi Bonthu
Year: 2024
IC4S
Springer
DOI: 10.1007/978-3-031-48888-7_23
Abstract
Speech enhancement is the set of techniques and algorithms aimed at improving the overall quality of speech signals across diverse conditions, both qualitatively and quantitatively. It targets voice signals whose quality has been degraded by various kinds of noise or distortion. Before the recent adoption of machine learning techniques, researchers relied on traditional methods such as Wiener filtering and spectral subtraction. The steady advancement of machine learning has laid the path for our work. We investigate the performance of three models, namely ESPNet-SE, SpeechBrain MetricGAN+, and SpeechBrain SepFormer, on a mixed dataset, VoiceBank and Demand, in which noise has been added to clean signals. Among these models, SpeechBrain MetricGAN+ performed best, outperforming ESPNet-SE by approximately 30.05% and SpeechBrain SepFormer by approximately 10.29%. Trained models are publicly available.
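The abstract reports percentage gains between models without naming the metric or the formula behind them. One common reading is a relative improvement on a higher-is-better quality score. The sketch below illustrates that reading only; the function name `relative_improvement` and the scores are hypothetical placeholders, not values or methods taken from the paper.

```python
def relative_improvement(candidate: float, baseline: float) -> float:
    """Return the percentage by which `candidate` exceeds `baseline`.

    Assumes a higher-is-better metric; this formula is an assumption,
    not necessarily the one used by the authors.
    """
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (candidate - baseline) / abs(baseline) * 100.0


# Placeholder scores for illustration only; NOT results from the paper.
example_scores = {"model_a": 3.0, "model_b": 2.4}

gain = relative_improvement(example_scores["model_a"], example_scores["model_b"])
print(f"model_a improves on model_b by {gain:.2f}%")
```

Under this reading, a gain is always stated relative to the weaker model's score, so the same absolute difference yields a larger percentage against a lower baseline.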