Automatic Speech Recognition for Indian Accent Lectures contents using End-to-End Speech Recognition model

Ashok Kumar L; Karthika Renuka D; Raajkumar G

Proceedings of the First International Conference on Combinatorial and Optimization, ICCAP 2021, December 7-8 2021, Chennai, India

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.4108/eai.7-12-2021.2314531,
    author={Ashok Kumar L and Karthika Renuka D and Raajkumar G},
    title={Automatic Speech Recognition for Indian Accent Lectures contents using End-to-End Speech Recognition model},
    proceedings={Proceedings of the First International Conference on Combinatorial and Optimization, ICCAP 2021, December 7-8 2021, Chennai, India},
    publisher={EAI},
    proceedings_a={ICCAP},
    year={2021},
    month={12},
    keywords={automatic speech recognition (asr), indian accent, word error rate (wer), nptel lecture audio, listen, attend, and spell (las)},
    doi={10.4108/eai.7-12-2021.2314531}
}

Ashok Kumar L¹,*, Karthika Renuka D¹, Raajkumar G¹

1: PSG College of Technology

*Contact email: lak.eee@psgtech.ac.in

Abstract

Automatic speech recognition (ASR), the process of converting speech to text, is used in a variety of voice search applications. Most ASR research has been carried out on American and British accents. In this work, we therefore attempt to convert Indian-accented speech to text using NPTEL lecture audio. The proposed work performs speech-to-text conversion for Indian-accented speech using deep learning models. The Listen, Attend and Spell (LAS) model has two main components, one of which is based on a sequence-to-sequence framework with a pyramid structure; by reducing the number of encoder steps, it shortens the sequence that the decoder must attend over in the end-to-end process. The results obtained from the proposed work improve the Word Error Rate (WER), with a reported figure of 14%.
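
The pyramid structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each pyramid layer concatenates pairs of consecutive frames, halving the number of time steps, and that three such layers are stacked; neither the reduction factor nor the layer count is stated in the abstract. The function names `pyramid_reduce` and `stack_pyramid` are hypothetical, and the recurrent layers that would sit between reductions in a real encoder are omitted.

```python
from typing import List

Frame = List[float]


def pyramid_reduce(frames: List[Frame]) -> List[Frame]:
    """Concatenate each pair of consecutive frames, halving the time steps."""
    # Drop a trailing odd frame so the frames pair up evenly (illustrative choice).
    usable = len(frames) - (len(frames) % 2)
    return [frames[t] + frames[t + 1] for t in range(0, usable, 2)]


def stack_pyramid(frames: List[Frame], num_layers: int = 3) -> List[Frame]:
    """Apply the time reduction num_layers times (3 is an assumed value)."""
    for _ in range(num_layers):
        frames = pyramid_reduce(frames)
    return frames


# 16 dummy acoustic frames with 2 features each (made-up data).
frames = [[float(t), float(t) + 0.5] for t in range(16)]
reduced = stack_pyramid(frames, num_layers=3)
print(len(frames), "->", len(reduced))  # 16 -> 2
print(len(reduced[0]))                  # 16 features per reduced frame
```

Under these assumptions, the decoder attends over 2 encoder steps instead of 16, while each remaining step carries the concatenated features of 8 original frames.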

Keywords: automatic speech recognition (asr), indian accent, word error rate (wer), nptel lecture audio, listen, attend, and spell (las)
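
For readers unfamiliar with the Word Error Rate metric named in the abstract and keywords, the following is a minimal sketch of the standard definition, WER = (S + D + I) / N, where S, D and I are the word substitutions, deletions and insertions needed to turn the reference into the hypothesis and N is the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical; this is not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one insertion over 5 reference words.
wer = word_error_rate(
    "the lecture covers speech recognition",
    "the lecture cover speech recognition today",
)
print(f"WER = {wer:.2f}")  # WER = 0.40
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.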

Published: 2021-12-22
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.7-12-2021.2314531
