
Research Article
Unraveling the Techniques for Speaker Diarization
@INPROCEEDINGS{10.1007/978-3-031-48888-7_25,
  author={Ganesh Pechetti and Anakapalli Rohini Durga Bhavani and Abhinav Dayal and Sreenu Ponnada},
  title={Unraveling the Techniques for Speaker Diarization},
  proceedings={Cognitive Computing and Cyber Physical Systems. 4th EAI International Conference, IC4S 2023, Bhimavaram, Andhra Pradesh, India, August 4-6, 2023, Proceedings, Part I},
  proceedings_a={IC4S},
  year={2024},
  month={1},
  keywords={Speaker Diarization, Segmentation, Voice Activity Detection, Pyannote, Kaldi, NeMo},
  doi={10.1007/978-3-031-48888-7_25}
}
Ganesh Pechetti
Anakapalli Rohini Durga Bhavani
Abhinav Dayal
Sreenu Ponnada
Year: 2024
IC4S
Springer
DOI: 10.1007/978-3-031-48888-7_25
Abstract
This paper contributes to the field of speaker diarization by analyzing existing audio datasets in depth and evaluating prominent models. The study assesses how suitable these datasets are for speaker diarization tasks and examines the performance of models such as pyannote speaker diarization and NVIDIA NeMo speaker diarization. For researchers new to the field, the paper offers a foundation, with guidance and resources for experimentation. The evaluation shows that each model has its advantages, but also limitations that must be weighed when selecting one. Overall, the paper covers audio dataset analysis, model evaluation, and model selection considerations for speaker diarization, giving researchers the knowledge to make informed decisions and laying the groundwork for further advances in the field.