10th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS)

Research Article

Efficient Feature Vector Clustering for Automatic Speech Recognition Systems

  • @INPROCEEDINGS{10.4108/eai.22-3-2017.152399,
        author={Lilia Lazli and Mounir Boukadoum and Otmane Ait Mohamed and Mohamed-Tayeb Laskri},
        title={Efficient Feature Vector Clustering for Automatic Speech Recognition Systems},
        proceedings={10th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS)},
        publisher={ACM},
        proceedings_a={BICT},
        year={2017},
        month={3},
        keywords={unsupervised speech clustering, genetic algorithm, fuzzy c-means algorithm, speech recognition system, hmm/mlp},
        doi={10.4108/eai.22-3-2017.152399}
    }
    
  • Lilia Lazli
    Mounir Boukadoum
    Otmane Ait Mohamed
    Mohamed-Tayeb Laskri
    Year: 2017
    Efficient Feature Vector Clustering for Automatic Speech Recognition Systems
    BICT
    ACM
    DOI: 10.4108/eai.22-3-2017.152399
Lilia Lazli (1, *), Mounir Boukadoum (2), Otmane Ait Mohamed (3), Mohamed-Tayeb Laskri (4)
  • 1: ÉTS, University of Quebec
  • 2: Department of Computer Science, UQAM, University of Quebec, Montreal, Quebec, Canada
  • 3: Department of Electrical Engineering and Computer Science, Concordia University, Montreal, Quebec, Canada
  • 4: Department of Computer Science, UBMA, University of Badji Mokhtar, Annaba, Algeria
*Contact email: lilia.lazli.1@ens.etsmtl.ca

Abstract

In this paper, we present an efficient algorithm for the clustering of speech data. The algorithm is based on regulating a similarity measure to set the number of clusters and the cluster boundaries. It thus overcomes the shortcomings of conventional clustering algorithms such as k-Means and Fuzzy C-Means, which require a priori knowledge of the number of clusters and a similarity measure that follows the data distribution, and which are sensitive to the choice of initial configuration. The algorithm's performance was tested in an HMM/MLP automatic speech recognition system, and the results were compared with those obtained when a combination of Fuzzy C-Means and genetic algorithms performs the clustering. The proposed algorithm showed better performance.
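
The abstract does not spell out the algorithm, so the following is not the authors' method. It is a minimal Python sketch of a simpler, well-known idea (threshold-based "leader" clustering) that illustrates how a similarity threshold, rather than a preset cluster count, can determine how many clusters emerge. The function name `leader_clustering`, the Euclidean distance, the `threshold` parameter, and the toy 2-D points are all assumptions made for illustration; real speech feature vectors would have many more dimensions.

```python
import math


def leader_clustering(vectors, threshold):
    """Illustrative threshold-based clustering (not the paper's algorithm).

    Each vector joins the nearest existing cluster if its Euclidean distance
    to that cluster's centroid is at most `threshold`; otherwise it starts a
    new cluster. The number of clusters is therefore a result, not an input.
    """
    centroids = []  # running mean of each cluster
    counts = []     # number of members in each cluster
    labels = []     # cluster index assigned to each input vector

    for v in vectors:
        # Find the nearest existing centroid, if any.
        best, best_dist = None, math.inf
        for i, c in enumerate(centroids):
            d = math.dist(v, c)
            if d < best_dist:
                best, best_dist = i, d

        if best is not None and best_dist <= threshold:
            # Join the cluster and update its centroid incrementally.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [ci + (vi - ci) / n
                               for ci, vi in zip(centroids[best], v)]
            labels.append(best)
        else:
            # Too dissimilar to every cluster: open a new one.
            centroids.append(list(v))
            counts.append(1)
            labels.append(len(centroids) - 1)

    return labels, centroids


# Hypothetical toy data standing in for feature vectors.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
labels, centroids = leader_clustering(points, threshold=1.0)
print(labels, len(centroids))
```

In this sketch, tightening or loosening `threshold` changes the number of clusters without any a priori cluster count, which is the contrast with k-Means and Fuzzy C-Means that the abstract draws. Unlike the approach described in the abstract, this simple variant is order-dependent and uses a fixed threshold rather than a regulated similarity measure.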