A Neuro Fuzzy Classifier with Linguistic Hedges for Speech Recognition

Fuzzy classification is the task of partitioning a feature space into fuzzy classes. A Neuro fuzzy classifier with linguistic hedges is proposed for noisy and clean speech classification. The linguistic hedges (LH) are used to refine the meaning of the fuzzy rules up to the secondary level. Fuzzy entropy is applied to select the optimal MFCC features for framing the rules of the fuzzy inference system. Results obtained from the proposed classifier are compared with those of conventional classifiers and the Neuro Fuzzy Classifier (NFC). The classification rates of the proposed model are better than those of the other traditional and conventional fuzzy classifiers. An improvement in classification accuracy of 0.22% to 5% is observed for the FSDD dataset, and of 5% to 11% for the Kannada dataset. This study shows that LH plays a major role in classifying overlapped classes of data.


Introduction
Speech classification and recognition applications face many challenges due to the non-linearities and noise present in speech signals and their environments. Many of these challenges are encountered in traditional computing system implementations. As the variability of speech differs from one person to another, more efficient computation models than ordinary conventional models are required. Hence Adaptive Neuro Fuzzy Systems [1] are used to construct an efficient fuzzy predictor model that predicts speech classes effectively. Due to the fuzziness in overlapped classes, conventional classifiers fail to classify the speech data. Neuro Fuzzy Classifiers (NFC) have been proposed to overcome these problems, but NFC still does not classify the overlapped classes satisfactorily. To increase the classification performance on overlapped classes, the concept of fuzzy Linguistic Hedges (LH) is adopted in this work for classifying noisy and clean speech signals. The paper is organized as follows: Section 2 discusses the existing fuzzy and conventional classifiers available in the literature. Section 3 discusses the procedures adopted during LH-NFC model building. Section 4 presents the algorithm for the proposed system. Results are tabulated and discussed in Section 5. Conclusions and future enhancements are discussed in Section 6.

Literature Survey
This section reviews the existing literature on conventional fuzzy classifiers, Neuro Fuzzy Classifiers and the adaptive Neuro Fuzzy Inference System. The first mathematical model using the fuzzy concept was applied to SVM and is called the Fuzzy Support Vector Machine (FSVM) [10]. The adaptive Neuro fuzzy classifier was proposed by Jang [3] in 1992. The model was developed to simulate an input-output mapping based on human knowledge. An adaptive Neuro-fuzzy classifier was also designed to classify and select features [8], using a four-layered feed-forward network for a fuzzy rule-based classifier. The network was trained by the scaled conjugate gradient (SCG) algorithm to determine the optimum values of the nonlinear parameters.
An adaptive classification model [9] was designed to integrate a standard fuzzy inference system and a neural network with supervised learning. The fuzzy rules are generated from the numerical data. Triangular membership functions were used for both feature extraction and inference engine design.
The hybrid of neural networks and fuzzy systems, known as the Neuro fuzzy computing system, was proposed by Pal and Ghosh in 1996 [3, 4]. A Neuro fuzzy classifier [5] for spoken words from Gujarati and English datasets was developed to demonstrate robustness to noisy signals. The feasibility of a Neuro fuzzy classifier [6] was studied for discrete dependent and independent data sets of phonemes and syllables. Kohonen and LVQ networks were used for compacting and classifying the data, and the Neuro fuzzy system for classification. The experimental results show good precision, up to 95% to 96% for ANFIS. A Neuro fuzzy classifier for Thai spoken word recognition is discussed in [7] for words recorded in noisy and clean environments. Phoneme recognition using MFCC features is discussed in [11]. An ANFIS learning system was developed using subtractive clustering to minimize the rules. A hybrid learning approach with gradient descent and least squares estimation procedures was adopted to identify the optimal set of antecedent and consequent values. Paper [12] discusses the use of wavelet features and subtractive clustering techniques to reduce the rules in classification. A Type-1 Adaptive Neural Fuzzy Inference System [13] was developed for speech recognition using MFCC features. ANFIS with MFCC features has been tried for a robotic application in [14].

Techniques used in the proposed work
This section presents the procedures adopted for feature extraction, feature reduction, the use of Linguistic Hedges in building the inference system, and the classification process in modeling the proposed LH-NFC system.

Feature extraction:
i) Mel-frequency cepstral coefficients (MFCC) [15]: This is one of the best and most common methods used to extract speech features. Since the frequency bands are equally spaced on the mel scale, approximating the response of the human auditory system, these features are preferred for speech recognition applications. The main steps of MFCC are as follows:
(i) The speech signal is windowed to reduce spectral distortion, followed by a Fourier transform.
(v) The amplitudes of the resulting spectrum are taken as the MFCCs; 12 MFCC features are considered.
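The MFCC procedure can be illustrated with a minimal NumPy sketch. It follows the textbook pipeline (framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II) and is not the authors' implementation. The frame length (25 ms), frame step (10 ms), FFT size (512), number of filters (26) and the choice to drop the zeroth coefficient are assumptions; only the count of 12 coefficients comes from the text.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising edge
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate, n_mfcc=12, n_filters=26,
         frame_len=0.025, frame_step=0.010, n_fft=512):
    """Return an array of shape (n_frames, n_mfcc). All defaults are assumed values."""
    signal = np.asarray(signal, dtype=float)
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(signal) < flen or flen > n_fft:
        raise ValueError("signal too short or frame longer than n_fft")
    n_frames = 1 + (len(signal) - flen) // fstep
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(flen)                 # windowing
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft  # power spectrum
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))
    # DCT-II over the filterbank axis, keeping coefficients 1..n_mfcc
    n = np.arange(n_filters)
    k = np.arange(1, n_mfcc + 1)
    basis = np.cos(np.pi * np.outer(k, 2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T
```

For a one-second recording at 8 kHz, `mfcc(signal, 8000)` yields one 12-dimensional feature vector per 10 ms frame under these assumed settings.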

ii) Fuzzy Entropy [16]:
Fuzzy entropy has been suggested for building fuzzy inference systems. The distinguishing property of fuzzy sets is that they capture the idea of partial membership. Fuzzy entropy is introduced by analogy with probabilistic entropy [25].
The Shannon entropy concept is used to optimize the features from the data file by removing the least informative feature.
Let A be a fuzzy set with membership function µA. The x_i are the possible outputs from source A, each with probability P(x_i), as in eq. (1):
H(A) = -\sum_{i=1}^{N} P(x_i) \log P(x_i)    (1)
where N is the number of possible outputs and P(x_i) is the probability of each item x_i. The fuzzy entropy measure is considered as a fuzzy measure that evaluates the global deviation from ordinary sets, i.e. from any crisp set, and is used here to reduce the features.
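A small sketch may help make the entropy-based feature ranking concrete. It computes the Shannon entropy of a probability vector and, as an assumed fuzzy counterpart, the De Luca-Termini fuzzy entropy of a vector of membership degrees. The paper does not state which fuzzy entropy formula or which selection rule was used, so the function names, the base-2 logarithm, and the convention of ranking low-entropy (crisper) features first are all assumptions.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in bits) of a probability vector; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def fuzzy_entropy(mu):
    """De Luca-Termini fuzzy entropy of membership degrees in [0, 1] (assumed measure)."""
    mu = np.clip(np.asarray(mu, dtype=float), 1e-12, 1.0 - 1e-12)
    return float(-np.sum(mu * np.log2(mu) + (1.0 - mu) * np.log2(1.0 - mu)))

def rank_features(memberships):
    """
    memberships: array of shape (n_samples, n_features) with values in [0, 1].
    Returns (order, scores), where order lists feature indices from lowest to
    highest fuzzy entropy. Treating high-entropy features as the least
    informative ones is an assumption of this sketch.
    """
    memberships = np.asarray(memberships, dtype=float)
    scores = np.array([fuzzy_entropy(memberships[:, j])
                       for j in range(memberships.shape[1])])
    return np.argsort(scores), scores
```

A membership of 0.5 is maximally fuzzy and contributes one bit, while memberships near 0 or 1 contribute almost nothing, so features whose samples sit near the crisp extremes rank first.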

iii) Linguistic Hedges (LH)
Linguistic hedges define the CONS (concentration) and DILN (dilation) unary operators on fuzzy sets. In the conventional approach, each primary linguistic truth-value, i.e. true or false, is semantically represented by a fuzzy set in the interval [0, 1]. With linguistic hedges, the composite linguistic terms form fuzzy sets whose truth-values lie between the minimum and maximum values given by CONS and DILN.

LHs Representation
Linguistic hedges are operator values applied between the concentration (CONS) and dilation (DILN) values. The LH membership value is calculated using equation (2):
\mu_{A^p}(x) = [\mu_A(x)]^p    (2)
where p is the linguistic hedge value of the linguistic term A.
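The effect of a hedge on a membership value can be sketched as follows, assuming the common power form in which the membership degree is raised to the hedge value p. Under Zadeh's classical definitions, concentration corresponds to p = 2 ("very") and dilation to p = 0.5 ("more or less"); the function names are illustrative only.

```python
import numpy as np

def apply_hedge(mu, p):
    """Raise membership degrees mu (in [0, 1]) to the hedge power p."""
    return np.asarray(mu, dtype=float) ** p

def concentration(mu):
    """CONS: sharpens the fuzzy set (p = 2)."""
    return apply_hedge(mu, 2.0)

def dilation(mu):
    """DILN: widens the fuzzy set (p = 0.5)."""
    return apply_hedge(mu, 0.5)
```

With p > 1 intermediate memberships shrink, so the set becomes narrower; with p < 1 they grow, so the set becomes wider; p = 1 leaves the set unchanged. Memberships of exactly 0 or 1 are unaffected for any positive p.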

iv) Linguistic Hedge-Neuro Fuzzy Classifier (LH-NFC)
A Neuro fuzzy system [17, 18, 19, 20, 21] is a combination of neural network and fuzzy systems. In this work linguistic hedges are applied to the NFC to obtain the refined LH-NFC model. The NFC retrieves the features from inputs belonging to different classes. Not all features are equally important in discriminating between the classes, but the feature-wise belongingness helps in the classification process. The LH-NFC process consists of five phases, as discussed below.
1. In the first phase, the input values are fuzzified using a Gaussian membership function [22].
4. Weights are calculated based on the fuzzy rules to identify the class.
5. Defuzzification is performed to obtain the crisp output by applying the weighted average method.
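The first and last phases can be sketched in a few lines. The Gaussian membership function below uses the usual center and width parameterization, and the defuzzifier is the standard weighted average of representative output values; the parameter names and the use of one representative value per output fuzzy set are assumptions, not details given in the text.

```python
import numpy as np

def gaussian_membership(x, center, sigma):
    """Degree to which x belongs to a Gaussian fuzzy set with the given center and width."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def weighted_average_defuzzify(degrees, values):
    """
    Crisp output as the membership-weighted average of representative values.
    degrees: membership degree of each output fuzzy set
    values:  representative crisp value of each output fuzzy set (assumed)
    """
    degrees = np.asarray(degrees, dtype=float)
    values = np.asarray(values, dtype=float)
    total = degrees.sum()
    if total == 0:
        raise ValueError("all membership degrees are zero")
    return float(np.dot(degrees, values) / total)
```

An input equal to the center of a fuzzy set receives membership 1, and the membership decays smoothly as the input moves away from the center at a rate set by sigma.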

The Proposed Method
This section presents the block diagram and the algorithm steps for building the Linguistic Hedge-Neuro Fuzzy Classifier [27, 28, 29, 30] for clean and noisy speech classification. The features are extracted with MFCC and selected with fuzzy entropy. The LH-powered fuzzy features are trained and classified using the adaptive LH-NFC classifier with SCG. Applying linguistic hedge values sharpens the meaning of the fuzzy rules and increases the classification accuracy. The algorithm is as follows:
Step 1: Features are extracted using the MFCC procedure.
Step 2: Optimal features are selected using the fuzzy entropy technique.
Training process:
t_pq - LH value of the p-th rule and the q-th feature for each class.
Step 5: The firing strength of the p-th rule is calculated, where D is the number of features.
Step 6: The weighted outputs are calculated for each class by equation (6), where w_jk is the amount of belonging to the k-th class controlled by the j-th rule, O_pk is the weighted output for the p-th example belonging to the k-th class, and M is the total number of rules.
Step 7: The outputs are normalized for values greater than 1 using equation (7), where d_ik is the normalized degree of the i-th sample for the k-th class and K is the number of different outputs (classes).
Step 8: The maximum normalized degree is determined, where k varies from 1 to K and R_i is the class label.
Step 9: The weighted average is used to defuzzify the data by mapping the fuzzy sets and the corresponding membership degrees. The crisp weighted average is computed by equation (9), where C1, C2, ..., Cn are the output fuzzy sets.
Step 10: Model testing.
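The inference part of the algorithm (Steps 5 to 8) can be sketched as a single forward pass. The sketch assumes Gaussian memberships, hedge values used as exponents on the memberships, a product over the D features for the firing strength, and normalization of every output row by its sum; the exact equations of the paper may differ. Training of the centers, widths, hedges and weights with SCG is not shown, and all array names are illustrative.

```python
import numpy as np

def lh_nfc_forward(X, centers, sigmas, hedges, weights):
    """
    X:       (N, D) input features
    centers: (M, D) Gaussian centers, one row per rule
    sigmas:  (M, D) Gaussian widths
    hedges:  (M, D) linguistic hedge value per rule and feature
    weights: (M, K) degree to which each rule belongs to each class
    Returns (normalized, labels): (N, K) normalized degrees and (N,) class indices.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    hedges = np.asarray(hedges, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Gaussian membership of every sample in every rule/feature fuzzy set: (N, M, D)
    mu = np.exp(-0.5 * ((X[:, None, :] - centers[None]) / sigmas[None]) ** 2)
    # Hedge applied as a power on the membership
    powered = mu ** hedges[None]
    # Firing strength of each rule: product over the D features -> (N, M)
    firing = powered.prod(axis=2)
    # Weighted output per class: sum over the M rules -> (N, K)
    outputs = firing @ weights
    # Normalize each row so the class degrees sum to 1
    normalized = outputs / np.maximum(outputs.sum(axis=1, keepdims=True), 1e-12)
    # Class label: index of the maximum normalized degree
    labels = normalized.argmax(axis=1)
    return normalized, labels
```

With one rule per class and identity weights, a sample lying close to the center of a rule is assigned to that rule's class, and hedge values other than 1 change how sharply membership falls off per feature.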

Dataset
Two datasets are considered: the Free Spoken Digit Dataset (FSDD) [24] and a Kannada data set, consisting of recordings of spoken digits and words sampled at 8 kHz and 16 kHz respectively. The recordings are trimmed to have minimal silence at the beginning and end. FSDD consists of English pronunciations of the numbers from one to nine by four different speakers. In total, 900 signals are collected, 100 for each digit.
The second data set consists of isolated Kannada words. Utterances from 30 speakers (20 male and 10 female) are collected, giving 1000 word samples. The data set is made noisy by artificially adding Gaussian noise [26] at SNRs (signal-to-noise ratios) of 5 dB, 10 dB and 15 dB.
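Adding Gaussian noise at a target SNR can be done as in the following sketch. It scales zero-mean white Gaussian noise so that the ratio of average signal power to noise power matches the requested value in dB. The function name and the use of the whole recording to estimate signal power are assumptions; the paper does not describe its exact procedure.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise at the requested SNR (in dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise
```

Calling this with `snr_db` set to 5, 10 and 15 produces three noisy versions of each clean recording, with lower values corresponding to stronger noise.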

Results and observations
In this study, an adaptive Neuro-fuzzy classifier is developed using linguistic hedges for classifying noisy and clean speech signals at various SNRs. The fuzzy classification rules are improved with linguistic hedges to enhance the meaning of the rules to the secondary level. The linguistic hedges are tuned by the scaled conjugate gradient over the training iterations rather than being kept constant. There is an average increase in classification accuracy of 0.22% to 5% for the FSDD data set, as shown in Table 3. In this work LH-NFC is tried on the Kannada data set for the first time. Compared with the conventional classifiers, LH-NFC improves accuracy by 5% to 9% for Kannada utterances, as shown in Table 2. The membership function for each class is represented in Figure 2, in which each curve is the membership plot of an individual class. The Gaussian membership rule values are depicted in Figure 3.
The comparison of the Neuro Fuzzy Classifier and the Linguistic Hedges-Neuro Fuzzy Classifier is tabulated in Table 3, with their accuracies and Root Mean Square Error (RMSE) values for both clean and noisy signals. It is observed from Table 3 that LH-NFC has better recognition accuracies with lower RMSE values at all SNR levels. Table 4 shows the linguistic hedge values for every class in the English data set. The maximum value in each row identifies the belongingness for each feature class. Plot 6 and Plot 7 present the recognition accuracies of the traditional NFC and the proposed LH-NFC. The performance comparison of all the conventional classifiers with the proposed classifier, with recognition accuracies for both datasets, is tabulated in Table 5. The proposed LH-NFC shows improved performance for both clean and noisy datasets. Table 6 presents the confusion matrix for the FSDD data set.
The following observations are made:
1. Applying LH changes the input space of the fuzzy sets, for better handling of overlapped classes.
2. SCG is a better choice for improving the learning rate and the convergence rate.
3. Simple clustering algorithms can also be used to cluster the labelled data when forming the rules.
4. The classification performance can be improved by normalizing the data.
5. In Kannada, words like Nalku, Nale and Neeru fall approximately into the same group, i.e. they are samples of overlapping classes (differing at the phoneme level). Such overlapped classes are handled well by LH-NFC models.
6. Uncertainties resulting from incomplete or imprecise input information, ambiguity or vagueness in the input data, overlapping boundaries among classes or regions, and indefiniteness in the data are handled well by LH-NFC in extracting features.
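The quantities reported in the comparison tables (recognition accuracy, RMSE and the confusion matrix) can be computed along the following lines. The paper does not define how its RMSE is taken, so computing it between one-hot class targets and the classifier's normalized outputs is an assumption of this sketch, as are the function names.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def rmse(targets, outputs):
    """Root mean square error between target and output arrays of equal shape."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return float(np.sqrt(np.mean((targets - outputs) ** 2)))

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

Off-diagonal entries of the confusion matrix show which classes are confused with each other, which is where overlapped classes such as acoustically similar words would be expected to appear.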

Conclusions
In this study, an adaptive Neuro-fuzzy classifier using linguistic hedges is proposed. Optimal features are obtained by applying the fuzzy entropy technique. The fuzzy classification rules are improved by applying linguistic hedges, which helps define the rules more sharply for the overlapped classes. Using LH, the classification rate improves by 0.22% to 5% for FSDD and by 5% to 11% for the Kannada data set compared to other classification models. The results demonstrate that using SCG increases the convergence rate and decreases the RMSE. The application of LH helps in better classification of overlapped classes of clean and noisy signals.