Hearing loss classification via stationary wavelet entropy and Biogeography-based optimization

Introduction: Sensorineural hearing loss is associated with many complications and needs timely detection and diagnosis. Objectives: To optimize the sensorineural hearing loss detection system and improve the accuracy of image-based detection. Method: Stationary wavelet entropy (SWE) was used to extract features from the NMR images, a single-hidden-layer neural network was used for classification, and the BBO algorithm was used for optimization to avoid becoming trapped in local optima. We used two-level SWE as input to the classifier to enhance its ability to identify and classify hearing loss. Results: The results of 10-fold cross validation show that the accuracies for HC, LHL and RHL are 91.83±3.09%, 92.67±2.38% and 91.17±2.61%, respectively. The overall accuracy is 91.89±0.70%. Conclusion: This model performs well in detecting hearing loss.


Introduction
Sensorineural hearing loss (SNHL) is the most common sensory deficit in the world. It is caused by the dysfunction of one or more parts of the auditory pathway between the inner ear and the auditory cortex [1]. It primarily manifests as progressive unilateral or bilateral hearing impairment of varying severity, up to deafness, accompanied by tinnitus, a sensation of intra-aural occlusion and the like [2]. It often leads to depression, falls, lower intelligence, speech and language delay, and other complications. [...] standard for neurological hearing loss, so as to achieve the purpose of detection [3]. Therefore, based on the above characteristics, scholars generally choose to combine computer vision with machine learning to detect hearing loss [4][5][6][7][8], and previous experiments have shown the effectiveness of such systems. However, many existing detection systems easily become trapped in local optima when training and optimizing neural networks. We put forward a new idea: we replace the previous optimization algorithm with the BBO algorithm, whose mutation mechanism helps the system avoid settling in a short-sighted local optimum.
In addition, we propose to use SWE so that small changes in the input do not cause large deviations in the experimental results. Finally, the overall accuracy of our model reached 91.89±0.70%.
*Corresponding author. Email: Yaochong@home.hpu.edu.cn

Background
Several methods have been proposed in the past to analyze hearing loss images and brain images. O. Profant et al. (2014) [9] used MR morphometry and DTI to study SNHL. They mainly studied the physiological changes in patients with hearing loss, which contributed greatly to the image analysis research of later scholars. F. Liu et al. (2017) [10] proposed combining wavelet entropy with a feedforward neural network trained by a genetic algorithm to detect hearing loss. Their method, using 4-level decomposition, yielded an overall accuracy of 81.11±1.34%. Their early application of the genetic algorithm to a hearing loss detection system provided ideas for many subsequent researchers. Fang-zhou Bao et al. (2018) [11] [...]
In reviewing previous studies, we found common problems in detecting hearing loss:
(i) How to extract more information from MRI images?
(ii) How to select a classifier to make the results more robust?
(iii) How to optimize after selecting classifier?
These questions are raised to ensure that we can get better performance in hearing loss detection. To address them, we propose the following solutions in this paper: we use SWE to extract features, classify the images with a single-hidden-layer neural network, and train that network with the BBO algorithm, which performs a global optimization. We explain the chosen model in the following sections.
The remaining parts of this paper are organized as follows: Section 2 presents the background of hearing loss detection. Section 3 provides the data source. Section 4 introduces the basic principle of SWE, the construction of the single-hidden-layer neural network and the optimization principle of the BBO algorithm. Section 5 introduces the experimental results and data analysis. Section 6 gives the conclusion.
[...] and the cost of MRI scans, it is difficult to obtain a large sample of patients with hearing loss. Nevertheless, many researchers have carried out studies on sample sets of similar size and obtained good performance. To reduce the influence of other diseases on our model, we excluded from the sample subjects with known mental or neurological diseases, brain injury (such as tumor or stroke), psychotropic medication, or contraindications to magnetic resonance imaging. The epidemiological investigation and analysis results of the patients are shown in Table 1.
This part mainly investigates the gender, age distribution and other aspects of the disease. The normal features of the inner ear and the image detection standard are shown in Table 2.
[...] [13] used the stationary wavelet transform (SWT) to predict gene expression. The discrete wavelet transform (DWT) and wavelet entropy (WE) are more common methods, but they do not preserve translation invariance.
To address this problem, Wang et al. (2018) [14] introduced a novel feature named stationary wavelet entropy (SWE), which combines the stationary wavelet transform (SWT) with Shannon entropy. Three different numbers of features were used in their experiment, and in all cases the accuracy of SWE was significantly higher than that of the other wavelet features, reaching nearly 100%. Therefore, compared with three traditional features that had been successfully applied in pathological brain detection, SWE has better classification ability.
The one-level SWE uses filters to decompose the image and extracts the entropy of the four sub-bands named LL1, LH1, HL1 and HH1, as shown in Figure 1 (the two convolution operators denote the row-wise and the column-wise filter, and l and h represent the low-pass and the high-pass filters, respectively). The two-level SWE applies another one-level SWE to the LL1 sub-band, as shown in Figure 2, and so on for higher levels. Then we calculate the entropy values. We choose the Shannon entropy (formula 5), evaluated over all elements of a given subband.
Combining the above, the SWE pseudocode is listed in Table 3.
Table 3. Algorithm of Stationary Wavelet Entropy.

Stationary Wavelet Entropy (SWE)
Step A Import the preprocessed MRI images;
Step B Select the best wavelet in the wavelet family;
Step C Choose the decomposition level n;
Step D Perform the stationary wavelet transform (SWT) on the given images;
Step E Generate and record the (3n + 1) wavelet subbands;
Step F Use formula 5 to calculate the entropy over each subband;
Step G Vectorize all the entropy results and output them as the features for the input layer.
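The steps in Table 3 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses the Haar (db1) filter pair instead of db4 to keep the code short, circular padding at the image borders, and an energy-normalized Shannon entropy, since the exact entropy formula is not reproduced here. The function names (`swt2_haar`, `shannon_entropy`, `swe_features`) and the LH/HL naming convention are hypothetical.

```python
import math

def _filter_axis(img, step, high, axis):
    """Undecimated Haar filter along one axis, with circular padding."""
    rows, cols = len(img), len(img[0])
    sign = -1.0 if high else 1.0          # high-pass subtracts, low-pass adds
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if axis == 1:                 # neighbour along the row
                other = img[r][(c + step) % cols]
            else:                         # neighbour along the column
                other = img[(r + step) % rows][c]
            out[r][c] = (img[r][c] + sign * other) / 2.0
    return out

def swt2_haar(img, levels):
    """Return the flat list [LL_n, LH_n, HL_n, HH_n, ..., LH_1, HL_1, HH_1]."""
    details = []
    ll = img
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)           # dilate the filter instead of downsampling
        low_rows = _filter_axis(ll, step, False, 1)
        high_rows = _filter_axis(ll, step, True, 1)
        lh = _filter_axis(low_rows, step, True, 0)
        hl = _filter_axis(high_rows, step, False, 0)
        hh = _filter_axis(high_rows, step, True, 0)
        ll = _filter_axis(low_rows, step, False, 0)   # input to the next level
        details.append((lh, hl, hh))
    subbands = [ll]
    for triple in reversed(details):
        subbands.extend(triple)
    return subbands                       # 3 * levels + 1 subbands, each image-sized

def shannon_entropy(band):
    """Assumed form: entropy of the normalized coefficient energies."""
    energies = [v * v for row in band for v in row]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log2(e / total) for e in energies if e > 0.0)

def swe_features(img, levels=2):
    """Steps D to G: transform, take the entropy of each subband, vectorize."""
    return [shannon_entropy(band) for band in swt2_haar(img, levels)]
```

With `levels=2` the feature vector has 3 × 2 + 1 = 7 entries, one per subband, and every subband keeps the size of the input image because no downsampling takes place.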

Single-hidden-layer feedforward neural network
There are many efficient classifiers that show high accuracy in different fields, such as the backpropagation neural network used for cervical cancer classification [15], the SVM used in numerous real-world applications [16], and deep learning used for the recognition of diseased Pinus trees [17]. In recent years, many bionics-inspired neural network architectures have been proposed, and the multi-layer feedforward neural network (FFNN) is considered one of the fundamental architectures. Deep learning based approaches will also be a trend of future research [18]. However, an FFNN is very time-consuming to train when used for classification or regression [19]. Hence, the single-hidden-layer feedforward neural network (SLFN) becomes a suitable choice [20].
An SLFN consists of an input layer, a hidden layer and an output layer. There is no feedback in the whole network; that is, it forms a directed acyclic graph. Its application in this experiment is shown as [...]. Deep learning approaches [21][22][23][24][25][26][27][28][29] were not used because our dataset is small, whereas deep neural networks require a large dataset. The input layer receives the features extracted by SWE. The learning function is defined over the following quantities for the i-th hidden node: the inner weight connecting the input layer with the i-th hidden node, the outer weight connecting the i-th hidden node with the output node, and the value of the i-th hidden node. To find the optimal parameters, we only need to find the parameters that minimize the resulting error.
The biogeography-based optimization (BBO) algorithm has been widely used since it was proposed by Simon (2008) [32], because of advantages such as low problem dependence, few algorithm parameters and easy implementation. X. Zhang et al. (2020) [33] used an improved Laplacian biogeography-based optimization algorithm for quadratic assignment problems (QAPs), and Q. Niu et al. (2014) [34] used BBO for model parameter estimation of solar and fuel cells. Both achieved high accuracy.
In this paper we select the BBO algorithm because of its strong exploitation capability, integer coding, low time cost, fast convergence and resistance to falling into local optima [35].
In the biogeography-based optimization algorithm, each solution is considered a habitat with a habitat suitability index (HSI). The factors that affect the HSI, such as climate, are called suitability index variables (SIVs). Each habitat has its own immigration rate (λ) and emigration rate (μ). Habitats with a high HSI have a low species immigration rate and a high emigration rate. Conversely, a low HSI means a high species immigration rate and a low emigration rate.
Through the migration of species between habitats, a habitat with a low HSI can obtain more information from habitats with a high HSI, so that the habitats evolve continuously. We mainly use the migration and mutation models of species to optimize the neural network.
In the BBO algorithm, the immigration rate $\lambda_k$ and the emigration rate $\mu_k$ are expressed as

$$\lambda_k = I\left(1 - \frac{k}{n}\right), \qquad \mu_k = \frac{E\,k}{n}$$

where $I$ is the maximum immigration rate, $E$ is the maximum emigration rate, $k$ is the current population quantity, and $n$ is the maximum population size. The relationship is shown in Figure 4. The maximum possible immigration rate of a habitat is $I$, which is reached when the habitat contains zero species. As the number of species increases, the habitat becomes more crowded, fewer species are able to move in successfully, and the immigration rate decreases.
When the habitat holds the maximum number of species it can sustain, the immigration rate is 0.
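As a small worked illustration of the linear migration model, in which immigration falls from I to 0 and emigration rises from 0 to E as a habitat fills up, the snippet below evaluates both rates. The function name `migration_rates` and the argument names `k` (current species count) and `n` (maximum species count) are chosen here for illustration and are assumptions rather than notation taken from the paper.

```python
def migration_rates(k, n, I=1.0, E=1.0):
    """Linear BBO migration model for a habitat holding k of at most n species."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    immigration = I * (1.0 - k / n)   # highest for an empty habitat
    emigration = E * k / n            # highest for a full habitat
    return immigration, emigration

# An empty habitat only receives species, a full one only exports them.
for k in (0, 5, 10):
    print(k, migration_rates(k, 10))
```

When I = E the two lines cross at k = n / 2, where immigration and emigration balance.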
In addition to migration, a habitat can also be changed by a sudden disaster, which is called mutation in biogeography. The pipeline of the BBO algorithm is shown in Figure 5.
As with other optimization algorithms, BBO also incorporates some form of elitism in order to retain the best solutions.
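The interplay of migration, mutation and elitism can be sketched as a compact minimizer. This is a generic BBO loop written under several assumptions, not the exact pipeline of Figure 5: rates are assigned by rank with I = E = 1, habitats are real-valued vectors, mutation redraws a variable uniformly within the bounds, and the name `bbo_minimize` and all default parameter values are hypothetical.

```python
import random

def bbo_minimize(cost, dim, pop_size=20, generations=50, p_mutate=0.05,
                 n_elite=2, bounds=(-1.0, 1.0), seed=0):
    """Minimize `cost` over `dim` real variables; lower cost means higher HSI."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    history = []                                   # best cost per generation
    for _ in range(generations):
        pop.sort(key=cost)                         # best habitat first
        history.append(cost(pop[0]))
        elites = [h[:] for h in pop[:n_elite]]     # saved for elitism
        # Rank-based linear rates: good habitats emigrate more and immigrate less.
        mu = [(pop_size - i) / (pop_size + 1) for i in range(pop_size)]
        lam = [1.0 - m for m in mu]
        new_pop = [h[:] for h in pop]
        for i in range(pop_size):
            for d in range(dim):
                if rng.random() < lam[i]:          # migration into habitat i
                    j = rng.choices(range(pop_size), weights=mu)[0]
                    new_pop[i][d] = pop[j][d]
                if rng.random() < p_mutate:        # mutation ("sudden disaster")
                    new_pop[i][d] = rng.uniform(lo, hi)
        if n_elite:
            new_pop.sort(key=cost)
            new_pop[-n_elite:] = elites            # worst habitats replaced by elites
        pop = new_pop
    pop.sort(key=cost)
    history.append(cost(pop[0]))
    return pop[0], history
```

Because the previous best habitats are copied back after every generation, the best cost recorded in `history` can never get worse, which is the property elitism is meant to guarantee.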

Implementation
Overall, the hearing loss detection system we propose is depicted in Figure 6. The BBO algorithm takes the sample error as the objective function. The flow of the single-hidden-layer neural network optimization based on the BBO algorithm is given in Algorithm 3.
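To make the idea of using the sample error as the objective more concrete, the sketch below treats one habitat as a flat vector holding every weight and bias of a single-hidden-layer network and returns the error that an optimizer such as BBO would minimize. The tanh activation, the mean squared error against one-hot labels, the hidden layer size used in the example and all function names are assumptions made for illustration; the paper's Algorithm 3 may differ.

```python
import math

def n_parameters(n_in, n_hidden, n_out):
    """Length of the flat vector: weights plus one bias per hidden/output node."""
    return n_hidden * (n_in + 1) + n_out * (n_hidden + 1)

def decode(vec, n_in, n_hidden, n_out):
    """Split a flat habitat vector into SLFN weights and biases."""
    it = iter(vec)
    w_in = [[next(it) for _ in range(n_in)] for _ in range(n_hidden)]
    b_in = [next(it) for _ in range(n_hidden)]
    w_out = [[next(it) for _ in range(n_hidden)] for _ in range(n_out)]
    b_out = [next(it) for _ in range(n_out)]
    return w_in, b_in, w_out, b_out

def forward(x, params):
    """Input layer -> tanh hidden layer -> linear output layer."""
    w_in, b_in, w_out, b_out = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_in, b_in)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]

def sample_error(vec, samples, n_in, n_hidden, n_out):
    """Mean squared error between network outputs and one-hot class labels."""
    params = decode(vec, n_in, n_hidden, n_out)
    total = 0.0
    for x, label in samples:
        out = forward(x, params)
        target = [1.0 if c == label else 0.0 for c in range(n_out)]
        total += sum((o - t) ** 2 for o, t in zip(out, target)) / n_out
    return total / len(samples)

# Seven SWE features in, three classes out (HC, LHL, RHL); 4 hidden nodes is arbitrary.
dim = n_parameters(7, 4, 3)
toy_samples = [([0.5] * 7, 0), ([0.1] * 7, 2)]     # made-up feature vectors
print(dim, sample_error([0.0] * dim, toy_samples, 7, 4, 3))
```

An optimizer only needs the vector length `dim` and a callable such as `lambda v: sample_error(v, toy_samples, 7, 4, 3)`, which keeps the network and the search procedure decoupled.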

Measure
In order to avoid overfitting in the experiment, we adopt cross validation. More precisely, we perform 10 runs of 10-fold cross validation. [...] of a certain kind, which can be obtained by dividing the diagonal element by the sum of all elements in its row.
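The row-wise computation described above, dividing a diagonal entry by the sum of its row, can be illustrated on a confusion matrix whose rows are true classes and whose columns are predicted classes. The matrix values below are invented for the example and are not results from this study; the function names are hypothetical.

```python
def sensitivities(cm):
    """Per-class sensitivity: diagonal entry divided by the sum of its row."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]

def overall_accuracy(cm):
    """Sum of the diagonal divided by the sum of all entries."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

# Rows: true HC, LHL, RHL; columns: predicted HC, LHL, RHL (made-up counts).
cm = [[55, 3, 2],
      [2, 56, 2],
      [3, 2, 55]]
print([round(100 * s, 2) for s in sensitivities(cm)])
print(round(100 * overall_accuracy(cm), 2))
```

Repeating this over the runs of a repeated cross validation and taking the mean and standard deviation would give figures in the "mean ± deviation" form reported in the results tables.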

Analysis of Our Method SWE-BBO
In this experiment, we used two-level decomposition and the db4 wavelet to extract features. Moreover, the BBO algorithm was used to optimize the neural network classifier. We give the reasons one by one in the following sections. First, we analyze the overall experimental results. The sensitivities and the overall accuracy (OA) of 10-fold stratified cross validation are listed in Table 4. The sensitivities over the three subject classes are 91.83±3.09%, 92.67±2.38% and 91.17±2.61%, respectively. The overall accuracy is 91.89±0.70%, which is high and has a small deviation. To present the data more vividly, the line chart of this experiment is given in Figure 7. The figure shows that the accuracies in this test are all above 88% and that the overall accuracy is stable at around 91%, which means that our method is very robust. These data confirm the good performance of our model. As a very rigorous discipline, medical science cannot ignore mistakes on individual samples while pursuing high overall accuracy. A small mistake may affect the life of a patient. We can see that the accuracies for the different types of images differ little, and all of them show good performance. But it also illustrates that right hearing loss is easier to identify.
This may be due to the significant increase in ALFF and fALFF values in patients with unilateral SSNHL during the acute phase. This suggests that the resting brain function of patients with unilateral SSNHL may be hyperactivated during the acute period. It has also been suggested that the level of activity in the right medial temporal gyrus could be used as a potential imaging biomarker to assess the degree of auditory impairment.

Comparison of WE and SWE
As mentioned in the previous section, many earlier experiments used the traditional WE. Therefore, we also tested the combination of WE and the BBO algorithm to make the comparison more convincing. Table 5 shows the sensitivity and overall accuracy results of WE-BBO under the same conditions as the SWE-BBO described before. The sensitivities over the three subject classes are 85.17±2.28%, 87.17±2.23% and 86.00±3.53%. The overall accuracy is 86.11±1.08%. In addition, in this section we explain why we chose SWE. [...] in overall accuracy. To make the experimental results clearer, we plotted error bars for the two methods in Figure 8.
Each indicator of SWE-BBO in the figure is higher than that of WE-BBO and occupies the upper part of the y-value range.
Therefore, we chose the one with better performance. This may be because the SWT, on which SWE is built, provides more information than the WT, on which WE is built.

Optimal wavelet selection
In this experiment, we fix the decomposition level at 2, for reasons given in the following section, and choose db4 from the Daubechies family, the most popular wavelet family. In this section we run experiments with different wavelets to show why our choice was optimal. The experimental results of db1, db2 and db3 are shown in Table 6. We compared the data in this section with the experimental results in Table 4 (db4) and obtained Figure 9, which clearly shows the superiority of the db4 wavelet we chose.

Optimal Decomposition Level
In this section, we present the experiments that led us to choose the 2-level decomposition. In order to find the optimal decomposition level, we tested the accuracy and [...] In Figure 10, we can observe that the sensitivity and overall accuracy peak at 2-level SWE and then gradually decline. Therefore, we chose the 2-level decomposition in our experiment. This is not hard to understand: 2-level SWE offers more information through seven sub-bands, each of which has the same size as the original brain image.

Comparison to State-of-the-art Approaches
We compared the model accuracy of this experiment with other current advanced methods: WE-GA [9], HMI [36] and SVM-PSO [10]. The comparison results are shown in Table 8. HMI yielded an OA of 90.22±0.95%, WE-GA yielded an OA of 77.47±1.17%, SVM-PSO yielded an OA of 81.11±1.34% and SWE-BBO yielded an OA of 86.17±0.41%. It is obvious from the numerical values that our method has a good performance.

Conclusions
In this paper, we proposed a system for detecting hearing loss that combines SWE with a single-hidden-layer neural network and optimizes the network with the BBO algorithm. The overall accuracy over HC, LHL and RHL is 91.89±0.70%. We also analyzed the traditional model, explained the reasons for our choices, and confirmed through the accuracy results that decomposition level 2, the db4 wavelet and SWE are the optimal settings. In the future we will explore more effective neural network search methods to detect pathological images. In addition, we will actively seek more effective image preprocessing methods in order to achieve higher accuracy.