Diagnosis of cardiac arrhythmia using Swarm-intelligence based Metaheuristic Techniques: A comparative analysis

INTRODUCTION: Heart diseases are prominent human disorders that significantly affect the lifestyle and lives of patients. Cardiac arrhythmia (heart arrhythmia) is a critical heart disorder that concerns the rate or rhythm of the heartbeat. ECG (electrocardiogram) signals are commonly used in the diagnosis of this disorder. OBJECTIVES: In this manuscript, an effort has been made to employ and examine the performance of emerging Swarm Intelligence (SI) techniques in finding an optimal set of features for cardiac arrhythmia diagnosis. METHODS: A standard benchmark UCI dataset comprising 279 attributes and 452 instances has been considered. Five different SI-based meta-heuristic techniques, viz. binary Grey Wolf Optimizer (bGWO), Ant Lion Optimization (ALO), Butterfly Optimization Algorithm (BOA), Dragonfly Algorithm (DA), and Satin Bowerbird Optimization (SBO), have been employed. Additionally, five novel chaotic variants of SBO have been designed to solve the feature selection problem for diagnosing cardiac arrhythmia. Different performance metrics, namely accuracy, fitness value, optimal set of features and execution time, have been computed. CONCLUSION: The experiments show that, in terms of accuracy and fitness value, SBO outperformed the other SI algorithms, viz. bGWO, DA, BOA, and ALO. Additionally, BOA and ALO seem to be the best fit when the emphasis is on dimension size only.


Introduction
Heart diseases are pre-eminent and critical global health problems that have significantly affected individuals and families across the globe. Heart attack, coronary artery disease, hypertrophic cardiomyopathy, arrhythmia, congenital heart disease, heart failure and stroke are some of the major cardiac disorders [1]. Cardiac arrhythmia is a heart disease related to the heart rate or heart rhythm. It is caused by the malfunctioning of the heart's electrical impulses. A person with cardiac arrhythmia may have a heartbeat that is too slow or too fast. Tachycardia, bradycardia, supraventricular arrhythmia and ventricular arrhythmia are the major types of arrhythmia. Tachycardia occurs when the heart rate is more than 100 bpm (beats per minute), and bradycardia occurs when it is less than 60 bpm; they refer to fast and slow heart rhythms respectively [2]. In some cases, severe arrhythmia causes dizziness, chest pain, fainting, breathlessness, fatigue, profuse sweating, etc. [3]. In extreme conditions, it may become life-threatening and lead to cardiac arrest and stroke [4]. Conditions such as coronary artery disorder, high blood pressure, thyroid disease, diabetes and electrolyte imbalance, as well as stress, overuse of drugs and smoking, may lead to this disorder.
Swarm Intelligence (SI) is a term coined by Beni and Wang in the late 1980s [5]. SI is based on five major principles, viz. the proximity principle, the quality principle, the principle of diverse response, the principle of stability, and the principle of adaptability [6]. SI is a nature-inspired paradigm that relies on the collective social behaviour of different species, viz. insects, flocks of birds, schools of fish, mammals, etc. Various computing methods such as Naïve Bayes, Random Forest, Decision Tree and SVM, deep learning techniques (CNN, RBM, RNN, etc.), swarm intelligence techniques (Ant Colony, Honey Bee, Firefly Algorithm, Grey Wolf Optimization Algorithm, etc.) and fuzzy logic have been used in the prognosis of a wide range of human disorders such as diabetes [7][8], cancer [9][10], psychological [11][12], neurological [13][14] and heart-related problems [15][16].
Features play an important role in early and precise disease diagnosis. A feature is an attribute that represents an important characteristic of the dataset. Feature selection is a key step of data pre-processing that extracts a subset of features from the original dataset. It is the process of picking out the significant features related to an optimization problem [17]. Feature selection aims to retain the best features and to reduce duplicate ones. It has been widely used in diverse fields, viz. bioinformatics, image processing, remote sensing, intrusion detection, text processing, disease diagnosis, etc. [18]. Feature selection methods are categorized as wrapper, filter, and hybrid methods [19]. The filter method selects features using statistical approaches; it is independent of the classification algorithm and has a reasonable cost. The wrapper method is highly complex; it finds the best set of features by deciding which features are to be added to or removed from the candidate subset. Hybrid methods combine the best traits of the wrapper and filter methods. Information gain, Gini index, chi-square, correlation and gain ratio are some of the most commonly used attribute selection measures [20].
Very few authors have worked on the use of feature selection for cardiac arrhythmia. The present work is novel in that no earlier study has explored the performance of emerging SI techniques (bGWO, BOA, DA, ALO, SBO, CSBO_1, CSBO_2, CSBO_3, CSBO_4, and CSBO_5) for finding the optimal set of features to solve the cardiac arrhythmia classification problem. Here, SI-based techniques have been employed to determine the optimal set of features used to diagnose cardiac arrhythmia. Different performance metrics, viz. accuracy, number of dimensions, execution time, and fitness value, have been computed and analyzed.
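To make the wrapper idea concrete, the sketch below scores a candidate feature subset by the hold-out error of a classifier trained only on the selected columns. It is a minimal illustration, not the evaluation used in this work: the 1-nearest-neighbour classifier, the error-rate fitness and the names `knn1_predict` and `wrapper_fitness` are assumptions introduced here.

```python
def knn1_predict(train_x, train_y, sample):
    """Return the label of the training row closest to `sample` (squared Euclidean)."""
    nearest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], sample)),
    )
    return train_y[nearest]


def wrapper_fitness(mask, train_x, train_y, test_x, test_y):
    """Hold-out error rate of 1-NN restricted to the features where mask == 1.

    Lower is better. An empty subset gets the worst possible score (1.0).
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0

    def project(row):
        # Keep only the selected feature columns.
        return [row[i] for i in selected]

    reduced_train = [project(row) for row in train_x]
    errors = sum(
        knn1_predict(reduced_train, train_y, project(row)) != label
        for row, label in zip(test_x, test_y)
    )
    return errors / len(test_y)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
train_x = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [0.9, 0.9]]
train_y = [0, 0, 1, 1]
test_x = [[0.05, 1.1], [0.95, 5.0]]
test_y = [0, 1]

error_informative = wrapper_fitness([1, 0], train_x, train_y, test_x, test_y)
error_noise = wrapper_fitness([0, 1], train_x, train_y, test_x, test_y)
```

A swarm-based optimizer would call `wrapper_fitness` once per candidate mask and keep the mask with the lowest value.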
The rest of the manuscript is organized as follows. Related works and the methodology used in this research are presented in the second and third sections. The fourth section presents the results and discussion. Finally, the conclusion and future work are given in the fifth section.

Related Works
Nowadays, a stupendous volume of data is accessible in every domain and needs to be carefully and effectively mined. Feature selection plays a significant role in extracting meaningful information from mountains of data using a minimal subset of attributes. It is a challenging optimization problem that involves selecting optimal features from a dataset so that a better predictive rate can be achieved. In other words, it is a process that eliminates irrelevant and redundant features for a better understanding of datasets [21].
Different deterministic and stochastic techniques have been used to solve the feature selection problem. Filter and wrapper are the two basic approaches. Besides these techniques, several meta-heuristic algorithms have also been used effectively to obtain an optimal set of features. In the last few decades, more than a hundred SI-based meta-heuristic techniques have been designed. The major aim of this study is to present a comprehensive analysis of SI-based meta-heuristic techniques used for feature selection problems. Table 1 shows that swarm intelligence approaches have been successfully employed in the field of disease diagnosis. However, no one has explicitly determined the performance of bGWO, DA, BOA, ALO, SBO, and its chaotic variants in solving the cardiac arrhythmia classification problem.
This section of the manuscript briefly presents the basic principles of five swarm-based meta-heuristics, viz. butterfly optimization, dragonfly algorithm, grey wolf optimization, ant lion optimization and satin bowerbird optimization, together with chaotic maps (Chebyshev map, Circle map, Sinusoidal map, Gauss map and Tent map).

Butterfly Optimization Algorithm:
BOA relies on the foraging strategy of butterflies. Butterflies possess different senses that help them survive. They use their sense of smell to look for food (nectar). Every butterfly produces a fragrance with an associated intensity, and this intensity corresponds to a fitness value. Butterflies with low intensity are attracted towards butterflies with a high intensity value [51]. In BOA, butterflies propagate information by sensing the fragrance. If a butterfly cannot sense any fragrance, it moves in a random direction; otherwise, it moves towards the best butterfly, which results in a global search. In the global search, the movement of butterflies towards the fittest butterfly is represented by the following equation:

$$x_i^{t+1} = x_i^t + (r^2 \times g^* - x_i^t) \times f_i$$

where $x_i^t$ denotes the solution vector $x_i$ of the ith butterfly in iteration t, $g^*$ denotes the best solution found among the solutions of the current iteration, $f_i$ represents the fragrance of the ith butterfly and r is a random number in [0, 1]. In the local search, the equation used is:

$$x_i^{t+1} = x_i^t + (r^2 \times x_j^t - x_k^t) \times f_i$$

where $x_j^t$ and $x_k^t$ represent the jth and kth butterflies of the same swarm.
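The global and local moves described above can be sketched in Python as follows. This is a minimal sketch of the commonly published BOA update, not the code used in this study: the switch probability `p = 0.8`, the use of separate random draws for the switch and for r, and the function name `boa_step` are assumptions, and the fragrance is passed in rather than computed.

```python
import random


def boa_step(x_i, g_best, x_j, x_k, fragrance, p=0.8, rng=random):
    """One BOA position update for a single butterfly (positions are lists of floats).

    With probability p the butterfly moves towards the best solution g_best
    (global search); otherwise it moves relative to two other butterflies
    x_j and x_k of the same swarm (local search).
    """
    r = rng.random()
    if rng.random() < p:
        # Global search: x_i + (r^2 * g* - x_i) * f_i
        return [xi + (r * r * g - xi) * fragrance for xi, g in zip(x_i, g_best)]
    # Local search: x_i + (r^2 * x_j - x_k) * f_i
    return [
        xi + (r * r * xj - xk) * fragrance
        for xi, xj, xk in zip(x_i, x_j, x_k)
    ]


# Example call with arbitrary illustrative values.
new_position = boa_step(
    [0.2, 0.4], [1.0, 1.0], [0.1, 0.1], [0.3, 0.3], fragrance=0.05,
    rng=random.Random(0),
)
```

A zero fragrance leaves the butterfly where it is, which is a quick sanity check on the update rule.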

Dragonfly Algorithm:
DA is a novel optimization approach proposed by Seyedali Mirjalili in 2015 [52]. It is based on the social behaviour of dragonflies. Around 3000 distinct species of dragonflies exist in the world. Migration and hunting are the major objectives of dragonflies. Based on these objectives, dragonflies exhibit dynamic and static swarm behaviour respectively [52][53]. In a static swarm, dragonflies create small groups and fly in different directions, which corresponds to exploration. In a dynamic swarm, dragonflies form larger groups and fly in one particular direction; this corresponds to exploitation. Separation, alignment, cohesion, attraction and distraction are the major factors used for updating the positions of dragonflies in a swarm. These factors are mathematically represented by the following equations [52][53][54]. Separation is evaluated as:

$$S_i = -\sum_{j=1}^{N} (X - X_j)$$

where $X$ and $X_j$ represent the positions of the current and the jth neighbouring individual, and N denotes the number of neighbouring individuals. Alignment is calculated as:

$$A_i = \frac{\sum_{j=1}^{N} V_j}{N}$$

where $V_j$ is the velocity of the jth neighbouring individual. Cohesion is calculated as:

$$C_i = \frac{\sum_{j=1}^{N} X_j}{N} - X$$

The attraction of the dragonflies towards the food source is calculated as:

$$F_i = X^{+} - X$$

where $X^{+}$ represents the position of the food source. The distraction of the dragonflies away from their enemies is calculated as:

$$E_i = X^{-} + X$$

where $X^{-}$ represents the position of the enemy.
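The five factors can be computed directly from the positions involved, as the following sketch illustrates. It follows the commonly published form of the Dragonfly Algorithm and is only an illustration: the function name `dragonfly_factors` and the use of neighbour velocities for alignment are assumptions stated here, and the weighting of the factors into a step vector is omitted.

```python
def dragonfly_factors(x, neighbours, velocities, food, enemy):
    """Return (separation, alignment, cohesion, attraction, distraction).

    x          -- position of the current dragonfly
    neighbours -- positions of its N neighbouring individuals
    velocities -- velocities of the same N neighbours
    food       -- position of the food source (X+)
    enemy      -- position of the enemy (X-)
    """
    n = len(neighbours)
    dims = range(len(x))
    # Separation: negative sum of offsets from every neighbour.
    separation = [-sum(x[d] - nb[d] for nb in neighbours) for d in dims]
    # Alignment: mean neighbour velocity.
    alignment = [sum(v[d] for v in velocities) / n for d in dims]
    # Cohesion: pull towards the neighbourhood centre of mass.
    cohesion = [sum(nb[d] for nb in neighbours) / n - x[d] for d in dims]
    # Attraction towards food, distraction away from the enemy.
    attraction = [food[d] - x[d] for d in dims]
    distraction = [enemy[d] + x[d] for d in dims]
    return separation, alignment, cohesion, attraction, distraction


# Illustrative values only.
s, a, c, f, e = dragonfly_factors(
    x=[1.0, 1.0],
    neighbours=[[0.0, 0.0], [2.0, 4.0]],
    velocities=[[1.0, 0.0], [3.0, 2.0]],
    food=[5.0, 5.0],
    enemy=[-1.0, 0.0],
)
```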

Grey Wolf Optimization:
GWO, proposed in 2014 by Seyedali Mirjalili et al., imitates the hunting nature and leadership behaviour of grey wolves. Grey wolves belong to the Canidae family [17][55]. Based on their hierarchy, grey wolves are categorized as alpha (α), beta (β), delta (δ) and omega (ω). Alphas are at the top of the hierarchy as they are the dominant ones: they take all decisions and the rest of the wolves obey them. Beta wolves help the alpha in decision making. Delta wolves assist both alpha and beta wolves. Omega wolves are the lowest ranked and follow the instructions of α, β and δ. In the hunting process, the grey wolves surround their prey, which is mathematically represented by the following equations [55][56][57]:

$$\vec{D} = |\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)|$$

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}$$

where t denotes the current iteration, and $\vec{X}_p$ and $\vec{X}$ represent the position vectors of the prey and a grey wolf respectively. $\vec{A}$ and $\vec{C}$ are the coefficient vectors and are calculated as:

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad \vec{C} = 2 \cdot \vec{r}_2$$

where $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1] and the components of $\vec{a}$ are linearly decreased from 2 to 0 over the iterations.

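Figure 4. Position updating in GWO [55]
In GWO, the wolves generally hunt their prey in a group of 5 to 12 called a pack. The hunting process is led by the alpha, beta, and delta wolves. The alpha is considered the best candidate solution, and the beta and delta wolves are also assumed to have better knowledge about the potential position of the prey. So, the three best solutions achieved so far are saved, and the other search agents are required to update their locations according to the locations of these best search agents. This process is implemented by using the following equations:

$$\vec{D}_\alpha = |\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}|, \quad \vec{D}_\beta = |\vec{C}_2 \cdot \vec{X}_\beta - \vec{X}|, \quad \vec{D}_\delta = |\vec{C}_3 \cdot \vec{X}_\delta - \vec{X}|$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}$$

where $\vec{X}_\alpha$, $\vec{X}_\beta$ and $\vec{X}_\delta$ are the positions of the alpha, beta and delta wolves. In the binary version of grey wolf optimization, called bGWO, the solution space is restricted to {0, 1}.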
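The leader-guided update and a binary restriction can be sketched as follows. This is a hedged illustration rather than the implementation used here: `gwo_update` follows the commonly published continuous GWO rule, while the sigmoid transfer function in `to_binary` is only one frequently used way of mapping positions to {0, 1} and is an assumption, as are all function names.

```python
import math
import random


def gwo_update(x, alpha, beta, delta, a, rng=random):
    """Move one wolf towards the average of three leader-guided positions."""
    new_x = []
    for d in range(len(x)):
        guided = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(), rng.random()
            coeff_a = 2.0 * a * r1 - a           # A = 2a * r1 - a
            coeff_c = 2.0 * r2                   # C = 2 * r2
            dist = abs(coeff_c * leader[d] - x[d])
            guided.append(leader[d] - coeff_a * dist)
        new_x.append(sum(guided) / 3.0)
    return new_x


def to_binary(x, rng=random):
    """Map a continuous position to a 0/1 feature mask (assumed sigmoid transfer).

    Intended for positions roughly in [0, 1]; very negative values would
    overflow math.exp.
    """
    bits = []
    for value in x:
        s = 1.0 / (1.0 + math.exp(-10.0 * (value - 0.5)))
        bits.append(1 if s >= rng.random() else 0)
    return bits


rng = random.Random(7)
position = gwo_update([0.3, 0.9], [0.8, 0.1], [0.7, 0.2], [0.6, 0.3], a=1.0, rng=rng)
mask = to_binary(position, rng)
```

When `a` reaches 0 the coefficient A vanishes, so the wolf lands exactly on the mean of the three leaders, which matches the exploitation phase described above.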

Satin-bowerbird Optimization:
Satin Bowerbird Optimization (SBO) is a novel optimization algorithm proposed by Moosavi and Bardsiri in 2017 [58][59]. The way male bowerbirds attract female birds for procreation is the key idea behind this meta-heuristic. Satin bowerbirds are insect- and fruit-eating passerines native to the mesic forests and rainforests of eastern Australia. Male birds build a gazebo-like structure called a bower and decorate it with great zest to woo the female birds. Male bowerbirds have been observed competing to make their bowers more attractive than those of others, which provokes them to demolish the bowers of adjacent males. Male bowerbirds mostly use differently coloured (yellow, white, purple) objects, which are normally seen placed at the doorway of their bower [59][60]. The fitness value is evaluated using the following equation:

$$fit_i = \begin{cases} \dfrac{1}{1 + f(x_i)}, & f(x_i) \ge 0 \\ 1 + |f(x_i)|, & f(x_i) < 0 \end{cases}$$

where $f(x_i)$ represents the value of the cost function of the ith bower or ith position.
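The fitness transformation and the selection probability of each bower, as described for SBO in [59], can be sketched as below. This is a minimal sketch: the function names `sbo_fitness` and `sbo_probabilities` and the example cost values are assumptions made for illustration, and the position update and mutation stages are not shown.

```python
def sbo_fitness(cost):
    """Turn a cost value f(x_i) into a positive fitness (larger is better)."""
    if cost >= 0:
        return 1.0 / (1.0 + cost)
    return 1.0 + abs(cost)


def sbo_probabilities(costs):
    """Probability of each bower being selected, proportional to its fitness."""
    fits = [sbo_fitness(c) for c in costs]
    total = sum(fits)
    return [fit / total for fit in fits]


# Hypothetical costs of three bowers: a lower cost gives a higher probability.
probs = sbo_probabilities([0.0, 1.0, 3.0])
```

Because every fitness is strictly positive, the probabilities are well defined and sum to one.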

Ant Lion Optimization Algorithm
Ant Lion Optimizer (ALO) is a swarm-intelligence based algorithm introduced by Mirjalili [62]. Antlions belong to a family of insects called Myrmeleontidae [63]. ALO imitates the hunting strategy of antlions. Antlions catch their prey, specifically ants, by digging cone-shaped pits in the sand. After digging a pit, the antlion hides at its bottom and waits for prey to be trapped. The size of the pit depends on the antlion's level of hunger and the shape of the moon; antlions dig larger pits when they are hungrier. The behaviour of antlions and ants is depicted mathematically as given below [62][64][65]. Ants move randomly in search of food, and their random walk is represented by the following equation:

$$X(t) = [0, \mathrm{cumsum}(2r(t_1) - 1), \mathrm{cumsum}(2r(t_2) - 1), \ldots, \mathrm{cumsum}(2r(t_n) - 1)]$$

where cumsum evaluates the cumulative sum, n denotes the maximum number of iterations, t represents the step of the random walk and r(t) is the stochastic function defined as follows:

$$r(t) = \begin{cases} 1, & rand > 0.5 \\ 0, & rand \le 0.5 \end{cases}$$

where rand is a uniformly distributed random number in [0, 1]. The objective is to keep the random walks within the search space, so they are normalized using the following equation:

$$X_i^t = \frac{(X_i^t - a_i)(d_i^t - c_i^t)}{b_i - a_i} + c_i^t$$

where $a_i$ and $b_i$ denote the minimum and maximum of the random walk of the ith variable respectively, and $c_i^t$ and $d_i^t$ denote the minimum and maximum of the ith variable at the tth iteration. The trapping of ants in the pits of antlions is mathematically represented as follows:

$$c_i^t = Antlion_j^t + c^t, \qquad d_i^t = Antlion_j^t + d^t$$

where $c^t$ and $d^t$ are the minimum and maximum of all variables at the tth iteration and $Antlion_j^t$ represents the position of the selected jth antlion at the tth iteration. When the antlions come to know that ants are trapped in the pit, they start throwing sand outwards from the middle of the pit. This process is mathematically formulated as:

$$c^t = \frac{c^t}{I}, \qquad d^t = \frac{d^t}{I}$$

where I is a ratio. The capture of the ants by the antlions and the rebuilding of the pits to catch new prey are described by the following equation:

$$Antlion_j^t = Ant_i^t \quad \text{if } f(Ant_i^t) > f(Antlion_j^t)$$

where $Antlion_j^t$ and $Ant_i^t$ represent the positions of the selected jth antlion and the ith ant at iteration t.
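The ant's random walk and its rescaling into the current bounds can be sketched in a few lines. This is an illustrative sketch under the usual ALO formulation: the names `random_walk` and `normalize_walk`, the seed and the bounds in the example are assumptions, and the min-max form with b - a in the denominator is taken from the definitions of a and b given above.

```python
import random
from itertools import accumulate


def random_walk(n_steps, rng=random):
    """Cumulative sum of +1/-1 steps, starting at 0 (length n_steps + 1)."""
    steps = [1 if rng.random() > 0.5 else -1 for _ in range(n_steps)]
    return [0] + list(accumulate(steps))


def normalize_walk(walk, lower, upper):
    """Min-max rescale a walk so that it stays inside [lower, upper]."""
    a, b = min(walk), max(walk)
    if a == b:
        # Degenerate walk (no steps): place every point at the lower bound.
        return [lower for _ in walk]
    return [(w - a) * (upper - lower) / (b - a) + lower for w in walk]


walk = random_walk(50, random.Random(42))
scaled = normalize_walk(walk, lower=0.0, upper=1.0)
```

Shrinking `lower` and `upper` around the selected antlion over the iterations reproduces the sliding-towards-the-pit behaviour described in the text.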

Chaotic Maps
Chaos theory is a branch of mathematics that deals with nonlinear complex problems whose behaviour is very difficult to predict. Chaos refers to the seemingly random behaviour of nonlinear dynamic systems. Some of the chaotic maps employed in this manuscript are listed in Table 2 [66][67], where a = 0.5 and b = 0.2.
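As an illustration of how such a map produces a deterministic yet irregular sequence, the sketch below iterates the circle map in its commonly used form. It is not a reproduction of Table 2: the formula, the value b = 0.2 and the function name `circle_map_sequence` are assumptions stated here, with a = 0.5 taken from the text.

```python
import math


def circle_map_sequence(x0, n, a=0.5, b=0.2):
    """Return n iterates of the circle map starting from x0.

    Assumed form: x_{k+1} = (x_k + b - (a / (2*pi)) * sin(2*pi*x_k)) mod 1
    """
    values = []
    x = x0
    for _ in range(n):
        x = (x + b - (a / (2.0 * math.pi)) * math.sin(2.0 * math.pi * x)) % 1.0
        values.append(x)
    return values


sequence = circle_map_sequence(x0=0.7, n=100)
```

In a chaotic variant of an optimizer, values from such a sequence typically stand in for uniformly drawn random numbers; how the variants in this work use them is not detailed in this section.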

Results and Discussions
A standard UCI cardiac arrhythmia dataset has been taken for experimentation. It contains 452 instances and 279 attributes. Out of the 279 attributes, 206 are linear-valued and the rest are nominal. The data has been pre-processed to remove the missing values. The dataset is examined to find a set of optimal features required to determine the state of arrhythmia. The binary output reflects the presence or absence of cardiac arrhythmia.
All the experiments have been conducted on a machine with an i3 CPU and 2 GB of RAM. All the algorithms have been simulated in the MATLAB 2016 environment. Initially, all the search agents have been assigned random locations in the search space. The upper and lower bounds have been set to 1 and 0 for the cardiac arrhythmia dataset. The number of search agents is set to 10. For classification problems, the solution with the fewest features is considered to be optimal. To overcome the bias of stochastic techniques, each algorithm has been individually executed for twenty different runs and the average of the results has been taken.
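The repeated-run protocol can be sketched as below. This is only an illustration: `run_once` is a hypothetical callable standing in for one full execution of an optimizer, the seeds are arbitrary, and the use of the sample standard deviation is an assumption about how the spread is computed.

```python
import random
import statistics


def summarize_runs(run_once, n_runs=20, seed=0):
    """Execute a stochastic experiment n_runs times and summarize the results.

    run_once must accept a random.Random instance and return one number
    (for example the accuracy of the best feature subset found in that run).
    """
    results = [run_once(random.Random(seed + k)) for k in range(n_runs)]
    return {
        "avg": statistics.mean(results),
        "max": max(results),
        "min": min(results),
        "std": statistics.stdev(results),  # sample standard deviation (assumed)
    }


# Stand-in experiment: a noisy score between 0.5 and 0.7 (illustrative only).
summary = summarize_runs(lambda rng: 0.5 + 0.2 * rng.random())
```

Seeding each run separately keeps the twenty runs independent while making the whole experiment repeatable.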
The SI-based meta-heuristic algorithms, viz. DA, BOA, ALO, SBO, and bGWO, have been employed to find an optimal set of features required for cardiac arrhythmia diagnosis. To examine the performance of the different SI algorithms, various metrics such as accuracy, dimension size, fitness value and execution time have been computed and analyzed. In the equations of these parameters, M denotes the number of runs and g* represents the optimal solution. The parameters for execution have been set empirically. The results obtained using DA, ALO, SBO, BOA, bGWO and the chaotic versions of SBO have been statistically examined. The meta-heuristic techniques that provide higher classification accuracy are considered more promising than the others. However, for fitness value, number of dimensions, and execution time, the smallest value corresponds to the better result. The results obtained during 200 experiments are presented in Tables 4 and 5. The best values obtained for accuracy, fitness value, number of dimensions and execution time on the cardiac arrhythmia dataset, presented in Table 4 and Table 5, have been underlined and italicized. Here, Avg, max, min and std correspond to the average, maximum, minimum and standard deviation of the values.
The maximum classification accuracy accomplished using simple SBO is 68%. SBO also performs well in terms of fitness value and number of dimensions. Additionally, in terms of execution speed, the performance of CSBO_2 is outstanding. Furthermore, the minimum accuracy of SBO is 5.17%, 5.17%, 5.17%, 7.01% and 5.17% better than that of CSBO_1, CSBO_2, CSBO_3, CSBO_4 and CSBO_5 respectively. Likewise, the average and maximum accuracy of SBO are (3.22%, 3.03%), (4.91%, 7.93%), (3.22%, 3.03%), (4.91%, 3.03%) and (4.91%, 3.03%) better than those of CSBO_1, CSBO_2, CSBO_3, CSBO_4 and CSBO_5 respectively.
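The percentage gaps quoted above appear consistent with a relative difference between two accuracy values, although the text does not state the formula. A small sketch under that assumption follows; the 0.66 baseline is an illustrative value chosen for the example, not a figure reported here, and `relative_improvement` is a hypothetical helper name.

```python
def relative_improvement(best, other):
    """Percentage by which `best` exceeds `other`, relative to `other`."""
    return (best - other) / other * 100.0


# SBO's maximum accuracy of 0.68 against an assumed competitor value of 0.66.
gap = relative_improvement(0.68, 0.66)
```

Under this reading, a competitor with a maximum accuracy of 0.66 would trail SBO by roughly three percent.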

Figure 6. Variation of the rate of accuracy and number of dimensions of SI-based meta-heuristics
The accuracy and the number of dimensions obtained using the ten distinct SI meta-heuristic techniques during twenty different runs are depicted in Figure 6.

Conclusion
Cardiac arrhythmia is one of the critical heartbeat-related human disorders and may lead to other serious heart-related problems if it is not diagnosed and treated on time. A standard UCI dataset (ECG signals) comprising 452 individuals has been explored in this research work. In this manuscript, five emerging swarm intelligence based meta-heuristic techniques and chaotic variants of SBO have been employed as feature selection techniques and their results have been compared. Five distinct variants of SBO have been created by hybridizing SBO with different chaotic maps, viz. the circle map, Chebyshev map, sinusoidal map, tent map and Gauss map. It is found that SBO outperformed all the other SI approaches as well as its chaotic variants in terms of accuracy and fitness value. Furthermore, the minimum accuracy of SBO is 7.01%, 8.92%, 12.96% and 5.17% better than that of bGWO, DA, BOA and ALO respectively. Likewise, the average and maximum accuracy of SBO are (3.22%, 1.49%), (4.91%, 4.61%), (6.66%, 6.25%) and (3.22%, 3.03%) better than those of bGWO, DA, BOA and ALO respectively. In the future, more chaotic functions can be applied to the cardiac arrhythmia dataset as feature selection approaches. Additionally, the use of these meta-heuristic techniques in the diagnosis of other cardiac disorders may also be explored.

Figure 3. Position updating principles of dragonflies [52]

Male bowerbirds use different kinds of materials for building attractive bowers, viz. twigs, flowers, brown-coloured snail shells, feathers, drinking straws, etc. [59][61]. Female bowerbirds choose their breeding partner after visiting numerous bowers. Based on the life of satin bowerbirds, the SBO algorithm is organized into various stages, viz. generation of a set of random bowers, probability calculation for each member of the population, elitism, calculation of each bower's new position, and mutation. The probability is evaluated from the fitness values using the following equation [59][61]:

$$Prob_i = \frac{fit_i}{\sum_{n=1}^{nb} fit_n}$$

where nb represents the number of bowers and $fit_i$ corresponds to the fitness value of the ith solution.

EAI Endorsed Transactions on Pervasive Health and Technology 05 2020 -09 2020 | Volume 6 | Issue 22 | e7

Figure 5. Ant's random walk inside antlion's trap [62]

The chaotic sequence values satisfy $x_n \in (0, 1)$.

The experimental analysis reveals that, in comparison to its chaotic variants, the original SBO is found to be more suitable for classification of the cardiac arrhythmia dataset.

Table 1. Related works
Authors | Domain | Purpose | SI-based Technique Used
da R.H. et al. [33] | Breast Cancer | Feature Selection and Classification | ANN + Fuzzy Logic + GA

Table 4. Statistical Analysis of parameters

Table 4 shows that the accuracy ranges from 0.54 to 0.68. Likewise, the fitness values lie between 0.32 and 0.46. Moreover, as far as the accuracy and fitness values for cardiac arrhythmia are concerned, SBO outperformed the other SI algorithms, viz. bGWO, DA, BOA and ALO. If only the dimension size is of utmost importance, the BOA and ALO algorithms take priority. The execution time ranges from 22.95 to 152.81. In terms of execution time, bGWO performs best.

Table 5. Statistical Analysis of SBO and its chaotic versions.

Table 5 depicts the comparison of SBO with its chaotic versions, which are designed using five different chaotic functions (Chebyshev map, Circle map, Sinusoidal map, Gauss map and Tent map).