Absence seizure detection classifying matching pursuit features of EEG signals

INTRODUCTION: Absence seizures are characterized by a typical generalized spike-and-wave electroencephalographic (EEG) pattern at around 3 Hz. Automatic identification of this pattern, and hence of the corresponding seizure, provides valuable information for building a reliable clinical picture of the patient and for treatment planning. In this paper, we propose a method for absence seizure detection based on the decomposition of EEG signals via the Matching Pursuit (MP) algorithm. METHODS: Based on the ictal EEG semiology, MP features able to track the ictal pattern were extracted. The analysis was performed on a clinical dataset of 8 pediatric patients (4 females, 4 males) suffering from active absence epilepsy, containing 123 absence seizures in total. Automatic classification schemes based on Machine Learning techniques were employed to categorize the MP patterns into non-ictal and ictal states. RESULTS: The seizure detection system achieved a time-window-based discrimination accuracy of 97.3% using a Support Vector Machine (SVM) classifier and 10-fold cross-validation, thereby achieving good state-of-the-art performance. DISCUSSION: Compared to other popular spectral analysis methods, Matching Pursuit appears to be a robust and efficient method for absence seizure detection on EEG signals, and our results indicate that the MP features proposed in this work can be used effectively in seizure detection. Received on 30 June 2020; accepted on 01 October 2020; published on 13 October 2020


Introduction
Epilepsy is the fourth most common neurological disorder across all ages [1] and the most common chronic neurological disorder in childhood in many countries [2]. According to the World Health Organization, epilepsy was estimated to affect around 50 million people worldwide in 2019 [1]. In 2017, the prevalence and incidence of epilepsy were estimated at 6.38 and 0.61 per 1000 persons, respectively [3,4].
After several seizure classification approaches dating from 1970, the ILAE introduced the ILAE 2017 classification scheme [5], which remains in use until now. According to this classification, seizures are divided into seizures with focal onset (limited to one hemisphere of the brain), seizures with generalized onset (both hemispheres of the brain), seizures of unknown onset (which may be reclassified as focal or generalized when new information is available) and unclassified seizures.

* Corresponding author. Email: agiannakaki1@isc.tuc.gr
Childhood Absence Epilepsy (CAE) presents mainly generalized non-motor seizures (absence seizures). In CAE the seizures start between 3 and 8 years of age (peak ∼6-7 years) [6]. During the ictal interval, patients present brief staring spells during which they are not aware or responsive. Children with childhood absence epilepsy usually develop normally; however, they may have higher rates of attention problems. They can also present another epileptic syndrome, such as juvenile absence epilepsy, where absence seizures can be combined with generalized tonic-clonic seizures. The duration of each seizure is about 10 sec (range 4-20 sec), and the ictal EEG is characterized by a spike and slow-wave discharge pattern with a frequency of around 3 Hz that gradually declines during the seizure [6].

EAI Endorsed Transactions on Bioengineering and Bioinformatics
Research Article

Epilepsy diagnosis is performed mainly with the EEG, taking into account the patient's clinical picture. Various techniques complement EEG to provide a more reliable clinical picture, such as magnetoencephalography (MEG), electrocardiography (ECG), magnetic resonance imaging (MRI), functional MRI (fMRI), positron emission tomography (PET), computed tomography (CT), SPECT and others [7]. However, electroencephalography (EEG) remains the most widely adopted clinical technique for seizure diagnosis, detection, and anticipation [8]. Both scalp and intracranial EEG have proved very useful in clinical practice, and EEG has thus been established as the main clinical diagnostic technique [9]. The advantages of EEG include its high temporal resolution, non-invasiveness and low cost; it provides valuable neurophysiological information, which makes it ideal for clinical practice. Lately, there is a growing need for home monitoring, and ECG is gaining ground because it can conveniently be recorded through wearable devices. However, EEG retains its superiority over ECG in terms of predictive evidence, localization ability, and temporal resolution [10].
The accurate monitoring of patients suffering from absence seizures demands the reliable detection of all seizures from the recorded EEG and their analysis, so that the clinician has an accurate clinical picture of the patient when designing the treatment plan. Visual inspection of the recorded EEG, especially during long-term monitoring, may be time-consuming and tedious, limiting the effectiveness of treatment. Thus, automatic seizure detection and automatic analysis of the temporal/spectral characteristics of seizures are of particular significance, mainly for patients with limited access to an organized EEG laboratory and for non-specialized doctors.
To address these needs, the present study explores the use of matching pursuit for seizure detection. This framework enables the extraction of spatially localized structural patterns in the composition of the EEG. In this work, we extend the applicability of matching pursuit to the accurate detection and characterization of the time-frequency properties of seizures within a machine learning environment. Related studies are reviewed in Section 2, whereas the clinical protocol of the study is presented in Section 3. The methodological framework and the proposed analysis scheme are presented in Section 4, with examples illustrated in Section 5. Section 6 discusses advantages and limitations and concludes our study.

Related work
In the related literature, various algorithms for seizure detection and anticipation from EEG recordings have been proposed using different approaches [8,11-13]. EEG prediction methods are divided into three main categories: time domain, frequency domain and nonlinear methods [8,14].
In time domain analysis, features based on temporal signal behaviour are used, such as signal statistical measures/moments, Hjorth parameters, signal variability, autoregressive (AR) coefficients, length density, etc. In the frequency domain, EEG spectrum features are widely used in EEG-based seizure recognition algorithms; the EEG spectrum is divided into the spectral rhythms δ, θ, α, β, γ. Nonlinear analysis, which is considered a very promising approach mainly in seizure prediction, treats the EEG as a dynamical system and extracts features such as entropy measures [11], fractal dimension, maximal Lyapunov exponent, etc. Advanced signal processing techniques enable the decomposition of the signal into its time-frequency components using a basis functions approach [15]. Especially for complex and multicomponent signals, these methods have the advantage of eliminating noisy time-frequency cross terms.
A challenging issue is to use decomposition methods in order to extract robust, representative features for seizure discrimination. An efficient algorithm for decomposing a signal into time-frequency components (atoms) is Matching Pursuit (MP) [16]. MP has been used in EEG analysis [17-19], but there are only a few studies related to seizure detection [20-24]. Franaszczuk et al. [20] used the MP algorithm to provide the time-frequency decomposition of EEG signals, searching for a definite change in the ictal time-frequency pattern and revealing a predominant frequency of 5.3-8.4 Hz in the non-ictal interval. In [21], regularity and MP features were used along with feature selection and classification on the Bonn University dataset [25], achieving 97.6% accuracy.
Apart from the time-frequency representation of the signals, some researchers used atom feature information such as Gabor Atom Density (GAD) or Mean Atom Frequency (MAF) to determine the seizure onset [22]. In our previous work on MP-based seizure detection, we extracted and evaluated MP features (involving atom features) in terms of their ability to discriminate non-ictal and ictal intervals in a subset of the study dataset [23].
The Multivariate Matching Pursuit (MMP) approach can also be used, in which time-frequency atoms from all multichannel data are extracted [26]. In [24], Liu et al. use MMP and the trends of Gabor entropic measures in order to predict an upcoming seizure.
In this study, we use the MP algorithm to extract representative and robust features of EEG signal dynamics in order to perform automatic seizure detection. The extracted features are both features coming from the relevant literature and features proposed in this study based on characteristic information. These features provide a parametric and reliable representation of EEG dynamics. Then, machine learning techniques are employed in order to categorize the MP-derived patterns into non-ictal and ictal states.

Inclusion criteria and ethics
Subjects that participated in this study were patients diagnosed with absence epilepsy. In order to be eligible for the study, it was determined that they should have presented at least one seizure event in the last month. The study's protocol has been approved by the appropriate scientific board of the University Hospital of Heraklion. All caregivers/patients signed and provided written informed consent after a detailed explanation of the study objectives and the followed clinical protocol.

Procedure
Patients who met the inclusion criteria, as evaluated by two expert neuropediatricians, were admitted to the study. A medical health record was created for each patient, including clinical data about demographics, medical history, family history, medication, epilepsy classification, etc. An EEG cap with the 10/20 electrode system was placed on the head of the patient, a camera was placed opposite the patient's bed, and additional sensors for recording the breath rate and SpO2 were used. Video and surface EEG were recorded simultaneously for each patient during routine long-term hospital monitoring. The EEG signals were recorded at 19 scalp loci of the international 10-20 system, with all electrodes referenced to the earlobe. An electrode placed midway between Fp1 and Fp2 on the subject's forehead served as ground. EEG data were sampled at 256 Hz.

Dataset
This study's population consisted of 8 patients (4 females, 4 males) diagnosed with active absence epilepsy. Their age was 5.9±2.8 years at the time of the recording. The EEG recordings were independently evaluated for epileptic seizures and pathological findings by two expert neuropediatricians. All epileptic seizures were identified, marked (temporal onset and ending) and classified as absence-like generalized seizures according to the criteria of the International League Against Epilepsy (ILAE) [27]. The study dataset included 123 absence seizures in total from the 8 patients. The EEG signals were recorded at 19 scalp loci of the international 10-20 system (channels Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, O2), with all electrodes referenced to the earlobe. The patients' demographic data, seizure information and selected clinical data are summarized in Table 1.

Matching pursuit algorithm
The matching pursuit algorithm [16] is an iterative algorithm that approximates a signal using a set of functions (atoms). The redundant set of time-frequency atoms is called the dictionary $D$. The dictionary can comprise arbitrary functions; however, in this study we construct the dictionary $D$ from Gabor functions $g_\gamma(t)$,
where $K(\gamma)$ is a normalization coefficient such that $\|g_\gamma\| = 1$, and $\gamma = [u, f, \sigma]$ denotes the parameters of the dictionary function: $u$ is the translation in time, $f$ is the frequency, and $\sigma$ is the Gaussian spread.
We use Gabor basis functions as it has been shown that they have optimal distribution minimizing the variability of the time-frequency product [28] and are considered quite effective in approximating EEG signals [29].
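For orientation, a commonly used real-valued Gabor atom that is consistent with the parameters $\gamma = [u, f, \sigma]$ can be written as below. This is only a sketch: the exact form, in particular the $\pi$ factor in the Gaussian envelope and the absence of a phase term, is an assumption and may differ from the definition used in this study.

```latex
g_{\gamma}(t) = K(\gamma)\, e^{-\pi \left( \frac{t-u}{\sigma} \right)^{2}} \cos\bigl( 2\pi f (t-u) \bigr),
\qquad \|g_{\gamma}\| = 1
```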
The algorithm looks for the Gabor function that best matches an inner pattern of the original signal $x$ over the redundant set of atoms in the dictionary. This is done by successive approximations of $x$ with orthogonal projections onto elements of the dictionary: at each iteration, the inner product between the residual signal and each Gabor function is computed, the best-matching atom scaled by its inner product is subtracted from the residual, and the next iteration takes place. Let $R^0 x = x$. Suppose that we have computed the $n$-th order residue $R^n x$ for $n \geq 0$. We choose an element $g_{\gamma_n} \in D$ from the dictionary $D$ which best matches the signal $R^n x$ (the residual left after subtracting the results of previous iterations). The residue $R^n x$ can then be decomposed into:

$$R^n x = \langle R^n x, g_{\gamma_n} \rangle g_{\gamma_n} + R^{n+1} x, \qquad g_{\gamma_n} = \arg\max_{g_{\gamma_i} \in D} |\langle R^n x, g_{\gamma_i} \rangle| \qquad (1)$$

where $\arg\max_{g_{\gamma_i} \in D}$ means the $g_{\gamma_i}$ giving the largest magnitude of the inner product $\langle R^n x, g_{\gamma_i} \rangle$. The iterative decomposition stops either when the energy of the residual signal falls below a preset cut-off level $\varepsilon$ or, alternatively, after a predetermined number of iterations $M$. After $M$ iterations, matching pursuit decomposes the signal $x$ into:

$$x = \sum_{n=0}^{M-1} \langle R^n x, g_{\gamma_n} \rangle g_{\gamma_n} + R^M x$$

where $R^M x$ is the residual vector after $M$ iterations and $\langle x, g \rangle = \int_{-\infty}^{\infty} x(t) \bar{g}(t) \, dt$ denotes the inner product of the functions $x$ and $g$. The inner product $a_n = \langle R^n x, g_{\gamma_n} \rangle$ also represents the magnitude of the selected atom. Because $R^{n+1} x$ is orthogonal to $g_{\gamma_n}$ at each step of the procedure, the energy conservation law becomes:

$$\|x\|^2 = \sum_{n=0}^{M-1} |\langle R^n x, g_{\gamma_n} \rangle|^2 + \|R^M x\|^2$$

When the iterative procedure terminates, the selection of Gabor atoms from the dictionary is complete.
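The greedy iteration described above can be illustrated with a minimal, dependency-free sketch. It is not the implementation used in this study: the atom form (Gaussian envelope times a cosine, no phase), the function names `gabor_atom` and `matching_pursuit`, the relative stopping threshold `eps`, and the tiny toy dictionary are all assumptions made for illustration.

```python
import math

def gabor_atom(n_samples, u, f, sigma, fs):
    """Unit-norm real Gabor atom (assumed form: Gaussian envelope times cosine)."""
    atom = [
        math.exp(-math.pi * ((t - u) / sigma) ** 2)
        * math.cos(2.0 * math.pi * f * (t - u) / fs)
        for t in range(n_samples)
    ]
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]

def inner(x, g):
    """Discrete inner product of two real sequences."""
    return sum(a * b for a, b in zip(x, g))

def matching_pursuit(x, dictionary, max_atoms, eps=1e-3):
    """Greedy MP: returns [(atom_index, coefficient)] and the final residual."""
    residual = list(x)
    energy0 = inner(x, x)
    selected = []
    for _ in range(max_atoms):
        # Choose the atom with the largest |<R^n x, g>|.
        best_idx, best_coef = max(
            ((i, inner(residual, g)) for i, g in enumerate(dictionary)),
            key=lambda pair: abs(pair[1]),
        )
        g = dictionary[best_idx]
        # Subtract the projection on the chosen atom to obtain R^{n+1} x.
        residual = [r - best_coef * v for r, v in zip(residual, g)]
        selected.append((best_idx, best_coef))
        # Stop when the residual energy falls below eps times the signal energy.
        if inner(residual, residual) <= eps * energy0:
            break
    return selected, residual

# Toy example: the signal is exactly 2.0 times one atom of a small dictionary.
fs, n = 256, 512
params = [(u, f, s) for u in (128, 256, 384)
          for f in (2.0, 3.0, 4.0, 6.0) for s in (64, 128)]
dictionary = [gabor_atom(n, u, f, s, fs) for (u, f, s) in params]
target = params.index((256, 3.0, 128))
x = [2.0 * v for v in dictionary[target]]
atoms, residual = matching_pursuit(x, dictionary, max_atoms=5)
```

In this toy case the first iteration recovers the generating atom and the loop stops because the residual energy is negligible. On real EEG windows many atoms are selected and the stopping rule (energy threshold or fixed number of iterations) controls the length of the decomposition.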

MP features extraction
The Matching Pursuit algorithm decomposes the signal into a number of Gabor atoms, in decreasing order of energy, that best describe the signal in terms of its energy variation. Each atom has a specific energy, described by its magnitude $a_n$, and specific time-frequency properties, described by its frequency $f_n$. In order to provide specific features for the classification process, and to investigate which atom measures are most efficient in discriminating between ictal and non-ictal periods, the following features were extracted from the analysis.
For each sliding time window, 6 features were calculated from the extracted MP atoms: the mean amplitude (MA), weighted mean frequency (WMF), mean-product frequency (MPF), Gabor Energy (GEn), Gabor Entropy (GE), normalized Gabor Entropy (NGE). An overview of the features and their corresponding type is presented in Table 2.
The first feature was the mean amplitude (MA) value over the $M$ selected atoms. We also introduced two new metrics, the weighted mean frequency (WMF) and the mean-product frequency (MPF). In our view, the WMF and MPF are better metrics than the mean frequency described in a relevant study [21], because they take into account not only the values of the atoms' frequencies but also the contribution of each atom's spectral content through its amplitude $a_i$. The selection of these specific features was based on the observation that the frequency and the EEG signal envelope (amplitude) increased during seizures.
Three more features, suggested by Liu et al. in a 2018 study [24], were calculated for each time window: Gabor Energy (GEn), the total energy of the selected atoms; Gabor Entropy (GE), computed from $P_n$, the relative energy of each Gabor atom; and normalized Gabor Entropy (NGE).
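The defining equations of the six features are not reproduced in the text, so the following sketch relies on plausible formulas inferred from the verbal descriptions. Every formula in it is an assumption: MA as the mean of the atom magnitudes, WMF as the amplitude-weighted mean of the atom frequencies, MPF as the mean of the amplitude-frequency products, GEn as the sum of squared amplitudes, GE as the Shannon entropy of the relative energies $P_n$, and NGE as GE divided by $\log M$. The function name `mp_features` is hypothetical.

```python
import math

def mp_features(amplitudes, frequencies):
    """Six window-level MP features (all formulas are assumed, see lead-in)."""
    M = len(amplitudes)
    mags = [abs(a) for a in amplitudes]
    products = [a * f for a, f in zip(mags, frequencies)]
    ma = sum(mags) / M                      # mean amplitude
    wmf = sum(products) / sum(mags)         # amplitude-weighted mean frequency
    mpf = sum(products) / M                 # mean amplitude-frequency product
    gen = sum(a * a for a in mags)          # total energy of the atoms
    p = [a * a / gen for a in mags]         # relative energy of each atom
    ge = -sum(pn * math.log(pn) for pn in p if pn > 0.0)
    nge = ge / math.log(M) if M > 1 else 0.0
    return {"MA": ma, "WMF": wmf, "MPF": mpf, "GEn": gen, "GE": ge, "NGE": nge}

# Toy window with three atoms (amplitudes and frequencies in Hz are made up).
feats = mp_features([2.0, 1.0, 1.0], [3.0, 3.0, 6.0])
# Equal-energy atoms give the maximum normalized entropy.
flat = mp_features([1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0])
```

Under these assumed definitions, a window dominated by a single strong atom yields a low NGE, whereas energy spread evenly across atoms yields an NGE close to 1.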

Feature classification
The training and evaluation of the classification schemes for recognition between the two investigated states (non-ictal, ictal states) were performed using machine learning (ML) techniques. ML techniques have been used in the area of EEG signal analysis in a wide range of relevant research areas such as discrimination between emotional states, enhancement of brain-computer interfaces, motor imagery, and epileptic seizure detection [30][31][32][33].
In this study, in order to achieve good discrimination between the two states under investigation (non-ictal/ictal periods), the features extracted from the MP analysis were used to train and then evaluate different classification schemes, leading to the selection of the best performing classifier for the specific data type. The classification schemes used are summarized and presented in Table 3. The Trivial classifier assigns everything to the most frequent class and is used as a reference point for the performance of the other classifiers, since it is considered to represent random classification. In order to assess the performance of each classification scheme, the classification accuracy, sensitivity and specificity metrics were used. As used in this study, sensitivity is the proportion of cases actually belonging to non-ictal periods that were correctly predicted as belonging to non-ictal periods, while specificity is the proportion of cases predicted as belonging to ictal periods that actually belonged to ictal periods.
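The three metrics, as they are verbally defined above, can be computed as in the following sketch. The label coding (0 for non-ictal, 1 for ictal) and the function name `study_metrics` are assumptions; the definitions of sensitivity and specificity follow the wording of this study rather than the more common convention in which the ictal class is treated as positive.

```python
def study_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity as worded in the text.

    Labels are assumed to be 0 (non-ictal) and 1 (ictal).
    """
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Predictions made for windows that are truly non-ictal.
    preds_for_non_ictal = [p for t, p in pairs if t == 0]
    # True labels of windows that were predicted as ictal.
    truths_for_pred_ictal = [t for t, p in pairs if p == 1]
    sensitivity = (preds_for_non_ictal.count(0) / len(preds_for_non_ictal)
                   if preds_for_non_ictal else float("nan"))
    specificity = (truths_for_pred_ictal.count(1) / len(truths_for_pred_ictal)
                   if truths_for_pred_ictal else float("nan"))
    return accuracy, sensitivity, specificity

# Toy labels (made up): four non-ictal and two ictal windows.
acc, sens, spec = study_metrics([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
```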
The classification schemes (each classifier and its parameters) were cross-validated in order to evaluate their performance and select the combination that could best handle the specific data type. A standard 10-fold cross-validation method was applied to each classification scheme for testing the system's performance.
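A minimal sketch of 10-fold cross-validation is given below, using only the Trivial (majority-class) classifier so that the example stays dependency-free; a trained classifier such as an SVM would take the place of the majority rule. The shuffling seed, the helper names and the toy label set with a 2:1 non-ictal to ictal ratio are assumptions for illustration.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_label(labels):
    """Trivial classifier: always predict the most frequent training label."""
    return max(set(labels), key=labels.count)

# Toy labels (made up): 200 non-ictal (0) and 100 ictal (1) windows.
labels = [0] * 200 + [1] * 100
folds = k_fold_indices(len(labels), k=10)

fold_acc = []
for test_idx in folds:
    held_out = set(test_idx)
    train = [labels[i] for i in range(len(labels)) if i not in held_out]
    guess = majority_label(train)
    # Accuracy of the constant prediction on the held-out fold.
    fold_acc.append(sum(labels[i] == guess for i in test_idx) / len(test_idx))

mean_acc = sum(fold_acc) / len(fold_acc)
```

With a 2:1 class ratio the majority baseline averages an accuracy of two thirds, which is the kind of reference level against which the other classifiers can be compared.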

EEG preprocessing and MP dictionary construction
The EEG recordings were sampled at a sampling frequency of f s =256Hz. Artifacts related to the subject's activity (body movements, eye blinks, spikes, head movements, chewing, general discharges) were suppressed using wavelet Independent Component Analysis (wICA) [34].
In order to track the temporal EEG dynamics in the non-ictal phase and the transition from the non-ictal to the ictal phase, sliding window analysis was used. Smaller window sizes give a finer temporal resolution of the dynamics, while bigger time windows provide a more reliable estimation of the spectral EEG features, mainly for the low-frequency EEG rhythms (e.g. the δ rhythm). Relevant seizure detection studies adopt different window sizes for EEG segmentation, typically ranging from 1 sec to 30 sec (e.g. 1 sec [35], 2 sec [36], 3 sec [37], 23.6 sec [38]). Windows of length ∆t = 2 sec with a step of 0.5 sec were selected for this study's analysis. This selection was based, on the one hand, on the fact that some of the ictal periods were shorter than 2.5 seconds (a bigger window would not contain only ictal signal) and, on the other hand, on the window being long enough to track most of the δ rhythm (the lowest rhythm). It was verified that increasing the time window does not significantly affect the system performance.
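The segmentation with these parameters can be sketched as follows: at 256 Hz, a 2 sec window is 512 samples and a 0.5 sec step is 128 samples. The function name `sliding_windows` and the choice to discard an incomplete trailing window are assumptions.

```python
def sliding_windows(signal, fs=256, win_sec=2.0, step_sec=0.5):
    """Split a single-channel signal into overlapping fixed-length windows."""
    win = int(round(win_sec * fs))    # 512 samples for 2 sec at 256 Hz
    step = int(round(step_sec * fs))  # 128 samples for 0.5 sec at 256 Hz
    # An incomplete window at the end of the signal is discarded.
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]

# Toy signal: 10 sec of sample indices standing in for EEG values.
signal = list(range(10 * 256))
windows = sliding_windows(signal)
```

For a 10 sec recording this gives 17 windows, with consecutive windows overlapping by 1.5 sec.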
As the seizure detection dataset is highly imbalanced, i.e. data samples from ictal periods are far fewer than those from non-ictal periods, a more balanced dataset should be ensured in order to formulate a proper model. Thus, all samples from ictal periods and twice as many samples from non-ictal periods were selected for the subsequent analysis.
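The 2:1 selection can be sketched as below. The text does not state how the non-ictal samples were chosen, so uniform random sampling without replacement, the fixed seed and the function name `select_two_to_one` are assumptions.

```python
import random

def select_two_to_one(ictal, non_ictal, seed=0):
    """Keep all ictal samples and twice as many non-ictal samples.

    Random sampling of the non-ictal samples is an assumption.
    """
    rng = random.Random(seed)
    k = min(2 * len(ictal), len(non_ictal))
    return list(ictal), rng.sample(non_ictal, k)

# Toy identifiers (made up) standing in for feature vectors of windows.
ictal_ids = list(range(10))
non_ictal_ids = list(range(100, 200))
kept_ictal, kept_non_ictal = select_two_to_one(ictal_ids, non_ictal_ids)
```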
The parameters used for the construction of the MP dictionary were $N_{dict} = 512$ samples, $\sigma_i$ ranging from 2 to 256 samples with a step of 2 samples, $f_i$ ranging from 1 to 30 Hz with a step of 0.5 Hz, and $u$ ranging from $-N_{dict}/2$ to $N_{dict}/2 - 1$ with a step of 2 samples.
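As a rough illustration of how redundant such a dictionary is, the parameter grids can be enumerated and counted as below. Treating the dictionary as the full Cartesian product of the three grids is an assumption; the study may have used a subset of these combinations.

```python
n_dict = 512

# Gaussian spread: 2 to 256 samples in steps of 2 samples.
sigmas = list(range(2, 257, 2))
# Frequency: 1 to 30 Hz in steps of 0.5 Hz.
freqs = [1.0 + 0.5 * k for k in range(59)]
# Time translation: -N_dict/2 to N_dict/2 - 1 in steps of 2 samples.
shifts = list(range(-n_dict // 2, n_dict // 2, 2))

# Assumed: one atom per (sigma, f, u) combination.
n_atoms = len(sigmas) * len(freqs) * len(shifts)
```

Under this assumption the grids hold 128 spreads, 59 frequencies and 256 translations, i.e. 1,933,312 candidate atoms for a 512-sample window.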

MP features
The features MA, WMF, MPF, GEn, GE and NGE were extracted for each sliding temporal window of the ictal and non-ictal signal. A typical temporal evolution of the EEG signal and its corresponding MA time series are presented in Fig. 1. These features were included in the feature matrix which was subsequently fed to the classification schemes. In order to assess each feature's importance and relevance for the investigated discrimination problem, the Fisher discrimination ratio [39] was used, and the resulting feature ranking is presented in Fig. 2. One can notice that MA achieves the highest ranking; thus it is the most relevant feature for the discrimination problem. Then, the feature set was evaluated in terms of its ability to discriminate between the non-ictal and ictal states. By inspecting Table 3 one can notice that the Support Vector Machine and Random Forest classifiers outperform the other classification schemes, with classification accuracies of 97.32% and 97.20% respectively.
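Feature ranking with a Fisher-type criterion can be sketched as follows. The commonly used two-class form $(\mu_1 - \mu_2)^2 / (\sigma_1^2 + \sigma_2^2)$ is assumed here; the exact definition in [39] may differ. The function name `fisher_ratio` and the feature values are made up for illustration and are not taken from the study dataset.

```python
from statistics import mean, pvariance

def fisher_ratio(class_a, class_b):
    """Two-class Fisher discrimination ratio of one feature (assumed form)."""
    spread = pvariance(class_a) + pvariance(class_b)
    return (mean(class_a) - mean(class_b)) ** 2 / spread

# Toy feature values per class: (non-ictal windows, ictal windows).
features = {
    "MA": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),
    "GE": ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
}

# Rank the features from most to least discriminative.
ranking = sorted(features,
                 key=lambda name: fisher_ratio(*features[name]),
                 reverse=True)
```

A feature whose class means are far apart relative to the within-class variances obtains a higher ratio and therefore a higher rank.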
The distribution of the 2 top-ranked features (MA, MPF) is presented in Fig. 3 as a classification plot with the decision boundaries of SVM for the two features along with samples of a testing fold (blue: non-ictal, red: ictal). This figure presents the data separability achieved using this study's features and the discrimination efficiency of the proposed methodology.

Figure 2. Features ranking according to the Fisher discrimination ratio for the discrimination problem (non-ictal, ictal states).

Discussion
In this paper, a method for the detection of absence seizures from EEG using features of the Matching Pursuit decomposition is presented. MP decomposition into time-frequency atoms was performed over sliding time windows, which enables an efficient parametric time-frequency representation of the signal. Based on the ictal behaviour of absence seizures and their specific EEG pattern, we estimated MP features that are able to represent this pattern efficiently. To this end, we estimated 6 representative features of MP and entropic statistics. The MP algorithm is a time-frequency decomposition method that extracts the Gabor atoms best matching the inner signal patterns. The extracted Gabor atoms follow the principle of maximal energy; thus their order represents the importance of each atom as a component of the signal. This is fundamental for proper feature extraction. Mean metrics (such as the mean of atom frequencies), as estimated in part of the relevant literature, are not always representative of the resultant dominant frequency, as there is an ordered atom contribution which should be incorporated into the features. This weighted importance is introduced in our feature extraction analysis by formulating weighted versions of the MP features, which are in our view more representative of, and closer to, the real time-frequency characteristics of the signal. Weighted frequency metrics represent the resultant frequency in a way that is consistent with how the MP algorithm operates, i.e. they use the weighted coefficients of MP to strengthen the presence of significant atoms relative to less important ones.
Compared to other popular spectral analysis methods, MP appears to be a robust method in terms of localization ability, with adaptability in signal decomposition and flexibility in discovering components of signals with significant fluctuations in the time and frequency domains, such as brain signals [40]. Besides, MP is very efficient in eliminating false cross-terms, especially in the case of a multicomponent signal such as the EEG.
The MP features used in this study relate to the strength and frequency of the Gabor atoms that compose the EEG signal and provide good separability between non-ictal and ictal states. The machine learning techniques used for the classification process led to a best time-window classification accuracy of 97.32%, achieved by the SVM classifier using 10-fold cross-validation. As we use our own clinical dataset, a direct comparison with other studies which use different datasets is not feasible. However, compared with recent literature, our proposal achieves good state-of-the-art performance. These results indicate that the MP features proposed in this work can be used effectively in seizure detection.