International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 1 ISSN 1828-6003 January 2013

An Efficient Adaptive Segmentation Algorithm on EEG Signals to Discriminate between Subject with Epilepsy and Normal Control

Samaneh Kazemifar, Reza Boostani

Abstract – In this study, an effective approach is presented to help distinguish between patients with Epilepsy Disease (ED) and a control group using their electroencephalogram (EEG) signals. Closer examination of these signals clearly showed non-stationary behavior over long periods. In most real-time research, a window function with a short window length is applied to segment the non-stationary signal into stationary pieces. The purpose of this study, however, was to find the stationary intervals using the instantaneous frequency (IF) concept. Several time-frequency transforms were applied to segment the EEG signals in the ED group and the control group. Kernel principal component analysis (KPCA), full-rank KPCA and low-rank KPCA were applied to increase the accuracy rate and decrease the complexity. Then, the projected features were fed to an artificial neural network (ANN) classifier. The results showed that the proposed technique was able to distinguish between the subjects with epilepsy and the control group with an accuracy rate of 93%. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: EEG, Seizure, Kernel PCA, Full-Rank KPCA, Low-Rank KPCA, IF, Time-Frequency Transforms, ANN

Nomenclature

T_x(t, f)    Quadratic time-frequency representation of a signal x(t)
RIDH_x       Reduced interference distribution with Hanning kernel of signal x(t)
RIDB_x       Reduced interference distribution with Bessel kernel of signal x(t)
RIDBN_x      Reduced interference distribution with binomial kernel of signal x(t)
SPWV_x       Smoothed pseudo Wigner-Ville distribution of signal x(t)
t            Time
f            Frequency
ν            Frequency
g(v)         Time smoothing window
h(τ)         Frequency smoothing window

I. Introduction

Epilepsy is a neurological disease that affects approximately 1% of the world’s population (about 60 million people) [1]. Seizure attacks are caused by the synchronous abnormal electrical discharge of a large group of activated neurons. If these synchronously activated neurons are localized in a certain brain lobe, the seizure is called a focal seizure. However, if these neurons are distributed over the scalp, the term “generalized seizure” is used. About 75% of epileptic patients can be effectively treated using current treatments, such as drugs or surgical operations.

Previous studies show that epilepsy can be easily diagnosed in adults by monitoring their EEG signals. Several techniques have been suggested in the literature for the detection of epileptic seizures, mainly by analyzing the EEG signals. These techniques include the auto-correlation function [2], frequency domain analysis [3], the wavelet transform [4] and nonlinear methods [5]. EEG signals are mostly recorded with a large number of channels and seizure-localized sources; therefore, feature extraction methods are efficient for the analysis of EEG signals. Previous works [6], [7] used the principal component analysis (PCA) method as a feature reduction step. In addition, previous studies [8]-[10] showed the efficiency of KPCA and PCA in image processing. They presented a remarkable classification rate for two and three classes, but their performance was significantly reduced as the number of classes increased.

II. Materials and Methods

II.1. Subject Analysis

The technique was tested on two databases. The first database includes five sets [5], namely F, Z, O, S and N, which were recorded in the Medical Center of Bonn University. Each set includes 20 EEG signals, which were recorded with 20 electrodes mounted on the scalp according to the international 10-20 standard protocol. Set F contained EEG signals of five epileptic patients when their brain acted normally in the absence

Manuscript received and revised December 2012, accepted January 2013


of seizure attack. Set N included EEG signals of five epileptic patients without seizure attack in their hippocampus area. Sets O and Z contained EEG signals of five healthy volunteers with closed and open eyes, respectively. Set S included EEG signals of five epileptic patients that were recorded during the seizure attack. Each EEG signal was recorded for 23.6 seconds at a sampling rate of 173.6 samples per second. The second dataset contains EEG signals of children that were recorded at the Children’s Hospital in Boston, MA. This dataset includes epileptic and normal subjects [15]. All signals were acquired at 256 samples per second using an A/D card with 16-bit resolution. The electrodes were located on the scalp based on the international 10-20 recording system.

Quadratic time-frequency transforms represent the energy distribution of a signal in the time and frequency domains. In general, the quadratic time-frequency representation (QTFR) of a signal x(t) can be expressed as follows:

\[
T_x(t,f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x(t_1)\, x^{*}(t_2)\, k_T(t_1,t_2;t,f)\, dt_1\, dt_2 \tag{1}
\]

where $k_T$ is a signal-independent function that characterizes the QTFR. This transformation satisfies the quadratic superposition principle. For instance, the QTFR of the signal x(t) = αx1(t) + βx2(t) is expressed as:

\[
T_x(t,f) = |\alpha|^{2}\, T_{x_1}(t,f) + |\beta|^{2}\, T_{x_2}(t,f) + 2\Re\left[\alpha\beta^{*}\, T_{x_1,x_2}(t,f)\right] \tag{2}
\]

where the term $T_{x_1,x_2}$ is the cross QTFR or Cross Term (CT) of signals x1(t) and x2(t), and ℜ[α] denotes the real part of α. For QTFRs, windowing techniques are not required because the objective of this study was to demonstrate the energy distribution of a signal represented in time-frequency (TF) space. However, windowing techniques are often used to suppress CTs that may impede processing due to their oscillatory nature. Note that QTFRs often overcome the TF resolution problem that limits the linear TFRs [16]-[21].

II.2. Cohen’s Class Quadratic TFRs

Cohen’s class is a well-known class of quadratic TFRs that satisfies both time-shift and frequency-shift covariance. Both covariance properties are important in applications where the signal needs to be analyzed in the whole TF space with a fixed TF resolution. Indeed, these properties guarantee that if the signal is delayed in time and modulated, its TF distribution is translated by the same quantities in the time-frequency plane. A time-frequency distribution from Cohen’s class is defined as:

\[
T_x(t,f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[ F(\nu,\tau)\, e^{j2\pi\nu(s-t)}\, x\!\left(s+\tfrac{1}{2}\tau\right) x^{*}\!\left(s-\tfrac{1}{2}\tau\right) e^{-j2\pi f\tau} \right] d\nu\, ds\, d\tau \tag{3}
\]

where F(ν,τ) provides the two-dimensional filtering of the instantaneous autocorrelation and is also known as the kernel. It is the filter-like function that differentiates among the various distributions in Cohen’s class, t is the time, f is the frequency, x(t) is the signal, and x*(t) is its complex conjugate. In this paper, the Smoothed Pseudo Wigner-Ville (SPWV) distribution, the Reduced Interference Distribution with Bessel kernel (RIDB), the Reduced Interference Distribution with Binomial kernel (RIDBN), and the Reduced Interference Distribution with Hanning kernel (RIDH) are implemented as follows:

\[
RIDH_x(t,\nu) = \int_{-\infty}^{\infty}\int_{-|\tau|/2}^{|\tau|/2} \left[ h(\tau)\, \frac{g(v)}{|\tau|} \left(1+\cos\!\left(\frac{2\pi v}{\tau}\right)\right) x\!\left(t+v+\frac{\tau}{2}\right) x^{*}\!\left(t+v-\frac{\tau}{2}\right) e^{-2j\pi\nu\tau} \right] dv\, d\tau \tag{4}
\]

\[
RIDBN_x(t,\nu) = \int_{-\infty}^{\infty}\int_{-|\tau|}^{|\tau|} \left[ \frac{1}{2^{2|\tau|+1}} \binom{2|\tau|+1}{|\tau|+v+1}\, x(t+v+\tau)\, x^{*}(t+v-\tau)\, e^{-4j\pi\nu\tau} \right] dv\, d\tau \tag{5}
\]

\[
RIDB_x(t,\nu) = \int_{-\infty}^{\infty}\int_{t-|\tau|}^{t+|\tau|} \left[ h(\tau)\, \frac{2g(v)}{\pi|\tau|} \sqrt{1-\left(\frac{v-t}{\tau}\right)^{2}}\; x\!\left(v+\frac{\tau}{2}\right) x^{*}\!\left(v-\frac{\tau}{2}\right) e^{-2j\pi\nu\tau} \right] dv\, d\tau \tag{6}
\]

\[
SPWV_x(t,\nu) = \int_{-\infty}^{\infty} h(\tau) \int_{-\infty}^{\infty} \left[ g(s-t)\, x\!\left(s+\frac{\tau}{2}\right) x^{*}\!\left(s-\frac{\tau}{2}\right) e^{-2j\pi\nu\tau} \right] ds\, d\tau \tag{7}
\]

In Eqs. (4)-(7), ν denotes frequency, v and s are time integration variables, and g(v) and h(τ) are the time and frequency smoothing windows, respectively. Fig. 1 shows the original signals and the RIDBN results for the five sets S, Z, O, N and F, respectively.
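As an illustration of the SPWV distribution named in this section, the following is a minimal discrete-time sketch, assuming NumPy is available. The function name `spwv`, the parameter `n_freq`, the use of odd-length windows, and the zero treatment of samples outside the record are illustrative assumptions, not details taken from the paper. Because the discrete lag is 2m samples, frequency bin k corresponds to a normalized frequency of k / (2 · n_freq) cycles per sample. For a real EEG trace, the analytic signal would typically be passed in to limit interference between positive and negative frequencies.

```python
import numpy as np

def spwv(x, g, h, n_freq=64):
    """Sketch of a discrete smoothed pseudo Wigner-Ville distribution.

    x : complex (ideally analytic) signal of length n
    g : odd-length time smoothing window
    h : odd-length frequency smoothing (lag) window
    Returns (tfr, freqs); tfr has shape (n_freq, n).
    """
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if len(g) % 2 == 0 or len(h) % 2 == 0:
        raise ValueError("g and h must have odd length")
    lg, lh = len(g) // 2, len(h) // 2
    if 2 * lh >= n_freq:
        raise ValueError("lag window too long for n_freq")

    tfr = np.zeros((n_freq, n))
    for t in range(n):
        acf = np.zeros(n_freq, dtype=complex)
        for m in range(-lh, lh + 1):
            s = 0j
            for p in range(-lg, lg + 1):
                a, b = t + p + m, t + p - m
                # Samples outside the record are treated as zero (assumption).
                if 0 <= a < n and 0 <= b < n:
                    s += g[p + lg] * x[a] * np.conj(x[b])
            # Negative lags wrap to the end of the FFT buffer.
            acf[m % n_freq] = h[m + lh] * s
        # The lag sequence is Hermitian, so its transform is real.
        tfr[:, t] = np.fft.fft(acf).real

    freqs = np.arange(n_freq) / (2.0 * n_freq)
    return tfr, freqs
```

For a pure complex tone, the energy at each interior time instant concentrates in the bin whose `freqs` entry matches the tone frequency, which is a convenient sanity check before applying the sketch to EEG segments.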

II.3. Instantaneous Frequency

To extract informative features from the time-frequency (TF) space, the TF plane can be divided into subspaces, and the subspaces that give better performance are selected heuristically.


Fig. 1. Ensembles of the eightieth channel of the EEG signals along with their time-frequency representation for the five sets S, Z, O, N, and F. The left column shows the signal and the right column shows its time-frequency transform. There is a significant difference between their time-frequency behaviors

However, the problem was the selection of the optimum sub-bands in the time and frequency domains. To overcome this drawback, the Instantaneous Frequency (IF) is suggested in this paper as a criterion to properly find stationary intervals for dividing the time domain. The mean frequency of a signal describes the center of gravity of its power spectrum. The power spectrum of a non-stationary signal is time-dependent; therefore, its mean frequency is time-dependent as well. This time-dependent mean frequency is called the IF. For non-stationary signals, the IF describes the evolution of the central frequency over time. Fig. 2 shows the IF of the time-frequency representation for certain EEG channels in the first dataset.

Fig. 2. The IF representation of the time-frequency plane (RIDB) for the five sets S, Z, O, N, and F, respectively
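The description of the IF as a time-dependent mean frequency can be sketched as the first moment of a time-frequency distribution along the frequency axis. The snippet below is a hedged illustration, assuming NumPy. Clipping negative TFR values before averaging is an assumption, and the segmentation rule in `segment_by_if`, including its tolerance `tol`, is purely hypothetical: the text states that the IF is used to find stationary intervals but does not give the exact rule here.

```python
import numpy as np

def instantaneous_frequency(tfr, freqs):
    """Mean frequency of each time column of a TF distribution.

    tfr   : array of shape (n_freq, n_time)
    freqs : array of shape (n_freq,), frequency of each row
    Columns with no energy yield NaN.
    """
    # Quadratic TFRs can take negative values; clipping them is an assumption.
    tfr = np.clip(np.asarray(tfr, dtype=float), 0.0, None)
    freqs = np.asarray(freqs, dtype=float)
    energy = tfr.sum(axis=0)
    safe = np.where(energy > 0, energy, 1.0)
    inst_f = (freqs[:, None] * tfr).sum(axis=0) / safe
    return np.where(energy > 0, inst_f, np.nan)

def segment_by_if(inst_f, tol):
    """Hypothetical rule: open a new segment when the IF departs from the
    running mean of the current segment by more than `tol`.

    Assumes all entries of inst_f are finite.
    Returns the list of segment start indices.
    """
    bounds = [0]
    seg_sum, seg_len = inst_f[0], 1
    for t in range(1, len(inst_f)):
        if abs(inst_f[t] - seg_sum / seg_len) > tol:
            bounds.append(t)
            seg_sum, seg_len = inst_f[t], 1
        else:
            seg_sum += inst_f[t]
            seg_len += 1
    return bounds
```

Each returned start index marks the beginning of an interval over which the IF stays close to its local mean, which is one simple way to read "stationary interval" in this context.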

II.4. Feature Extraction

The high-dimensional time-frequency features were not discriminative and imposed a high computational complexity on the classifier. In addition, irrelevant features can blur the boundaries between the classes and increase their overlap. Several classifiers (fuzzy rule-based and neural networks) become complicated due to the high dimensionality of the features, and the classifier performance then decreases. In this study, kernel PCA was used as a feature extraction method. The kernel function implicitly maps the input features into a higher-dimensional space. These features were probably more discriminative than those in the original space.

II.5. Kernel PCA

The kernel PCA [22] is applied to extract discriminative features that increase the separation between classes. This method is a combination of the PCA method and the kernel function. For a data matrix x of size m × n, the PCA technique computes the covariance matrix:

\[
C = \frac{1}{m}\sum_{i=1}^{m} x_i x_i^{T} \tag{8}
\]

where C is the covariance matrix, n is the number of features, and m is the number of instances. This transformation projects the data onto the first k eigenvectors of the matrix. In the KPCA method, the input data are projected into an implicitly higher-dimensional space. The covariance matrix of the transformed data is computed as:

\[
C = \frac{1}{m}\sum_{i=1}^{m} \varphi(x_i)\, \varphi(x_i)^{T} \tag{9}
\]

where φ(·) denotes the implicit mapping into the higher-dimensional feature space.

Table III shows that low-rank KPCA had higher accuracy rates because it could remove the noise in the feature space.
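The kernel PCA projection described in Section II.5 can be sketched in a few lines, assuming NumPy. The covariance in Eq. (9) is never formed explicitly; instead the centered kernel (Gram) matrix is diagonalized. The Gaussian (RBF) kernel, its width `gamma`, and the function names are assumptions made for illustration, since the text here does not state which kernel was used.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and b (assumed kernel)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_pca(x, n_components, gamma=1.0):
    """Project the m rows of x onto the leading kernel principal components.

    Returns (projections, eigenvalues) with shapes (m, k) and (k,).
    """
    m = x.shape[0]
    k = rbf_kernel(x, x, gamma)

    # Center the data in feature space by centering the kernel matrix.
    one = np.full((m, m), 1.0 / m)
    kc = k - one @ k - k @ one + one @ k @ one

    # Symmetric eigen-decomposition; keep the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Scale so that each feature-space eigenvector has unit norm.
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    return kc @ alphas, eigvals
```

Keeping only a few leading components discards the low-variance directions of the feature space; this is one plausible reading of a low-rank variant, although the exact definition of full-rank and low-rank KPCA is not given in this part of the text.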

TABLE I
THE VALUES ARE CLASSIFICATION RATES (%) FOR DIFFERENT QTFRS AND DIFFERENT NUMBERS OF CLASSES. THE FIRST VALUES IN PARENTHESES ARE THE STANDARD DEVIATIONS AND THE SECOND VALUES ARE THE P-VALUES FOR THE KPCA METHOD

TFRs    Methods    (S, Z, F, O, N)    (S, Z, F)     (S, Z)
SPWV    KPCA       92.8 (1.1)         97.3 (2.1)    100 (