Ann Oper Res (2006) 148: 227–250 DOI 10.1007/s10479-006-0076-x

Electroencephalogram (EEG) time series classification: Applications in epilepsy Wanpracha Art Chaovalitwongse · Oleg A. Prokopyev · Panos M. Pardalos

Published online: 18 August 2006 © Springer Science + Business Media, LLC 2006

Abstract Epilepsy is among the most common brain disorders. Approximately 25–30% of epilepsy patients remain unresponsive to anti-epileptic drug treatment, which is the standard therapy for epilepsy. In this study, we apply optimization-based data mining techniques to classify the brain's normal and epileptic activity using intracranial electroencephalogram (EEG), which is a tool for evaluating the physiological state of the brain. A statistical cross-validation and support vector machines were implemented to classify the brain's normal and abnormal activities. The results of this study indicate that it may be possible to design and develop efficient seizure warning algorithms for diagnostic and therapeutic purposes.

Keywords Classification · EEG · Brain dynamics · Optimization · Epilepsy · Support vector machines

1. Introduction

Epilepsy is one of the most common brain disorders: a chronic condition of diverse etiologies whose common symptom is spontaneous recurrent seizures, characterized by intermittent, paroxysmal, and highly organized rhythmic neuronal discharges in the cerebral cortex.

Research was partially supported by the Rutgers Research Council grant-202018, the NSF grants DBI-980821, CCF-0546574, IIS-0611998, and NIH grant R01-NS-39687-01A1.

W. A. Chaovalitwongse
Department of Industrial and Systems Engineering, Rutgers University, 96 Frelinghuysen Rd., Piscataway, NJ 08854, USA
e-mail: [email protected]

O. A. Prokopyev · P. M. Pardalos
Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32601, USA


Fig. 1 Inferior transverse views of the brain, illustrating approximate depth and subdural electrode placement for EEG recordings. Subdural electrode strips are placed over the left orbitofrontal (LOF), right orbitofrontal (ROF), left subtemporal (LST), and right subtemporal (RST) cortex. Depth electrodes are placed in the left temporal depth (LTD) and right temporal depth (RTD) to record hippocampal activity

In some types of epilepsy (e.g., focal or partial epilepsy), there is a localized structural change in neuronal circuitry within the cerebrum which produces organized quasi-rhythmic discharges that spread from the region of origin (epileptogenic zone) to activate other areas of the cerebral hemisphere. The development of the epileptic state can be considered as a change in the network circuitry of neurons in the brain that produces changes in voltage potential, which can be captured by an electroencephalogram (EEG). These changes are reflected by wriggling lines along the time axis in a typical EEG recording. A typical electrode montage for EEG recordings in our study is shown in Fig. 1. The 10-second EEG profiles during the normal and pre-seizure periods of patient 1 are illustrated in Fig. 2(A) and (B). The EEG onset of a typical epileptic seizure is illustrated in Fig. 2(C), and Fig. 2(D) shows the post-seizure state of a typical epileptic seizure.

During the past decade, seizure predictability has been demonstrated through recent studies of the EEG dynamics. Those studies include discoveries previously reported by our group (Chaovalitwongse et al., 2003; Chaovalitwongse, Pardalos, and Prokopyev, 2004; Iasemidis et al., 2003; Pardalos et al., 2003), confirmed by several other groups (Elger and Lehnertz, 1998; Lehnertz and Elger, 1998; Quyen et al., 1999; Litt et al., 2001; Martinerie, Adam and Quyen, 1998), which indicate that it may be possible to detect state transitions to epileptic seizures based on analysis of the brain's electrical activity through EEG signals. Those studies were motivated by mathematical models used to analyze multi-dimensional complex systems (e.g., the neuronal network of the brain) based on chaos theory and optimization techniques. The results of those studies demonstrated that a seizure essentially reflects a transition driven by progressive changes of hidden dynamical patterns in the EEG. Such transitions have been shown to be detectable through quantitative analysis of the brain dynamics (Chaovalitwongse et al., 2003; Chaovalitwongse, Pardalos, and Prokopyev, 2004; Pardalos et al., 2003, 2004). Nonetheless, research in seizure prediction is still far from complete in spite of promising signs of the seizure's predictability. While the previous studies demonstrate that pre-seizure transitions are detectable, the existence of pre-seizure transitions remains to be further investigated with respect to its specificity and accuracy, that is, whether it reflects only epileptic activity or also occurs with other brain activity. In addition, the fundamental question of whether


Fig. 2 Twenty-second EEG recordings of (A) Normal Activity (B) Pre-Seizure Activity (C) Seizure Onset Activity (D) Post-Seizure Activity from patient 1 obtained from 32 electrodes. Each horizontal trace represents the voltage recorded from electrode sites listed in the left column (see Fig. 1 for anatomical location of electrodes)

the brain's normal and pre-seizure epileptic activities are distinctive or differentiable, remains unanswered. In order to verify that seizures are predictable, one would have to demonstrate substantial evidence that the brain's normal activity differs from the brain's pre-seizure epileptic activity. If they differ, the next question is "Are different brain states classifiable?". In this research, in order to answer these crucial questions, we employ quantitative measures of the brain dynamics, including the Short-Term Maximum Lyapunov Exponent, the angular frequency, and the entropy, to probe the hidden mechanisms of the state transition toward seizures. The study of the brain dynamics is motivated by proof-of-concept results from chaos theory; that is, understanding the brain dynamics can provide insights into the different states of brain activity reflected in the pathological dynamical interactions of the brain network (Iasemidis and Sackellares, 1991; Iasemidis, 1991). In particular, these three measures were previously shown to capture the dynamical mechanisms of the brain network (Chaovalitwongse et al., 2003). Based on this quantification of the brain dynamics, we herein apply optimization and data mining techniques to the study of the classifiability of the brain's activities. These techniques are proposed for the extraction of classifiable features of the brain's different physiological states from spontaneous EEG. In other words, the classifiable features are the characteristics of the brain dynamics that reflect the signature of the pre-seizure epileptic activity during the brain's abnormal episodes. We use these features in the study of the classification of the brain's states (normal, pre-seizure, and


post-seizure states). A statistical cross validation is implemented to estimate the accuracy of the brain state classification. In addition, support vector machines are also implemented to classify the brain’s activities. The organization of the succeeding sections of this paper is as follows. The background including directions of epilepsy research, research in seizure prediction, and chaos in epilepsy will be discussed in Section 2. In Section 3, the quantification of the brain dynamics, experimental design, the data characteristics, the statistical cross-validation, classification schemes, and support vector machines will be described. The results on the statistical evaluation and the performance characteristics of the proposed classification method will be addressed in Section 4. The concluding remarks will be discussed in Section 5.

2. Background

2.1. Directions of epilepsy research

The diagnosis and treatment of epilepsy are complicated by the disabling fact that seizures occur spontaneously and unpredictably. The most common treatment for epilepsy is pharmacological; nevertheless, despite advances in neurology and neuroscience, approximately 25 to 30% of patients receiving treatment with anti-epileptic drugs (AEDs) continue to have seizures and inadequate seizure control. Epilepsy surgery is another option. However, surgical treatment is useful only when the epileptogenic zone, the area initiating the habitual seizures, can be identified through an extensive pre-surgical evaluation and enough epileptogenic brain tissue can be removed to prevent (or at least drastically reduce) the occurrence of seizures without causing significant functional and neuropsychological deficits. However, at least 50% of pre-surgical candidates eventually do not undergo resective surgery because a single epileptogenic zone cannot be identified or because it is located in functional brain tissue. Moreover, only 60 to 85% of epilepsy surgery cases result in seizure-free patients. Recently, the vagus nerve stimulator (Neurocybernetic Prosthesis) has become available as an alternative epilepsy treatment that reduces seizure frequency; however, a majority of patients still suffer from side effects and cannot benefit from this treatment. In addition, only 60% of surgical cases are successful, the mean length of hospital stay for patients with intractable epilepsy admitted for invasive EEG monitoring as pre-surgical candidates ranged from 4.7 to 5.8 days, and the total aggregate costs exceeded $200 million each year (Begley et al., 2000). Cost per patient ranged from $4,272 for persons with remission after initial diagnosis and treatment to $138,602 for persons with intractable and frequent seizures (Begley et al., 1994). Based on 1995 estimates, epilepsy imposes an annual economic burden of $12.5 billion in the U.S. in associated health care costs and losses in employment, wages, and productivity (Begley et al., 2000). Uncontrolled epilepsy poses a significant burden to society due to poor quality of life and the associated healthcare cost. Since the currently existing treatments do not benefit a substantial fraction of patients with epilepsy, there is an urgent need for new therapeutic approaches. Seizure prediction/warning offers an alternative route to effective and safe treatment for people with epilepsy, avoiding both the side effects of drugs and the resection of brain tissue. If seizures could be predicted, it would lead to the development of completely novel diagnostic and therapeutic advances in controlling epileptic seizures. The most realizable application of seizure prediction is its use in therapeutic epilepsy devices that either warn of an impending seizure or trigger an intervention to prevent seizures before they begin.


2.2. Research in seizure prediction

There is growing evidence that human epileptic seizures are preceded by physiological changes that are reflected in the dynamical characteristics of the EEG signals. Our group reported a pre-seizure convergence of STLmax values (calculated from intracranial or scalp electrode EEG recordings) occurring tens of minutes prior to epileptic seizures (Iasemidis et al., 2001). Subsequently, Elger and Lehnertz (1998) reported reductions in the effective correlation dimension D_2^{eff} (a measure of the complexity of the EEG signals) that were more prominent in pre-seizure EEG samples than at times more distant from a seizure. They estimated that a detectable change in dynamics could be observed at least 2 minutes before a seizure in most cases (Elger and Lehnertz, 1998). Martinerie, Adam and Quyen (1998) also reported significant differences between dimension measures obtained in pre-seizure versus normal EEG samples. They found an abrupt decrease in dimension during the pre-seizure transition. This study also employed relatively brief (40-minute) samples of pre-seizure and normal data. More recently, this group reported changes in brain dynamics obtained from scalp electrode recordings of the EEG. By comparing pre-seizure EEG samples to a reference sample selected from normal data, they found evidence for dynamical changes that anticipated temporal lobe seizures by periods of up to 15 minutes (Quyen et al., 1999). Recently, Litt and coworkers (Litt et al., 2001) reported sustained bursts of energy in some EEG channels visually selected by one of the investigators.

2.3. Chaos in epilepsy

A chaotic system is one whose normal operation exhibits chaotic behavior, which is characterized by sensitivity to initial conditions. The brain is an example of a multi-dimensional system that exhibits chaotic behavior during normal operation. Epilepsy is one of the conditions that can cause the brain to experience abnormal episodes due to chaotic brain malfunctions. A variety of experiments have shown that the EEG time series is driven by a deterministic dynamical system with a low-dimensional chaotic attractor, which is defined as the phase space point or set of points representing the various possible steady-state conditions of a system: an equilibrium state or group of states to which a dynamical system converges. Thus, chaos theory has provided new theoretical and conceptual tools that allow us to capture, understand, and link together the complex behaviors of simple systems. Several quantitative system approaches based on chaos theory have been successfully used to study epilepsy because the aperiodic and unstable behavior of the epileptic brain is well suited to such techniques, which allow precise tracking of its temporal evolution. The transition from the brain's normal activity to pre-seizure epileptic activity has been postulated to result from periodic increases in the spatial and temporal nonlinear relations of abnormally discharging neurons. In particular, the brain's activities may result from complex systems that follow dynamical transitions and whose statistical properties depend on both time and space. This motivated us to resort to quantitative analyses of brain dynamics based on mathematical models derived from chaos theory, which were previously shown capable of capturing the dynamical mechanisms of brain networks (Iasemidis et al., 2001).
Through this chaos-based analysis, we have discovered dynamical changes of epileptic activity that involve a gradual transition from a state of spatiotemporal chaos to spatial order and temporal chaos. Such a transition, which precedes seizures for periods on the order of minutes to hours, is detectable in the EEG by the convergence in value of the measures of chaos (i.e., Lyapunov exponent, angular frequency, and entropy) among critical electrode sites on the neocortex


and hippocampus. We postulated that this convergence in value of the measures of chaos reflects an increase in the exchange of electrophysiological information as regions become dynamically synchronized during the transition to seizure (the pre-seizure transition). Previous studies by our group have shown that measures of the spatiotemporal dynamical properties of the EEG exhibit patterns that correspond to specific clinical states. One of the remarkable findings, which is the motivation of this research, is that the normal, seizure, and immediate post-seizure states differ with respect to the spatiotemporal dynamical properties of intracranial EEG recordings (Chaovalitwongse et al., 2003; Iasemidis et al., 2003; Pardalos et al., 2003).

3. Methods and experimental design

3.1. Quantification of the brain dynamics

Quantification of the brain dynamics is well suited to the investigation of a nonstationary system such as the brain because it is capable of automatically identifying and appropriately weighting the transients present in the data. To quantify the brain dynamics, we divide the EEG signals into sequential 10.24-second epochs (non-overlapping windows) to properly account for possible nonstationarities in the epileptic EEG. For each epoch of each EEG channel, we estimate measures of chaos that quantify the chaoticity of the attractor: the Lyapunov exponents, the angular frequency, and the entropy. A chaotic system (e.g., the brain) is a system in which orbits originating from similar initial conditions, or nearby points in the phase space, diverge exponentially during the expansion process. The rate of divergence is an important aspect of the dynamical system and is reflected in the values of the Lyapunov exponents and the angular frequency. The Lyapunov exponents and the angular frequency measure, respectively, the average uncertainty along the local eigenvectors and the phase differences of an attractor in the phase space. The entropy measures the uncertainty or information about the future state of the system, given information about its previous states in the phase space. Next, we give a short overview of the mathematical models used to estimate the maximum Lyapunov exponent, the angular frequency, and the entropy from EEG signals.

In the study of the brain dynamics, the initial step in analyzing the dynamical properties of an EEG signal is to embed it in a higher-dimensional space of dimension p, which enables us to capture the behavior in time of the p variables that are primarily responsible for the dynamics of the EEG. We construct p-dimensional vectors X(t) whose components consist of values of the recorded EEG signal x(t) at p points in time separated by a time delay. The embedding phase space is constructed from a data segment x(t) of duration T with the method of delays. The vectors X_i in the phase space are constructed as

X_i = \bigl( x(t_i),\, x(t_i + \tau),\, \ldots,\, x(t_i + (p-1)\tau) \bigr)    (1)

where τ is the selected time lag between the components of each vector in the phase space, p is the selected dimension of the embedding phase space, and t_i ∈ [1, T − (p − 1)τ]. The vectors X_i in the phase space are illustrated in Fig. 3.
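
As an illustration, the following MATLAB sketch constructs the delay-coordinate embedding of Eq. (1) for a single EEG epoch. The function name and interface are ours for illustration and are not taken from the original implementation.

```matlab
% Delay-coordinate embedding of one EEG epoch (method of delays, Eq. (1)).
function X = delay_embed(x, p, tau)
% x   : vector of EEG samples for one epoch (e.g., 2048 points at 200 Hz)
% p   : embedding dimension (the paper uses p = 7)
% tau : time lag in samples (the paper uses tau = 4, i.e., 20 ms at 200 Hz)
    x = x(:);                              % work with a column vector
    N = numel(x) - (p - 1) * tau;          % number of embedded vectors
    X = zeros(N, p);
    for k = 1:p
        idx = (1:N).' + (k - 1) * tau;     % sample indices of the k-th delayed coordinate
        X(:, k) = x(idx);
    end
end
```

With the parameters used in this study (p = 7, τ = 4 samples at 200 Hz), each 2048-point epoch yields 2048 − (7 − 1)·4 = 2024 embedded vectors.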


Fig. 3 Diagram illustrating the embedding phase space used in the estimation of the measures of chaos

3.1.1. Estimation of the maximum Lyapunov exponent

The method for estimating the Short-Term Maximum Lyapunov Exponent (STLmax) for nonstationary data (e.g., EEG time series) has been explained previously in Iasemidis (1991) and Wolf et al. (1985). In this section, we give only a short description and the basic notation of the mathematical models used to estimate STLmax. First, let us define the following notation.

- N_a is the number of local L_max's (local Lyapunov exponents) that will be estimated within a data segment of duration T. Therefore, if Dt is the sampling period of the time-domain data, T = (N − 1)Dt = N_a Δt + (p − 1)τ.
- X(t_i) is a point of the fiducial trajectory φ_t(X(t_0)) with t = t_i, X(t_0) = (x(t_0), ..., x(t_0 + (p − 1)τ)), and X(t_j) is a properly chosen vector adjacent to X(t_i) in the phase space.
- δX_{i,j}(0) = X(t_i) − X(t_j) is the displacement vector at t_i, that is, a perturbation of the fiducial orbit at t_i, and δX_{i,j}(Δt) = X(t_i + Δt) − X(t_j + Δt) is the evolution of this perturbation after time Δt.
- t_i = t_0 + (i − 1)Δt and t_j = t_0 + (j − 1)Δt, where i ∈ [1, N_a] and j ∈ [1, N] with j ≠ i.
- Δt is the evolution time for δX_{i,j}, that is, the time one allows δX_{i,j} to evolve in the phase space. If the evolution time Δt is given in seconds, then L is in bits per second.
- t_0 is the initial time point of the fiducial trajectory and coincides with the time point of the first datum in the data segment under analysis. In the estimation of L, for a complete scan of the attractor, t_0 should move within [0, Δt].

Let L be an estimate of the short-term maximum Lyapunov exponent, defined as the average of the local Lyapunov exponents in the state space. L can be calculated by the following equation:

L = \frac{1}{N_a \Delta t} \sum_{i=1}^{N_a} \log_2 \frac{\lvert \delta X_{i,j}(\Delta t) \rvert}{\lvert \delta X_{i,j}(0) \rvert}    (2)
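
A simplified sketch of the estimator in Eq. (2) is given below. It follows the Wolf-style average over evolved neighbor pairs but omits the refinements of the full STLmax algorithm for nonstationary data (e.g., constraints on the angular separation of replacement vectors); the function name and the Theiler-window parameter are our own illustrative choices, not the authors' implementation.

```matlab
% Simplified short-term maximum Lyapunov exponent estimate (Eq. (2)).
function L = stl_max(X, dt_evolve, fs, theiler)
% X         : N-by-p matrix of embedded vectors (rows = phase-space points)
% dt_evolve : evolution time Delta t, in samples
% fs        : sampling frequency in Hz (so that L is returned in bits/second)
% theiler   : minimum temporal separation (in points) between a reference point
%             and its chosen neighbor, to avoid trivially correlated pairs (assumption)
    N  = size(X, 1);
    Na = N - dt_evolve;                  % points that can still be evolved by dt_evolve
    s  = 0;
    for i = 1:Na
        d = sqrt(sum((X(1:Na, :) - X(i, :)).^2, 2));          % distances to candidates
        d(max(1, i - theiler):min(Na, i + theiler)) = Inf;    % exclude temporally close points
        [d0, j] = min(d);                                      % nearest admissible neighbor
        d1 = norm(X(i + dt_evolve, :) - X(j + dt_evolve, :));  % separation after evolution
        s  = s + log2(d1 / d0);
    end
    L = s / (Na * dt_evolve / fs);       % Eq. (2): average divergence rate in bits/sec
end
```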


Fig. 4 STLmax profile over 2.5 hours estimated from an EEG signal recorded at RTD-2 (the epileptogenic hippocampus of patient 1). A seizure started at the vertical dashed line

An example of a typical STLmax profile estimated over time from an EEG signal recorded at RTD-2 (the epileptogenic hippocampus of patient 1) is given in Fig. 4. In the estimation of STLmax, the state space was reconstructed by dividing the EEG signal into sequential, non-overlapping segments of 2048 points (sampling frequency 200 Hz, hence each segment is 10.24 seconds in duration) with p = 7 and τ = 4 samples. Note that p = 7 and τ = 4 samples (i.e., 20 milliseconds) are crucial parameters of the STLmax estimation procedure for distinguishing between the pre-seizure, seizure, and post-seizure stages, based on results from simulated data of known attractors (Iasemidis, 1991). The values are estimated from a 150-minute EEG sample recorded from the RTD-2 electrode located in the epileptogenic hippocampus. The EEG sample includes a seizure that occurs at minute 124 of the recording. In the pre-seizure state, one can see a trend of STLmax toward lower values over the whole pre-seizure period. This phenomenon can be explained as a transition of the system toward a new state with fewer degrees of freedom long before the actual seizure (Iasemidis and Sackellares, 1991).

3.1.2. Estimation of angular frequency (dynamical phase)

Similar to the estimation of the Lyapunov exponents, the estimation of the angular frequency is motivated by the representation of a state as a vector in the state space. The angular frequency is simply the average uncertainty along the phase differences of an attractor in the phase space. First, let us define the difference in phase between two evolved states X(t_i) and X(t_i + Δt) as ΔΦ_i. Then, denoting by ΔΦ the average of the local phase differences ΔΦ_i between the vectors in the state space, we have

\Delta\Phi = \frac{1}{N_\alpha} \sum_{i=1}^{N_\alpha} \Delta\Phi_i    (3)


where N_α is the total number of phase differences estimated from the evolution of X(t_i) to X(t_i + Δt) in the state space, according to

\Delta\Phi_i = \left\lvert \arccos \frac{X(t_i) \cdot X(t_i + \Delta t)}{\lVert X(t_i) \rVert \, \lVert X(t_i + \Delta t) \rVert} \right\rvert    (4)

Then, the average angular frequency Ω̄ is

\bar{\Omega} = \frac{\Delta\Phi}{\Delta t}    (5)
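
Before turning to the interpretation of Ω̄, here is a minimal MATLAB sketch of Eqs. (3)–(5) for one embedded epoch; the interface and names are illustrative rather than the authors' implementation.

```matlab
% Average angular frequency of an embedded epoch (Eqs. (3)-(5)).
function omega_bar = angular_frequency(X, dt_evolve, fs)
% X         : N-by-p matrix of embedded phase-space vectors
% dt_evolve : evolution time Delta t, in samples
% fs        : sampling frequency in Hz (omega_bar is returned in rad/sec)
    N    = size(X, 1);
    Na   = N - dt_evolve;
    dphi = zeros(Na, 1);
    for i = 1:Na
        a = X(i, :);
        b = X(i + dt_evolve, :);
        c = dot(a, b) / (norm(a) * norm(b));       % cosine of the angle between states
        dphi(i) = abs(acos(max(-1, min(1, c))));   % Eq. (4), clipped for numerical safety
    end
    omega_bar = mean(dphi) / (dt_evolve / fs);     % Eqs. (3) and (5)
end
```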

If Δt is given in seconds, then Ω̄ is given in rad/sec. Thus, while STLmax measures the average local stability of the state of the system, Ω̄ measures how fast a local state of the system changes on average (e.g., dividing Ω̄ by 2π, the rate of change of the state of the system is expressed in sec⁻¹ = Hz).

An example of a typical Ω̄ profile estimated over time from an EEG signal recorded at RTD-2 (the epileptogenic hippocampus of patient 1) is given in Fig. 5. The values are estimated from a 150-minute EEG sample recorded from the RTD-2 electrode located in the epileptogenic hippocampus; this segment of EEG signals is the same as the one used to estimate the STLmax profile in Fig. 4. The pre-seizure, seizure, and post-seizure states correspond to medium, high, and lower values of Ω̄, respectively. The highest Ω̄ values were observed during the seizure period, and higher Ω̄ values were observed during the pre-seizure period than during the post-seizure period. This pattern roughly corresponds to the typical observation of a decrease in the values of the STLmax profile. The explanation is that the higher the


Fig. 5 A typical Ω̄ profile before, during, and after an epileptic seizure, estimated from an EEG signal recorded at electrode RTD-2 (the epileptogenic hippocampus of patient 1). Note that this segment of EEG signals is the same as the one used to estimate the STLmax profile in Fig. 4; the seizure occurred at the vertical dashed line


rate of change of the trajectory in the phase space, the smaller the change in direction (angular frequency) of the trajectory; conversely, when the rate of change is low, the trajectory can change direction through larger angles (more diverse directions).

3.1.3. Entropy and information

One of the key concepts in information theory is that information is conveyed by randomness. Basically, information is defined, in some mathematical sense, as knowledge used to identify the structure of the data. It is not difficult to make the connection between randomness and information: the information is related to the degree of surprise at finding out the answer. In fact, the information can be used to measure the amount of pattern or sequence hidden in the data. Entropy, on the other hand, can be viewed as a self-moment of the probability, in contrast to the ordinary moments. The entropy also measures the degree of surprise one should feel upon learning the result of a measurement: it counts the number of possible states, weighting each by its likelihood. The negative of the entropy is sometimes called information. The entropy can also be considered as the average uncertainty provided by a random variable x. The entropy is an average over quantities that provide a different kind of information than the ordinary moments; in fact, it is a measure of the amount of information required on average to describe the random variable x. The entropy H(p) of a probability density p is

H(p) = -\int p(x) \log p(x) \, dx    (6)

The entropy of a distribution over a discrete domain is

H(p) = -\sum_i p_i \log p_i    (7)

The entropy of EEG data describes the extent to which the distribution is concentrated on small sets. If the entropy is low, then the distribution is concentrated at a few values of x. It may, however, be concentrated on several sets far from each other; thus, the variance can be large while the entropy is small. The average entropy depends explicitly on p. The entropy can be written as

H(p) = -\langle \log p(x) \rangle    (8)

where ⟨·⟩ denotes the average. As mentioned, the information refers to the negative of the entropy, written as

I(p) = -H    (9)

For our purposes the distinction between the two is semantic. Entropy is most often used to refer to measures, whereas information typically refers to probabilities. Specifically, in the study of the brain dynamics we use the entropy to measure the information hidden in the EEG data. In Fig. 6, an example of the entropy profile from EEG recordings of electrode RTD-2 (the epileptogenic hippocampus of patient 1) over a 2-hour interval including one seizure is illustrated.
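
The paper does not specify the exact entropy estimator applied to the EEG. As one plausible reading of Eq. (7), the sketch below computes a histogram-based Shannon entropy of the amplitude distribution within an epoch; the bin count and the use of the raw amplitude distribution are assumptions on our part.

```matlab
% Histogram-based Shannon entropy of one EEG epoch (one plausible estimator for Eq. (7)).
function H = epoch_entropy(x, nbins)
% x     : vector of EEG samples for one epoch
% nbins : number of amplitude bins (e.g., 64); the bin count is an assumption
    counts = histcounts(x, nbins);       % empirical amplitude distribution
    p = counts / sum(counts);
    p = p(p > 0);                        % drop empty bins (0*log 0 := 0)
    H = -sum(p .* log2(p));              % Shannon entropy in bits
end
```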



Fig. 6 Plot of the entropy over time derived from an EEG signal recorded at RTD-2 (the epileptogenic hippocampus of patient 1). Note that this segment of EEG signals is the same as the one used to estimate the STLmax profile in Fig. 4 and the phase profile in Fig. 5; the seizure occurred at the vertical dashed line

3.2. Experimental design

The present study was undertaken to determine whether or not the brain's normal, pre-seizure, and post-seizure states differ. If they differ, we will try to demonstrate the classifiability of the different brain states through quantitative analysis of the brain dynamics (STLmax, phase, and entropy). To do so, we develop a novel statistical classification technique based on optimization and data mining techniques. If successful, the results of this study can be extended to the development of a new therapeutic approach that provides an early seizure warning as well as an automated brain-state classifier. In this framework, we first calculate the measures of chaos from the EEG signals using the methods described in the previous section. Each measure was calculated continuously for each non-overlapping 10.24-second segment of EEG data. Figure 7 shows an example of a 3-dimensional plot of the three measures in the brain's different physiological states. A gradual transition from one physiological state to another can be identified, which suggests the possibility of classifying the brain's physiological state. We will then test our hypothesis that the measures of chaos can be used as features to discriminate the different states of the brain dynamics. Our hypothesis can be stated as follows: "The brain dynamics within the same brain state are more similar to each other than those from different brain states." In other words, the characteristics of the brain dynamics during the brain's normal activity should be more similar to each other than to those of the brain's pre-seizure epileptic activity, and vice versa. We also postulate that the brain's functional units should execute different activities during the three states of a patient. To test the difference between different brain states, we propose a statistical T-test as a statistical distance measure between two time series. Applying the T-test, a leave-one-out statistical cross-validation will be implemented to validate our hypothesis. In order to show the practical applicability of this framework, we will also apply support vector machines to classify the brain's states.



Fig. 7 Example of a 3-dimensional plot of entropy, angular frequency, and STLmax in different physiological states (normal, pre-seizure, seizure, and post-seizure) for an epileptic patient

3.3. Dataset

We test the aforementioned hypothesis on human subjects: long-term (3 to 14 days in duration) multi-channel intracranial EEG recordings were obtained from bilaterally, surgically implanted microelectrodes in the hippocampus, temporal, and frontal lobe cortices of 3 epileptic patients with medically intractable temporal lobe epilepsy. The recordings were obtained as part of a pre-surgical clinical evaluation. Each record included a total of 28 to 32 intracranial electrodes (8 subdural and 6 hippocampal depth electrodes for each cerebral hemisphere). These EEG recordings were reviewed by two independent electroencephalographers to determine the number and type of recorded seizures, seizure onset and end times, and seizure onset zones.

From our EEG recordings, for each individual patient, we divide the EEG data into 3 groups: (1) normal state, (2) pre-seizure state, and (3) post-seizure state. The normal-state EEG data are chosen such that the recordings are far away from seizures (at least 8 hours) and the patient's medical conditions are at a steady state. The pre-seizure epileptic-state EEG data are chosen so that the recordings fall within the 30-minute interval before seizures. In addition, we are also interested in the brain's recovery period (post-seizure); therefore, the post-seizure-state EEG data are chosen such that the recordings fall in the 30-minute interval after seizures. The characteristics of the patients and the test recordings are outlined in Fig. 8. For each individual patient, we randomly select 200 samples (epochs) from the group of normal-state EEGs. Each epoch is 5 minutes in length. Note that in this analysis we only consider clinical seizures and un-clustered seizures. From the data set, we select 22, 7, and 15 seizures in patients 1, 2, and 3, respectively. For every seizure, we randomly select 3 epochs of pre-seizure EEG data and 3 epochs of post-seizure EEG data. In other words, 66, 21, and 45 epochs of EEG data are selected from the group of pre-seizure EEG data in patients 1, 2, and 3, respectively. Similarly, 66, 21, and 45 epochs are randomly selected from the group of post-seizure EEG in patients 1, 2, and 3, respectively.

Patient ID | Gender | Onset region | Age (years) | Seizure types | Duration of recordings (days) | Number of seizures | Inter-seizure interval range (hours) | Average inter-seizure interval (hours)
1          | F      | RH           | 41          | CP            | 9.06                          | 24                 | 0.30 ~ 14.49                         | 3.62
2          | F      | RH           | 45          | CP, SC        | 3.63                          | 9                  | 0.52 ~ 47.93                         | 8.69
3          | M      | RH           | 29          | CP, SC        | 6.07                          | 19                 | 0.32 ~ 70.70                         | 6.61
Total      |        |              |             |               | 18.76                         | 52                 | 0.30 ~ 70.70                         | 5.59

Fig. 8 Data characteristics. Onset region: LH, Left Hippocampal; RH, Right Hippocampal; RF, Right Orbitofrontal. Seizure types: CP, Complex Partial; SC, Subclinical

3.4. Classification problems

The general idea of the classification problems that will be implemented in the next section is as follows. Suppose that we have a dataset of N elements, and each of these elements has a finite number of certain attributes. If we denote the number of attributes as n, then every element of the given dataset can be represented as a pair (x_i, y_i), i = 1, ..., N, where x_i ∈ R^n is an n-dimensional vector, x_i = (x_i1, x_i2, ..., x_in)^T, and y_i is the class attribute. The value of y defines to which class a given element belongs, and this value is known a priori for each element of the initial dataset. It should also be mentioned that in this case y_i takes integer values, and the number of these values (i.e., the number of classes) is pre-defined. Now suppose that a new element with known attribute vector x, but unknown class attribute y, is added to the dataset. The essence of classification problems is to predict the unknown value of y. This is accomplished by identifying a criterion for placing the element into a certain class based on the information about the known attribute x of this element. An initial question that naturally arises is how to find a criterion that will correctly classify data elements with a sufficiently high confidence level. The intuitive answer is that the construction of such a criterion should be based on the available information about the elements whose attributes are all known. However, another fundamental question is how to create a formal model that would take the available dataset as input and perform the classification procedure. Several approaches aimed at utilizing this idea have been developed. The main idea of these approaches is to adjust (or "train") the classification model using the existing information about the elements in the available dataset (which is usually referred to as the "training dataset") and then apply this model to classifying new elements. However, in this study, since we aim to verify the classifiability of the brain's states, we set up an intuitive rule (model) that does not need to be trained. The details of this framework are described in the next section.

3.5. Statistical cross validation

In this research, we propose a leave-one-out statistical cross-validation to prove our main hypothesis. In general, cross-validation can be seen as a way of applying partial information about the applicability of alternative classification strategies. In other words, cross-validation


is a method for estimating generalization error based on "resampling". The resulting estimates of generalization error are often used for choosing among various decision models (rules). Generally, cross-validation refers to k-fold cross-validation. In k-fold cross-validation, the data are divided into k subsets of (approximately) equal size. The decision models are trained k times, each time leaving one of the subsets out of training and using only the omitted subset to compute the error criterion of interest. If k equals the sample size, this is called "leave-one-out" cross-validation.

In our study, we already have the decision rule for classifying the brain's states; therefore, we do not have any decision models to train. In fact, we can only apply cross-validation to estimate the generalization error of our decision rule and validate our hypothesis. Since we consider the estimate of the statistical distance between EEG epochs as our decision rule, we call this technique a "statistical cross-validation". Our decision rule is described as follows. Given an unknown-state epoch of EEG signals "A", we calculate the average statistical distance (T-index) between "A" and the groups of normal, pre-seizure, and post-seizure EEG data. Per electrode, we thus obtain 3 T-index values, which are the average statistical distances between the epoch "A" and the groups of normal, pre-seizure, and post-seizure data, respectively (see Fig. 9). The EEG epoch "A" will be classified into the group of EEG data (normal, pre-seizure, or post-seizure) that yields the minimum T-index value based on the 28–32 electrodes. Since our classifier has 28–32 decision inputs, we propose different classification schemes based on different electrodes and combinations of dynamical measures. This study will indicate the dynamical measures that are most useful in classifying the different states of the brain. In other words, the proposed framework can be extended to the study of feature selection for the brain dynamics. Next, we discuss the different classification schemes employed in this study.

In this section, we propose a similarity measure used to estimate the difference of the EEG data between different groups of brain states. We employ the T-index as a measure of statistical distance between two epochs of measures of chaos. We use the notation STLmax for the measures of chaos to simplify the mathematical models; note that, in practice, we use these equations to calculate the statistical distance not only between two STLmax samples but also between phase and entropy samples. The T-index at time


Fig. 9 Cross validation for classification of an unknown-state EEG epoch "A" by calculating the statistical distances between "A" and normal, "A" and pre-seizure, and "A" and post-seizure


t between electrode sites i and j is defined as

T\text{-index}_{i,j}(t) = \sqrt{N_t} \, \frac{\lvert E\{ STL_{\max,i} - STL_{\max,j} \} \rvert}{\sigma_{i,j}(t)}    (10)

where E{·} is the sample average of the differences STL_{max,i} − STL_{max,j} estimated over a moving window w_t(λ) defined as

w_t(\lambda) = \begin{cases} 1, & \text{if } \lambda \in [t - N_t - 1,\, t] \\ 0, & \text{if } \lambda \notin [t - N_t - 1,\, t] \end{cases}

where N_t is the length of the moving window. Then, σ_{i,j}(t) is the sample standard deviation of the STLmax differences between electrode sites i and j within the moving window w_t(λ). The T-index thus defined follows a t-distribution with N_t − 1 degrees of freedom.

3.6. Classification schemes

For multi-dimensional systems such as the brain, we have more than one attribute on which to base the classification decision. Attributes can be electrodes and dynamical measures. As mentioned in the previous section, we apply three measures of chaos and 28–32 electrodes, which are considered to be attributes in the classification. In this framework, we treat the different classification schemes as combinations of measures of chaos. For instance, one of the classification schemes can be the combination of L_max and dynamical phase; another can be the combination of L_max and entropy. In addition, we still have 26–30 attributes used in decision making. We propose two schemes to incorporate these attributes: an averaging scheme and a voting scheme. The averaging scheme is one of the most intuitive schemes for incorporating different attributes (considered as outputs). The voting scheme is a common technique used for constructing reliable decisions in critical systems. For each electrode, an output is calculated separately and is sent to the voter; each version for each electrode is usually developed independently of the other versions. Based on the 30 electrodes (outputs), the voter then estimates the correct output. The design of the voter is essential to the reliable functioning of the system. Majority voting selects the action with the maximum number of votes.
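
The averaging and voting schemes can be sketched as follows, assuming an E-by-3 matrix of T-indices (one row per electrode, one column per reference group). The function and variable names are illustrative, not the authors' code.

```matlab
% Averaging and majority-voting schemes over electrodes.
function label = classify_epoch(T, scheme)
% T      : E-by-3 matrix of T-indices between the unknown epoch and the normal,
%          pre-seizure, and post-seizure reference groups, one row per electrode
% scheme : 'average' or 'voting'
% label  : 1 = normal, 2 = pre-seizure, 3 = post-seizure (column index of T)
    switch scheme
        case 'average'
            [~, label] = min(mean(T, 1));    % smallest electrode-averaged distance
        case 'voting'
            [~, votes] = min(T, [], 2);      % each electrode votes for its closest group
            label = mode(votes);             % majority voting
        otherwise
            error('Unknown scheme');
    end
end
```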


3.7. Support vector machines

Support vector machines are among the classification techniques most widely used in practice. The essence of support vector machines is to construct separating surfaces that minimize an upper bound on the out-of-sample error. In the case of one linear surface (plane) separating the elements from two classes, this approach chooses the plane that maximizes the sum of the distances between the plane and the closest elements from each class, i.e., the "gap" between the elements from different classes. The mathematical formulation of support vector machines can be described as follows. Let all the data elements be represented as n-dimensional vectors (or points in the n-dimensional space); then these elements can be separated geometrically by constructing surfaces that serve as the "borders" between different groups of points. One common approach is to use linear surfaces (planes) for this purpose; however, different types of nonlinear (e.g., quadratic) separating surfaces can be considered in certain applications. It is also worth noting that, in reality, it is usually not possible to find a surface that would "perfectly" separate the points according to the value of some attribute, i.e., points with different values of the given attribute may not necessarily lie on different sides of the surface; however, in general, the number of these errors should be small. The classification problem of support vector machines can be represented as the problem of finding the geometrical parameters of the separating surface(s). As will be described below, these parameters can be found by solving the optimization problem of minimizing the misclassification error for the elements in the training dataset (the so-called "in-sample error"). After determining these parameters, every new data element will be automatically assigned to a certain class according to its geometrical location in the element space. The procedure of using the existing dataset to classify new elements is often called "training the classifier" (and the corresponding dataset is referred to as the "training dataset"). It means that the parameters of the separating surfaces are "tuned" (or "trained") to fit the attributes of the existing elements so as to minimize the number of errors in their classification. However, a crucial issue in this procedure is not to "overtrain" the model, so that it retains enough flexibility to classify new elements, which is the primary purpose of constructing the classifier. An example of hyperplanes separating the brain's pre-seizure, normal, and post-seizure states is illustrated in Fig. 10.

In this study, we consider one of the first practical applications of mathematical programming to the brain's state classification. The essence of the support vector machines framework in this application can be stated as follows. The dataset consists of 66 × 2, 21 × 2, and 45 × 2 (N·m)-dimensional feature vectors corresponding to patients 1, 2, and 3, respectively. For each patient, we only take samples of normal and pre-seizure EEG data because a support vector machine is a binary (2-class) classifier and, in practice, we are only interested in differentiating normal and pre-seizure data. N is the number of electrodes for the individual patient, and m = 30 is the length of each data sample, which corresponds to about 5 minutes. In this experiment, we also apply leave-one-out cross-validation as in the previous experiment. The classifier was developed based on linear programming (LP) techniques. The procedure for constructing this classifier is relatively simple. The vectors corresponding to normal and pre-seizure states are stored in two matrices,

Fig. 10 Example of hyperplanes separating different brain states


A and B, respectively. The goal of the constructed model is to find a plane that separates all the vectors (points in the (N·m)-dimensional space) in A from the vectors in B. If a plane is defined by the standard equation x^T ω = γ, where ω = (ω_1, ..., ω_n)^T is an n-dimensional vector of real numbers and γ is a scalar, then this plane will separate all the elements of A and B if the following conditions are satisfied:

A\omega \ge e\gamma + e, \qquad B\omega \le e\gamma - e    (11)

Here e = (1, 1, ..., 1)^T is the vector of ones of appropriate dimension (m for the matrix A and k for the matrix B). However, as pointed out above, in practice it is usually not possible to separate two sets of elements perfectly by a plane. For this reason, one should try to minimize the average measure of misclassification, i.e., when the constraints (11) are violated, the average sum of violations should be as small as possible. The violations of these constraints are modeled by introducing nonnegative variables u and v as follows:

A\omega + u \ge e\gamma + e, \qquad B\omega - v \le e\gamma - e    (12)

We now have an optimization model that minimizes the total average measure of misclassification errors:

\min_{\omega,\gamma,u,v} \; \frac{1}{m} \sum_{i=1}^{m} u_i + \frac{1}{k} \sum_{j=1}^{k} v_j    (13)

subject to

A\omega + u \ge e\gamma + e    (14)

B\omega - v \le e\gamma - e    (15)

u \ge 0, \quad v \ge 0    (16)

As one can see, this is a linear programming problem, and the decision variables are the geometrical parameters of the separating plane, ω and γ, as well as the variables representing the misclassification errors, u and v. Although in many cases this type of problem may involve high-dimensional data, it can be solved efficiently by available LP solvers, for instance Matlab, Xpress-MP, or CPLEX.
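
As a concrete illustration, the LP (13)–(16) can be solved with MATLAB's linprog (Optimization Toolbox). The sketch below assumes A holds the normal feature vectors and B the pre-seizure feature vectors, one per row; it is a minimal reading of the formulation, not the authors' original code.

```matlab
% Linear-programming separating plane, Eqs. (13)-(16), via linprog.
function [w, gamma] = lp_svm(A, B)
% A : m-by-n matrix of normal feature vectors (one per row)
% B : k-by-n matrix of pre-seizure feature vectors (one per row)
    [m, n] = size(A);
    k = size(B, 1);
    % Decision vector z = [w (n); gamma (1); u (m); v (k)]
    f = [zeros(n + 1, 1); ones(m, 1) / m; ones(k, 1) / k];    % objective (13)
    % (14): A*w + u >= e*gamma + e   ->  -A*w + e*gamma - u <= -e
    % (15): B*w - v <= e*gamma - e   ->   B*w - e*gamma - v <= -e
    Aineq = [ -A,  ones(m, 1), -eye(m),      zeros(m, k);
               B, -ones(k, 1),  zeros(k, m), -eye(k)      ];
    bineq = -ones(m + k, 1);
    lb = [ -inf(n + 1, 1); zeros(m + k, 1) ];                 % (16): u, v >= 0
    z = linprog(f, Aineq, bineq, [], [], lb, []);
    w = z(1:n);
    gamma = z(n + 1);
    % A new sample x (column vector) falls on the "normal" side when x'*w >= gamma.
end
```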

4. Results

To evaluate the performance characteristics of our statistical cross-validation technique, we calculate the sensitivity and false positive rate for each of the proposed classification schemes. In order to select the best classification scheme, we propose receiver operating characteristic (ROC) analysis, which is derived from detection theory. Basically, the sensitivity and the specificity of each classification scheme for each combination of dynamical measures


are calculated. Based on detection theory, the best scheme is selected as the one closest to the ideal classifier (best performance). The performance evaluation of the classification schemes and the results for each classification scheme are discussed in more detail in the subsequent subsections.

4.1. Performance evaluation of classification schemes

To evaluate the classifier, we categorize the classification into two classes, positive and negative. Then we consider four subsets of classification results:
1. True positives (TP): true positive answers of a classifier, denoting correct classifications of positive cases;
2. True negatives (TN): true negative answers, denoting correct classifications of negative cases;
3. False positives (FP): false positive answers, denoting incorrect classifications of negative cases into the positive class;
4. False negatives (FN): false negative answers, denoting incorrect classifications of positive cases into the negative class.

To better explain the concept of classifier evaluation, consider the case of the detection of abnormal data (see Fig. 11). A classification result is considered a true positive if we classify an abnormal sample as abnormal, a true negative if we classify a normal sample as normal, a false positive if we classify a normal sample as abnormal, and a false negative if we classify an abnormal sample as normal. In the medical community, two classification accuracy measures, sensitivity and specificity, are usually employed. Sensitivity measures the fraction of positive cases that are classified as positive; specificity measures the fraction of negative cases classified as negative. They are defined as follows: Sensitivity = TP/(TP + FN), Specificity = TN/(TN + FP). In fact, the sensitivity can be considered as the probability of accurately classifying EEG samples in the positive case, and the specificity as the probability of accurately classifying EEG samples in the negative case. In general, one always wants to increase the sensitivity of classifiers by attempting to increase the number of correct classifications of positive cases (TP); in order to increase the specificity, we should try to decrease the number of incorrect classifications of negative cases into the positive class (FP).

4.2. Best classification scheme

In the algorithm, we have different classification schemes, including averaging and voting schemes for the dynamical measures, which need to be optimized. In order to find the best scheme, we employed ROC analysis, which is used to indicate an appropriate trade-off that

Fig. 11 Evaluation of classification results

                   Predicted abnormal   Predicted normal
Actual abnormal    True positive        False negative
Actual normal      False positive       True negative


[ROC plot for different classification methods: sensitivity versus 1-specificity for the averaging and voting schemes over Lmax (L), Phase (P), Entropy (E), and their two- and three-way combinations.]

Fig. 12 ROC plot for all classification schemes in patient 1

one can achieve between the detection rate (sensitivity, plotted on the Y-axis), which one wants to maximize, and the false alarm rate (1-specificity, plotted on the X-axis), which one wants to minimize. For each individual patient, the best classification scheme can be identified by selecting the scheme whose performance is closest to the ideal point (sensitivity = 1 and 1-specificity = 0); that is, the scheme closest to the top left-hand corner of the ROC plot is selected. An example of an ROC plot for patient 1 is illustrated in Fig. 12, in which the scheme that incorporates the average of L_max and entropy is the best scheme. The performance characteristics of the best scheme for each individual patient are listed in Table 1. Figure 13 illustrates the classification results of the best scheme in patient 1 (average L_max & entropy): the probabilities of correctly predicting pre-seizure, post-seizure, and normal EEGs are 89.39%, 80.30%, and 93.50%, respectively. Figure 14 illustrates the classification results of the best scheme in patient 2 (average L_max & phase): the probabilities of correctly predicting pre-seizure, post-seizure, and normal EEGs are 85.71%, 61.90%, and 78.00%, respectively. Figure 15 illustrates the classification results of the best scheme in patient 3 (average L_max & phase): the probabilities of correctly predicting pre-seizure, post-seizure, and normal EEGs are 84.44%, 73.33%, and 75.00%, respectively. Note that in practice classifying pre-seizure and normal EEGs is more meaningful than classifying post-seizure EEGs, since the post-seizure EEG can be easily observed (visualized) after the seizure onset.
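
For reference, a minimal sketch of the performance measures used above, sensitivity, specificity, and the Euclidean distance to the ideal point on the ROC plot, is given below; names are illustrative.

```matlab
% Sensitivity, specificity, and distance to the ideal ROC point for one scheme.
function [sens, spec, dist] = roc_point(predicted, actual)
% predicted, actual : logical vectors, true = positive (abnormal) class
    TP = sum( predicted &  actual);
    TN = sum(~predicted & ~actual);
    FP = sum( predicted & ~actual);
    FN = sum(~predicted &  actual);
    sens = TP / (TP + FN);
    spec = TN / (TN + FP);
    % Best scheme = smallest distance to the ideal classifier,
    % i.e., sensitivity = 1 and 1 - specificity = 0 on the ROC plot.
    dist = sqrt((1 - sens)^2 + (1 - spec)^2);
end
```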

Table 1 Performance characteristics of the best classification scheme for each individual patient

Patient | Sensitivity | Specificity | Best scheme
1       | 90.06%      | 95.03%      | Average Lmax & Entropy
2       | 77.27%      | 88.64%      | Average Lmax & Phase
3       | 76.21%      | 88.10%      | Average Lmax & Phase


Fig. 13 Patient 1 results: percentage of epochs from each actual state assigned to each classified state by the best scheme (Average Lmax & Entropy)

Classified as  | Actual pre-seizure | Actual post-seizure | Actual normal
Pre-seizure    | 89.39%             | 18.18%              | 0.50%
Post-seizure   | 3.03%              | 80.30%              | 6.00%
Normal         | 7.58%              | 1.52%               | 93.50%

Fig. 14 Patient 2 results: percentage of epochs from each actual state assigned to each classified state by the best scheme (Average Lmax & Phase)

Classified as  | Actual pre-seizure | Actual post-seizure | Actual normal
Pre-seizure    | 85.71%             | 38.10%              | 22.00%
Post-seizure   | 0.00%              | 61.90%              | 0.00%
Normal         | 14.29%             | 0.00%               | 78.00%


Fig. 15 Patient 3 results: percentage of epochs from each actual state assigned to each classified state by the best scheme (Average Lmax & Phase)

Classified as  | Actual pre-seizure | Actual post-seizure | Actual normal
Pre-seizure    | 84.44%             | 20.00%              | 15.50%
Post-seizure   | 13.33%             | 73.33%              | 9.50%
Normal         | 2.22%              | 6.67%               | 75.00%

The results of this study indicate that, on average over the 3 patients, we can correctly classify close to 90% of the pre-seizure EEGs and close to 83% of the normal EEGs. These results confirm our hypothesis that the pre-seizure and normal EEGs are differentiable. The techniques proposed in this study can be extended to the development of an online brain activity monitoring system for detecting the brain's abnormal activity and seizure precursors. From the best classification schemes in the 3 patients, we note that L_max tends to be the most useful attribute for classification.

4.3. Results from support vector machines

We have developed a support vector machine classifier (SVMC) for the brain's state classification. However, the design of the experiment to test the performance characteristics of the SVMC is slightly different from that of the statistical cross-validation classifier. First of all, since a support vector machine is a binary classifier, we can classify only two states of the brain's physiology. In this study, we used the SVMC to classify only the brain's pre-seizure and normal states because these two states are practically more important in the development of an automated brain-state classifier or an automated seizure warning system. In addition, the brain's post-seizure state can be identified right after a seizure. To train the SVMC, it is important to note that, in general, the training of support vector machines works best when the numbers of pre-seizure and normal samples are comparable; otherwise, the classifier will be biased toward the state with the larger number of samples. To evaluate the SVMC fairly, we need to train the classifier with the same number of pre-seizure and normal samples. We applied a Monte Carlo sampling simulation. First, we shuffle (randomly order) the pre-seizure and normal samples individually. Since the set of pre-seizure samples is much smaller than the set of normal samples, the number of pre-seizure samples will be used to determine the

Table 2 Performance characteristics of the support vector machine classifier for each individual patient (sensitivity)

Patient | Pre-seizure state | Normal state | Overall
1       | 81.21%            | 87.46%       | 86.43%
2       | 71.18%            | 76.85%       | 76.76%
3       | 74.13%            | 70.60%       | 71.00%
Average | 75.51%            | 78.30%       | 78.06%

size of the training and testing sets. Then we use the first half of the pre-seizure samples for training and the other half for testing. After that, we randomly select training data of the same size from the normal samples. For each individual patient, we run the simulation 100 times. Table 2 summarizes the classification results of the SVMC for each individual patient. In patient 1, the sensitivities of predicting pre-seizure and normal EEGs are about 81% and 88%, respectively. In patient 2, the sensitivities of predicting pre-seizure and normal EEGs are about 71% and 77%, respectively. In patient 3, the sensitivities of predicting pre-seizure and normal EEGs are about 74% and 71%, respectively. Note that this result is consistent with the prediction result based on the statistical cross-validation classifier; that is, the brain's states of patient 1 tend to be better classifiable than those of patients 2 and 3. These results confirm our hypothesis that the brain's states are classifiable based on quantitative analyses of the EEG. The framework of classifiers proposed in this study can be extended to the development of an automated brain-state classifier or an online brain activity monitoring system.
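
A sketch of one trial of the balanced Monte Carlo split described above is given below. The paper does not fully specify the composition of the test set; here both classes are matched in size for testing as well, and all names are illustrative assumptions.

```matlab
% One trial of a balanced Monte Carlo train/test split for the SVM experiment.
function [trainX, trainY, testX, testY] = balanced_split(P, N)
% P : pre-seizure feature vectors, one per row (the smaller class)
% N : normal feature vectors, one per row
    np = size(P, 1);
    P  = P(randperm(np), :);                 % shuffle both classes
    N  = N(randperm(size(N, 1)), :);
    h  = floor(np / 2);                      % half of the pre-seizure samples
    trainX = [P(1:h, :);       N(1:h, :)];   % equal class sizes for training
    trainY = [ones(h, 1);     -ones(h, 1)];  % +1 = pre-seizure, -1 = normal
    testX  = [P(h+1:np, :);    N(h+1:2*h, :)];
    testY  = [ones(np - h, 1); -ones(h, 1)];
end
```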

5. Concluding remarks

Although evidence for the characteristic pre-seizure state transition and seizure predictability was first reported by our group (Chaovalitwongse et al., 2003; Chaovalitwongse, Pardalos, and Prokopyev, 2004; Iasemidis et al., 2001, 2003; Pardalos et al., 2003) and was confirmed by several other groups (Elger and Lehnertz, 1998; Lehnertz and Elger, 1998; Quyen et al., 1999; Litt et al., 2001; Martinerie, Adam and Quyen, 1998), further studies are required before a practical seizure prediction paradigm becomes feasible. This study addresses the open question of the classifiability of the brain's pre-seizure, normal, and post-seizure states. The results of this study are another proof of concept for the quantification of brain dynamics, which has proved successful in providing insights into, and characterizing, the different states of brain activity reflected in the pathological dynamical interactions of the brain network. In addition, these results confirm our hypothesis that it is possible to differentiate and classify the brain's pre-seizure and normal activities based on optimization, data mining, and dynamical-system approaches applied to multichannel intracranial EEG recordings. Reliable classification is conceivable because, for the vast majority of seizures, the spatiotemporal dynamical features of the pre-seizure state differ sufficiently from those of the normal state; statistical differences in the brain dynamics of different states of the same patient can easily be observed.

The experiments in this research can be carried out in a practical fashion because the statistical cross-validation proposed here, derived from data mining concepts, is fast and efficient, and because the measures of brain dynamics derived from chaos theory can reveal hidden information in the EEG signals. The SVMC is likewise straightforward to develop and implement, and the optimization problems arising in the support vector machine framework can be solved in reasonable time.
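For reference, the core optimization problem behind a standard soft-margin support vector machine is the convex quadratic program below; the exact formulation and kernel used in this study may differ in details.

\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n,

where x_i is the feature vector of the i-th EEG sample, y_i is its pre-seizure/normal label (+1 or -1), and C controls the trade-off between margin width and training errors. Because this is a convex quadratic program, it can be solved efficiently, which is consistent with the short running times reported below.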


All of the programming was done in the MATLAB environment on a Pentium IV desktop computer (2.4 GHz, 1 GB of RAM). Both the statistical cross-validation technique and the SVMC are fast and scalable: the running time of the statistical cross-validation technique is less than 2 minutes on average, and the running time of the SVMC is less than 10 minutes on average.

In the future, more cases (patients and seizures) will be studied to validate these observations across patients, along with the development of a multi-class classifier based on the support vector machine framework. A feature selection study will also become possible; it will help us select the electrodes that show the most prominent changes, which might lead to a solution of the epileptogenic zone localization problem. This pre-clinical research will form a bridge between seizure prediction research and the implementation of seizure prediction/warning devices, a revolutionary approach to handling epileptic seizures, much like the cardiac pacemaker. It will also lead to clinical investigations of medical diagnosis, drug effects, and therapeutic intervention during invasive EEG monitoring of epileptic patients, and it will make feasible future research toward the treatment of human epilepsy, therapeutic intervention against epileptic activity, and the development of seizure feedback control devices. Thus, it represents a necessary first step in the development of implantable biofeedback devices that directly deliver therapeutic pharmacological or physiological intervention to prevent impending seizures or other brain disorders. For example, such an intervention might be achieved by electrical or magnetic stimulation (e.g., vagal nerve stimulation) or by a timely release of an anticonvulsant drug.

Another practical application of the proposed approach is to help neurosurgeons quickly identify the epileptogenic zone without having patients stay in the hospital for long-term invasive EEG monitoring (10–14 days in duration). This research has the potential to revolutionize the protocol for identifying the epileptogenic zone, which can drastically reduce the healthcare cost of the hospital stay for these patients, and it will help physicians identify epileptogenic zones without having to risk patients' safety by implanting depth electrodes in the brain. In addition, the results of this study will contribute to the understanding of the intermittency of other dynamical neurophysiological disorders of the brain (e.g., migraines, panic attacks, sleep disorders, and Parkinsonian tremors). We also expect our research to contribute to the localization of defects (flaws) and to the classification and prediction of spatiotemporal transitions in other high-dimensional biological systems, such as heart fibrillation and heart attacks.

Acknowledgments The authors also want to thank Prof. I.P. Androulakis, Prof. L.D. Iasemidis, Prof. J.C. Sackellares, Prof. P.R. Carney, D.-S. Shiau, and L.K. Dance for their fruitful comments and discussion.

References

Begley, C., J. Annegers, D. Lairson, T. Reynolds, and W. Hauser. (1994). "Cost of Epilepsy in the United States: A Model Based on Incidence and Prognosis." Epilepsia, 35(6), 1230–1243.
Begley, C., M. Famulari, J. Annegers, D. Lairson, T. Reynolds, S. Coan, S. Dubinsky, M. Newmark, C. Leibson, E. So, and W. Rocca. (2000). "The Cost of Epilepsy in the United States: An Estimate from Population-Based Clinical and Survey Data." Epilepsia, 41(3), 342–351.
Chaovalitwongse, W., P. Pardalos, L. Iasemidis, D.-S. Shiau, and J. Sackellares. (2003). "Applications of Global Optimization and Dynamical Systems to Prediction of Epileptic Seizures." In P. Pardalos, J. Sackellares, L. Iasemidis, and P. Carney (Eds.), Quantitative Neuroscience, pp. 1–36. Kluwer.
Chaovalitwongse, W., P. Pardalos, and O. Prokopyev. (2004). "A New Linearization Technique for Multiquadratic 0–1 Programming Problems." Operations Research Letters, 32(6), 517–522.
Elger, C. and K. Lehnertz. (1998). "Seizure Prediction by Non-Linear Time Series Analysis of Brain Electrical Activity." European Journal of Neuroscience, 10, 786–789.
Iasemidis, L. (1991). "On the Dynamics of the Human Brain in Temporal Lobe Epilepsy." PhD thesis, University of Michigan, Ann Arbor.


Iasemidis, L., P. Pardalos, J. Sackellares, and D.-S. Shiau. (2001). "Quadratic Binary Programming and Dynamical System Approach to Determine the Predictability of Epileptic Seizures." Journal of Combinatorial Optimization, 5, 9–26.
Iasemidis, L. and J. Sackellares. (1991). "The Evolution with Time of the Spatial Distribution of the Largest Lyapunov Exponent on the Human Epileptic Cortex." In D. Duke and W. Pritchard (Eds.), Measuring Chaos in the Human Brain, pp. 49–82. World Scientific.
Iasemidis, L., D.-S. Shiau, W. Chaovalitwongse, J. Sackellares, P. Pardalos, P. Carney, J. Principe, A. Prasad, B. Veeramani, and K. Tsakalis. (2003). "Adaptive Epileptic Seizure Prediction System." IEEE Transactions on Biomedical Engineering, 50(5), 616–627.
Lehnertz, K. and C. Elger. (1998). "Can Epileptic Seizures be Predicted? Evidence from Nonlinear Time Series Analysis of Brain Electrical Activity." Physical Review Letters, 80, 5019–5022.
Litt, B., R. Esteller, J. Echauz, M. D'Alessandro, R. Shor, T. Henry, P. Pennell, C. Epstein, R. Bakay, M. Dichter, and G. Vachtsevanos. (2001). "Epileptic Seizures May Begin Hours in Advance of Clinical Onset: A Report of Five Patients." Neuron, 30, 51–64.
Martinerie, J., C.V. Adam, and M.L.V. Quyen. (1998). "Epileptic Seizures Can Be Anticipated by Non-Linear Analysis." Nature Medicine, 4, 1173–1176.
Pardalos, P., W. Chaovalitwongse, L. Iasemidis, J. Sackellares, D.-S. Shiau, P. Carney, O. Prokopyev, and V. Yatsenko. (2004). "Seizure Warning Algorithm Based on Spatiotemporal Dynamics of Intracranial EEG." Mathematical Programming, 101(2), 365–385.
Pardalos, P., V. Yatsenko, J. Sackellares, D.-S. Shiau, W. Chaovalitwongse, and L. Iasemidis. (2003). "Analysis of EEG Data Using Optimization, Statistics, and Dynamical System Techniques." Computational Statistics & Data Analysis, 44(1–2), 391–408.
Quyen, M.L.V., J. Martinerie, M. Baulac, and F. Varela. (1999). "Anticipating Epileptic Seizures in Real Time by Non-Linear Analysis of Similarity Between EEG Recordings." NeuroReport, 10, 2149–2155.
Wolf, A., J. Swift, H. Swinney, and J. Vastano. (1985). "Determining Lyapunov Exponents from a Time Series." Physica D, 16, 285–317.
