
A deep neural network classifier for diagnosing sleep apnea from ECG data on smartphones and small embedded systems Antony Kaguara

Kee Myoung Nam

Siddarth Reddy

Swarthmore College 500 College Ave. Swarthmore, PA 19081


[email protected]@swarthmore.edu [email protected] ABSTRACT

and 1% to 3% of preschool children. Sleep apnea can cause a patient to stop breathing for 10 to 30 seconds at a time, a phenomenon that can occur hundreds of times throughout one night [15]. These frequent bouts of apnea can prevent the patient from achieving and maintaining the deeper stages of sleep, which can lead to larger problems such as daytime fatigue, impaired memory, and depression. All of this establishes the detection and prevention of sleep apnea as an important problem [2].

Obstructive sleep apnea (OSA) is a prevalent sleep disorder caused by an obstruction of a person’s airway, which can cause people to stop breathing for 10 to 30 seconds at a time. OSA can lead to larger problems such as daytime fatigue, impaired memory, and depression [15]. Currently, the most common way of testing for OSA is polysomnography (PSG), an expensive procedure that requires appreciable medical personnel and a large amount of equipment [2]. Thus, there is a pressing need for a convenient, reliable, and lightweight system that detects sleep apnea in real-time.

Traditionally, sleep apnea is diagnosed using polysomnography (PSG), an expensive process that requires a large collection of sophisticated medical equipment and personnel. PSG measures several physiological variables that together merit a diagnosis of sleep apnea, including measures of the electrical activity in the brain and heart. Because a lapse in breathing affects the electrical activity of the heart as transcribed in electrocardiography (ECG), ECG data on its own can be used to identify instances of apnea. In this light, there has been significant previous work on classifying ECG patterns obtained during PSG studies as indicative or non-indicative of OSA using supervised machine learning techniques, such as support vector machines [2, 10, 30] and neural networks [7, 11, 9, 28].

To help solve this problem, we have implemented a lightweight system that employs a deep neural network classifier to detect sleep apnea using data from ECG signals. ECG signals are one of the elements of PSG that previous studies have shown to be effective in detecting sleep apnea. Given the availability and widespread usage of mobile low-power devices, our system was designed to run on such devices while detecting sleep apnea in real time with high accuracy, thus creating an inexpensive and convenient system for a user who may have sleep apnea. This system differs from previous research in that it: (1) uses a novel classification technique and (2) detects apnea in real time with limited CPU and battery resources.

Further, the prevalence of sleep apnea creates a need for easily accessible apnea detection systems that are non-obtrusive and can work on low-power devices. In addition, these devices would need to run for approximately 8–10 hours without interruption or failure in order to cover the average sleep-time window needed to accurately capture instances of sleep apnea. Given the ubiquity of low-power sensor devices and smartphones, combined with the capability to collect ECG signals through these devices, we saw the need to design the system to work in a lightweight computing environment. This makes it more convenient and less costly for users who may already own a smartphone or a wearable device to which we can port our system for real-time apnea detection.

We evaluate the performance of our classifier based on three goals: (1) classifying sleep apnea with high accuracy, (2) detecting sleep apnea with minimal delay under limited CPU usage, and (3) minimizing strain on the device's battery life. We tested our classifier on ECG data from the PhysioNet Apnea-ECG database [26, 12], running our classifier on its data and cross-referencing our results with the apnea annotations from the database. In evaluating our results, we found that our experimental classifier met the first two goals of high accuracy and real-time detection with limited CPU usage, but the third goal of running with limited battery capacity could not be definitively demonstrated.

1. INTRODUCTION

1.1 Background

Thus, our research study aims to develop a simple and lightweight system that (1) accurately detects sleep apnea using ECG data; (2) detects sleep apnea in real time under limited CPU usage; and (3) places limited strain on the device's battery life.

Obstructive sleep apnea (OSA), a sleeping disorder caused by obstruction of the upper airway, is a prevalent disorder among adults, occurring in 2% to 4% of middle-aged adults

1.2 Contributions

There currently exists an open database of ECG signals from the sleep patterns of 35 subjects [26, 12]. In this study, we experiment with supervised learning and cross-validation techniques on these ECG data to create a model that can examine a patient's ECG signals and detect signs of sleep apnea by means of a smartphone application. Specifically, we implement a deep neural network with a stacked autoencoder that performs classification based on a latent feature representation learned from the extracted features. Due to the noise and complexity inherent to ECG data, we hypothesize that deep learning and unsupervised feature learning algorithms, which have demonstrated unprecedented success in making inferences about high-dimensional data of various modalities [5, 20, 18, 14], are particularly well-suited to completing this task with high accuracy. We are the first, to our knowledge, to apply unsupervised feature learning to this particular classification task.

been implemented to identify sleep apnea moments from ECG data, including support vector machines [2, 10, 30], neural networks [7, 11, 9, 28], quadratic discriminant analysis [30], naïve Bayes classifiers [10], k-nearest neighbors classification [16, 19, 30], and bootstrap AdaBoost [16], with classification accuracy exceeding 75% in almost all cases. Second, the integration of such diagnostic algorithms with real-time feature extraction and classification on a mobile device has also been studied extensively in various permutations. For instance, Apnea MedAssist [6] is an Android application that uses a support vector classifier to diagnose sleep apnea in real time. Al-Mardini et al. [1] presented a similar smartphone application that, instead of processing ECG data, uses oximetry, microphone, and accelerometer data to diagnose sleep apnea moments. Patil et al. [25] presented yet another similar application, which uses blood pulse oximetry. Nakano et al. [21] also strayed from ECG data with their algorithm, instead providing a “proof of concept” for a means of monitoring sound to quantify snoring and sleep apnea severity using a mobile device. SleepAp [4] is yet another smartphone application for automated sleep apnea diagnosis; however, it extracts features from audio, actigraphy, photoplethysmography (PPG), and demographics, rather than ECG data. Alqassim et al. [3] also presented an Android/Windows application, named Sleep Apnea Monitor, for the real-time detection of apnea events from built-in smartphone sensors. Other smartphone applications also use real-time sensing to consider other conditions that manifest in and are diagnosable through ECG data, such as atrial fibrillation [23] and arrhythmia [24].

We evaluate our implementation of this classifier on three metrics: (1) its ability to accurately detect sleep apnea moments from ECG data, (2) its ability to run in real time in a low-power environment, and (3) the amount of strain it places on the device's battery life. Classification accuracy was evaluated using 5-fold cross-validation on a set of ECG recordings of 35 patients, taken from the PhysioNet database [26, 12], to train and test our classifier. Accuracy was evaluated by cross-referencing each apnea classification with the apnea annotations that accompany the database. We also compare the results of our classifier, as a benchmark, with results previously reported for other classifiers used for sleep apnea detection. To determine the ability of our classification method to run in a low-power environment, we tested our method on an Android smartphone but set up constraints to simulate running it on a low-power device. This included obtaining data at a rate possible for low-power devices, as well as limiting the CPU usage of the phone to obtain a sense of what limits can be placed on our classifier while still accurately detecting sleep apnea in real time.
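The 5-fold protocol just described can be sketched as follows. This is an illustrative stand-in, not the study's MATLAB code: `train_fn` and `predict_fn` are hypothetical placeholders for the classifier's training and prediction routines.

```python
import numpy as np

def k_fold_accuracy(features, labels, train_fn, predict_fn, k=5, seed=0):
    """Estimate classification accuracy with k-fold cross-validation.

    `train_fn(X, y)` returns a fitted model and `predict_fn(model, X)`
    returns predicted labels; both are stand-ins for the classifier.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    folds = np.array_split(idx, k)          # k disjoint test folds
    accuracies = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(features[train_idx], labels[train_idx])
        preds = predict_fn(model, features[test_idx])
        accuracies.append(float(np.mean(preds == labels[test_idx])))
    return accuracies
```

Each fold's accuracy is simply the fraction of test minutes whose predicted label matches the database annotation.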


Innovations in wearable hardware that aid the automated diagnosis of sleep apnea have also been subject to development. HealthGear [22] is a wearable system of non-invasive physiological sensors, such as a blood oximeter, that connect wirelessly via Bluetooth and present data intelligently to the user. Zhang et al. [31] implemented an innovative “smart pillow” system that detects apnea events by sensing blood oxygen levels, then immediately adjusts the pillow height and shape to terminate each apnea event. Ishida et al. [15] developed a wearable smartphone-based respiration monitoring system, built with a microcontroller and a piezoelectric sensor, that non-invasively detects sleep apnea moments.

1.3 Paper Organization

To preview the rest of this paper, Section 2 (Related Work) reviews other methods of sleep apnea detection, highlighting key differences between our experimental method and those of prior research. Section 3 (Methodology) describes how the experiment was performed, including a description of the database, the process of reading in the data and performing feature extraction, and the process of classifying the data. Section 4 (Results) gives an evaluation of our detection method and compares it to other methods previously used. Finally, Section 5 (Conclusion) offers insight into the potential usefulness of our system and discusses potential future directions.


While unsupervised feature learning algorithms, such as stacked autoencoders, have been used to denoise ECG signals [27], to the best of our knowledge they have not been used for learning a latent feature representation of ECG data for classification. Likewise, while deep neural networks – which, in the most general sense, are merely multilayer neural networks, or multilayer perceptrons – have indeed been used to classify ECG signals (outside the context of apnea) [11, 9, 28], to our knowledge no previous work has ever integrated the deep neural network and stacked autoencoder with automated sleep apnea diagnosis from ECG data.

2. RELATED WORK

The literature on the automated diagnosis of sleep apnea and other conditions from ECG data is extensive, with various permutations of the problem being considered [2, 10, 30, 7, 11, 9, 28, 16, 19, 6, 1, 25, 21, 4, 3, 23, 24, 22, 31, 15, 26, 12].

First, a wide spectrum of machine learning algorithms have

3. METHODOLOGY

3.1 Data Collection

The database of ECG signals with sleep apnea annotations was obtained from the PhysioNet Apnea-ECG database [26, 12], which we used to assess and validate our approach. The Apnea-ECG database contains ECG recordings for 70 different patients, ranging from 7 to 10 hours in length. However, only 35 of these recordings contain minute-wise apnea annotations, which denote whether apnea occurred during each minute of ECG data; given the necessity of annotated test data to evaluate the classifier's performance, we used only these 35 recordings in the experiment.


To simulate a real environment, we read in the data file in a manner that replicates the Shimmer wearable sensing system. Since the Shimmer is capable of acquiring ECG data at 256 Hz, while the PhysioNet database contains ECG data sampled at 100 Hz, we found no need to decimate the data [23]. We would only expect the accuracy of our classifier to be enhanced if it were run on a device such as the Shimmer.



[Figure 1 appears here: ECG amplitude (mV) plotted against time (centiseconds) for record a01.]

3.2 Feature Extraction

For our classifier to work well, its input must be feature vectors extracted from the ECG signals. There is an important trade-off in performing feature extraction: using more features tends to yield higher classification accuracy, but at the price of taking longer to perform apnea detection in real time. Thus, we sought to find the most telling features of the ECG signals, so as to minimize the size of our feature vector while still classifying sleep apnea with high accuracy.

Figure 1: Ten seconds of ECG data from the PhysioNet Apnea-ECG database [26, 12], along with the R peaks identified via slope inversion.

where N is the number of RR intervals, x_i is the ith RR interval, and µ is the mean RR interval;

• The NN50 measure (variant 1), defined as the number of pairs of adjacent RR intervals where the first RR interval exceeds the second RR interval by more than 50 ms;

The features used in our experimentation were all metrics based around RR intervals. An RR interval is defined as the time between two consecutive R peaks, which in turn are defined as the maximum amplitude of a given QRS complex. These metrics were chosen because RR intervals have been shown to be a telling indicator of heart-rate variability (HRV), which is a known byproduct of sleep apnea [29]. We detected R peaks from each ECG recording using the following algorithm, which has been used in prior studies of a similar nature [2]:

• The NN50 measure (variant 2), defined as the number of pairs of adjacent RR intervals where the second RR interval exceeds the first RR interval by more than 50 ms.

• The pNN50 measure (variant 1), defined as the NN50 measure (variant 1) divided by the total number of RR intervals.

1. Find all peaks within ECG recordings using slope inversion;

• The pNN50 measure (variant 2), defined as the NN50 measure (variant 2) divided by the total number of RR intervals.

2. Define a peak as an R peak if its height is more than two standard deviations above the mean height over all peaks.
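The two steps above can be sketched as follows. This is an illustrative toy implementation of the described procedure, not the code used in the study; the signal is assumed to be a 1-D array of ECG amplitudes.

```python
import numpy as np

def detect_r_peaks(ecg):
    """Detect R peaks in a 1-D ECG amplitude array.

    Step 1: find local maxima via slope inversion, i.e. samples where
    the first difference changes sign from rising to falling.
    Step 2: keep only peaks whose height is more than two standard
    deviations above the mean height over all detected peaks.
    """
    diff = np.diff(ecg)
    # A local maximum sits at i+1 when diff[i] > 0 and diff[i+1] <= 0.
    peak_idx = np.where((diff[:-1] > 0) & (diff[1:] <= 0))[0] + 1
    if peak_idx.size == 0:
        return peak_idx
    heights = ecg[peak_idx]
    threshold = heights.mean() + 2.0 * heights.std()
    return peak_idx[heights > threshold]
```

The two-standard-deviation threshold makes the detector self-calibrating: it adapts to each recording's amplitude scale rather than relying on a fixed cutoff.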

• The SDSD measure, defined as the standard deviation of the differences between adjacent RR intervals.

• The RMSSD measure, defined as the square root of the mean of the squared differences between adjacent RR intervals:

\[ \mathrm{RMSSD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N-1}(x_{i+1} - x_i)^2}. \]

Once all the R peaks were obtained, we computed the RR intervals, defined as the time difference between every consecutive pair of R peaks. We then calculated 11 metrics, all of which were used in previous studies of a similar nature [8, 24], based on these RR intervals to create our feature vectors. We define these metrics below.

• Inter-quartile range, defined as the difference between the 75th and 25th percentiles of the RR interval distribution.

• Mean RR interval;

• Median RR interval;

• Standard deviation of the set of RR intervals, defined as

\[ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}; \]

• Mean absolute deviation, defined as the mean of the absolute values of each RR interval minus the mean RR interval:

\[ \mathrm{m.a.d.} = \frac{1}{N}\sum_{i=1}^{N}\lvert x_i - \mu \rvert. \]

As each ECG recording in the PhysioNet database was annotated per minute, each feature vector was computed from 60 seconds of ECG data; as each minute-wise annotation indicates the presence of apnea (or lack thereof) at the beginning of the following minute, we considered the 30 seconds before and the 30 seconds after each apnea annotation when calculating each corresponding feature vector. We then divided each such one-minute period into six 10-second epochs, computed the above 11 features for each epoch, and concatenated the results to form a 66-dimensional feature vector for each minute considered. These feature vectors were then collected, along with their corresponding annotations, to train our classifier.
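Under the definitions above, the per-minute feature construction can be sketched as follows. This is illustrative rather than the study's implementation: RR intervals are assumed to be given in seconds, and segmentation into six 10-second epochs is assumed to happen upstream.

```python
import numpy as np

def rr_features(rr):
    """Compute the 11 RR-interval metrics for one epoch (rr in seconds)."""
    rr = np.asarray(rr, dtype=float)
    d = np.diff(rr)                        # differences of adjacent intervals
    mu = rr.mean()
    nn50_1 = int(np.sum(-d > 0.05))        # first interval longer by > 50 ms
    nn50_2 = int(np.sum(d > 0.05))         # second interval longer by > 50 ms
    return np.array([
        mu,                                        # mean RR interval
        np.median(rr),                             # median RR interval
        rr.std(),                                  # standard deviation
        nn50_1,                                    # NN50, variant 1
        nn50_2,                                    # NN50, variant 2
        nn50_1 / len(rr),                          # pNN50, variant 1
        nn50_2 / len(rr),                          # pNN50, variant 2
        d.std(),                                   # SDSD
        np.sqrt(np.mean(d ** 2)),                  # RMSSD
        np.percentile(rr, 75) - np.percentile(rr, 25),  # inter-quartile range
        np.mean(np.abs(rr - mu)),                  # mean absolute deviation
    ])

def minute_feature_vector(epoch_rrs):
    """Concatenate the 11 metrics over six 10-second epochs (66 features)."""
    assert len(epoch_rrs) == 6
    return np.concatenate([rr_features(rr) for rr in epoch_rrs])
```

Note that RMSSD averages over the N − 1 successive differences, matching the formula given above.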

3.3 Data Analysis & Classification

The classifier, as aforementioned, comprises two parts: (1) a sparse autoencoder that learns a latent feature representation of each minute of ECG data, using the extracted features (see Section 3.2) in an unsupervised manner; and (2) a deep neural network that uses this latent feature representation to classify each minute of ECG data as apnea or non-apnea. Here, we describe some of the rudimentary theory underlying autoencoders and deep neural networks, and discuss the specifics of our implementation.

3.3.1 Sparse Autoencoders and Unsupervised Feature Learning

Given an input feature vector x = (x_1, . . . , x_n) extracted from some dataset, an autoencoder is a feedforward neural network with n input units (excluding the bias unit), n output units, and one hidden layer with k ≠ n units, that attempts to approximate the identity function. More specifically, an autoencoder is trained to encode the input x into a latent feature representation e(x), which the autoencoder then decodes to recover the input features [5]:

input features x --encode--> latent feature representation e(x) --decode--> approximate input features (d ∘ e)(x) ≈ x

The latent feature representation e(x) can either represent some compression of the input features (if k < n), or some overcomplete representation of the input features (if k > n). In the former case, the encoding transformation from the input features to the latent representation essentially performs dimensionality reduction à la principal component analysis (PCA) and other more sophisticated (non)linear algorithms.¹

In the latter case, the encoding transformation may still remove redundant relationships among the input features and discover significant correlative structure if an additional sparsity constraint is enforced; that is, each observed instance in the data is expressible as a sparse combination of the input features. This additional sparsity constraint forces the joint distribution of any two features in the model to be highly non-redundant, while retaining a much greater degree of flexibility that is lost when the dimensionality of the data is collapsed. Variants of sparse autoencoders have been particularly successful at learning higher-level representations of data of various modalities, ranging from images [5, 14, 20, 17, 18] to audio [13, 18] to text [18].

Figure 2: A schematic of an autoencoder. In this case, we have six input features (x_1, . . . , x_6) extracted from each data instance, which we first encode into a latent four-dimensional representation (h_1^(1), . . . , h_4^(1)), then decode into output features (x̂_1, . . . , x̂_6) that approximate the input features (i.e., x_i ≈ x̂_i for i = 1, . . . , 6).

¹ Compared to PCA, an autoencoder is more flexible in that it can learn nonlinear encoding transformations if the activations of the input and/or hidden units are nonlinear. Compared to nonlinear dimensionality reduction algorithms, such as self-organizing maps (SOM) or t-distributed stochastic neighbor embedding (t-SNE), an autoencoder is still more flexible in that it can not only reduce the dimensionality of the data to a fixed number, but also learn sparse, overcomplete feature representations of the data (see above).

3.3.2 Stacked Autoencoders and Supervised Learning with a Deep Neural Network

Now, we consider a supervised learning methodology that incorporates the learned latent feature representation as the input features. Multiple autoencoders can be “stacked” (to form stacked autoencoders) so that the output layer of the ith autoencoder is wired to the input layer of the (i + 1)th autoencoder. This setup is distinct from the standard deep neural network or multilayer perceptron in the sense that the edge weights in each layer are initialized as the encoding transformations of each autoencoder, as opposed to a uniformly random initialization of edge weights. This distinction is reflected in the standard greedy layer-wise training algorithm for deep neural networks, due to Hinton and Salakhutdinov [14]:

1. Greedily train each autoencoder in succession, treating the hidden layer of the ith autoencoder as the input layer of the (i + 1)th autoencoder.

2. Take the last autoencoder’s hidden layer as the input layer to a new supervised layer, consisting of one output unit for each supervised label.

to when it was able to classify it as apnea vs. non-apnea. To ensure that SetCPU was limiting the CPU usage as indicated, we used the Qualcomm Trepn Profiler to monitor CPU usage while running our classifier.

3. Then fine-tune the edge weights using backpropagation with respect to the supervised labels.
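The greedy layer-wise pretraining of step 1 can be sketched in a few lines of numpy. This is an illustrative toy implementation, not the DeepLearnToolbox code used in the study: it uses plain batch gradient descent on the squared reconstruction error and omits the sparsity penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Autoencoder:
    """One-hidden-layer autoencoder trained by batch gradient descent."""

    def __init__(self, n_in, n_hidden, lr=0.5):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))   # encoder weights
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))   # decoder weights
        self.b2 = np.zeros(n_in)
        self.lr = lr

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def train(self, X, epochs=200):
        for _ in range(epochs):
            H = self.encode(X)                       # latent representation
            Xhat = sigmoid(H @ self.W2 + self.b2)    # reconstruction
            # Backpropagate the squared reconstruction error.
            d_out = (Xhat - X) * Xhat * (1.0 - Xhat)
            d_hid = (d_out @ self.W2.T) * H * (1.0 - H)
            self.W2 -= self.lr * H.T @ d_out / len(X)
            self.b2 -= self.lr * d_out.mean(axis=0)
            self.W1 -= self.lr * X.T @ d_hid / len(X)
            self.b1 -= self.lr * d_hid.mean(axis=0)

def pretrain_stack(X, layer_sizes):
    """Step 1: greedily train one autoencoder per layer, feeding each
    hidden representation to the next autoencoder in the stack."""
    stack, inp = [], X
    for n_hidden in layer_sizes:
        ae = Autoencoder(inp.shape[1], n_hidden)
        ae.train(inp)
        stack.append(ae)
        inp = ae.encode(inp)
    return stack
```

Steps 2 and 3 would then attach a supervised output layer on top of the last hidden layer and fine-tune all weights by backpropagation against the apnea/non-apnea labels, which we omit here for brevity.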


For a more detailed exposition of autoencoders and deep neural networks, please consult [5, 20, 14].

3.3.3 Implementation Details

We implemented our stacked autoencoder and deep neural network using DeepLearnToolbox, a MATLAB toolbox for implementing deep learning algorithms.2 We then trained and tested our classifier using 5-fold cross-validation. For each fold, given a 66-dimensional feature vector extracted from each minute of ECG data, the stacked autoencoder learned a 20-dimensional latent feature representation over 20 training epochs. We adopted the sigmoid function

4. RESULTS

4.1 Classification

As aforementioned, we evaluated our classifier using 5-fold cross-validation on the 35 annotated ECG recordings in the PhysioNet Apnea-ECG database [26, 12]. In Table 1, we report classification accuracy (percentage of correctly labeled test instances among all test instances) over each fold of the data. We also compare these accuracy percentages to similar scores reported for five different classifiers from the literature: a support vector machine [2], two naïve Bayes classifiers [8, 30], and two implementations of k-nearest neighbors [8, 30]. These figures confirm that our deep neural network classifier demonstrates a level of accuracy comparable to, if not markedly greater than, other classifiers implemented for this task.

\[ f(z) = \frac{1}{1 + e^{-z}} \]

as the activation of every unit in the network. The resulting 20-dimensional representation then served as input into a supervised layer that classified each representation (and therefore each minute) as apnea or non-apnea. These labels were then compared with the ground-truth annotations available on the PhysioNet database [26, 12]. We report the percentage accuracy per fold of our classifier in Table 1. Once the classifier was parametrized and trained to achieve a satisfactory level of accuracy, we implemented the classifier on Android as a series of matrix multiplications (with the learned edge weights hard-coded) and sigmoid function evaluations. While this design choice implies that the classifier will not adapt to new ECG data (even when ground-truth minute-wise annotations for the new data become available) and is perhaps more prone to overfitting to the PhysioNet database, it also ensures that our system is as lightweight as possible, which is essential to our goal of real-time classification.
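Since the deployed classifier reduces to matrix multiplications and sigmoid evaluations, the on-device forward pass can be sketched as follows. The weights below are random placeholders standing in for the hard-coded values learned offline, and the 0/1 label convention is an assumption for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder weights: the deployed app hard-codes the values learned
# offline (66 input features -> 20 latent units -> 2 output units).
# Random values are used here purely for illustration.
_rng = np.random.default_rng(0)
W1, b1 = _rng.normal(size=(66, 20)), np.zeros(20)
W2, b2 = _rng.normal(size=(20, 2)), np.zeros(2)

def classify_minute(features):
    """Forward pass: matrix multiplications plus sigmoid evaluations."""
    latent = sigmoid(features @ W1 + b1)   # 20-dim latent representation
    scores = sigmoid(latent @ W2 + b2)     # per-class scores
    return int(np.argmax(scores))          # 0 = non-apnea, 1 = apnea (assumed)
```

Because the weights are frozen, classifying a minute costs only two small matrix-vector products, which is what keeps the on-phone inference lightweight.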

3.4 Testing Battery Usage

In order for our system to run on low-power devices, it must be able to run with limited battery power. However, there is a tradeoff: a more computationally intensive system that performs better at detecting apnea would also drain the battery more. Thus, we sought a balance between these two goals. To test the effect of our system on battery power, we used the GSam Battery Monitor to record our classifier's battery usage (as a percentage of total battery power) while varying the upper limits of CPU power.

Classifier                        Accuracy (%)
DNN with autoencoder (fold 1)     87
DNN with autoencoder (fold 2)     85
DNN with autoencoder (fold 3)     84
DNN with autoencoder (fold 4)     91
DNN with autoencoder (fold 5)     90
Support vector machine [2]        86.1–96.5
k-nearest neighbors [8]           75.6
k-nearest neighbors [30]          78.5
Naïve Bayes [8]                   88.1
Naïve Bayes [30]                  90.2

3.5 Limiting CPU Usage

One important goal of this study was to implement our classifier so that it can run in a low-power environment with limited CPU power. However, there exists a tradeoff between limited CPU power and how quickly sleep apnea can be detected in real time: the less CPU power available, the longer it takes for features to be extracted from the data and classified as apnea or non-apnea. Thus, we tested how limiting the CPU usage at different levels would affect the average classification time for one minute of ECG data.

Table 1: Classification accuracy of our deep neural network over each fold of the data, along with corresponding figures for five other classifiers from the literature.

4.2 CPU Usage

Using SetCPU and the Qualcomm Trepn Profiler, we computed the average detection time required to classify each minute as apnea or non-apnea, defined as the time between receiving a full minute's worth of data and classifying that data as apnea or non-apnea. Table 2 illustrates an inversely proportional relationship between average classification time and maximum CPU frequency.

To perform this test, it was first necessary to obtain root access to the smartphone. From there, we used the Android application SetCPU to variously adjust the maximum CPU frequency available to our classifier. For each maximum frequency tested, we calculated the average length of time between when the classifier received an interval of data

4.3 Battery Life

Using SetCPU to limit CPU power at various levels, we recorded the battery usage as a percentage of total power

2 DeepLearnToolbox is available at http://github.com/rasmusbergpalm/DeepLearnToolbox.

CPU1 speed (MHz)    CPU2 speed (MHz)    Average detection time (seconds)
1190                1190                0.15
998                 998                 0.21
787                 787                 0.36
600                 600                 0.39
384                 384                 1.212

levels more comparable to those on low-power devices, the battery usage became much more significant. Given this, along with the fact that low-power devices have significantly less battery capacity, it is difficult to conclude whether battery constraints would cause problems for our system when run on low-power devices.


Table 2: Average detection time (i.e., the average amount of time, over all minutes, for the application to extract all 66 features and classify the minute as apnea or non-apnea) for various CPU frequencies.

at each level. These recordings were taken from a MOTO E smartphone, which has an 1800 mAh battery capacity. The results of these tests can be seen in Table 3, which illustrates an inverse relationship between the upper limit on CPU usage and the battery usage of the classifier.

CPU1 speed (MHz)    CPU2 speed (MHz)    Battery usage (%)
1190                1190                4.1
998                 998                 8.3
787                 787                 13.4
600                 600                 15.6

While our classifier demonstrates high levels of accuracy on the PhysioNet database, there are various ways in which the classifier could be improved even further. One could, for instance, extract more features from the data, or adopt a more robust machine learning algorithm to locate R peaks from particularly noisy ECG data. One could also ensure that the various hyperparameters of our classifier — the number of hidden layers, the number of units per layer, and additional constraints such as sparsity — are optimized, perhaps through a systematic grid search in a subset of hyperparameter space.

Table 3: Percent battery usage of the application at any given point during execution.

5. CONCLUSIONS

5.1 Classification

Given the results of how our classifier compared to other previously researched detection methods, we found our method to demonstrate comparable levels of accuracy to those of other classifiers implemented for this task. Furthermore, the fact that we were able to achieve accurate results using a small selection of features and an accordingly lightweight implementation clearly suggests that accurate sleep apnea diagnosis can occur in a low-power environment.


6. REFERENCES

[1] M. Al-Mardini, F. Aloul, A. Sagahyroon, and L. Al-Husseini. Classifying obstructive sleep apnea using smartphones. J Biomed Inform, July 2014.
[2] L. Almazaydeh, K. Elleithy, and M. Faezipour. Obstructive sleep apnea detection using SVM-based classification of ECG signal features. Conf Proc IEEE Eng Med Biol Soc, pages 4938–41, 2012.
[3] S. Alqassim, M. Ganesh, S. Khoja, M. Zaidi, F. Aloul, and A. Sagahyroon. Sleep apnea monitoring using mobile phones. IEEE 14th International Conference on e-Health Networking, Applications and Services (Healthcom), pages 443–446, October 2012.
[4] J. Behar, A. Roebuck, M. Shahid, J. Daly, A. Hallack, N. Palmius, J. Stradling, and G. Clifford. SleepAp: an automated obstructive sleep apnoea screening application for smartphones. Computing in Cardiology, 40:257–260, 2013.
[5] Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, January 2009.
[6] M. Bsoul, H. Minn, and L. Tamil. Apnea MedAssist: real-time sleep apnea monitor using single-lead ECG. IEEE Transactions on Information Technology in Biomedicine, 15(3):416–427, October 2010.

5.2 CPU Usage

As expected, there was an inverse relationship between maximum CPU usage and the time needed for apnea detection. Given the relationship exhibited between the two variables, we could expect our classifier to run effectively with as little CPU power as 384 MHz. We also performed well against our benchmark of low-powered wearable and embedded devices, namely the LG G Watch running Android Wear on a dual-core processor at 1700 MHz and the Gumstix Pepper microprocessor running at 700 MHz. Furthermore, we discovered that the majority of the apnea detection time was attributable to feature extraction; a pass through the classifier took less than 10% of the total detection time, showing that the classifier is in itself lightweight.


5.4 Future Directions

We have effectively implemented a proof-of-concept classifier, comprising a stacked autoencoder and a deep neural network, that can accurately identify sleep apnea moments in real time with limited CPU resources on an Android smartphone. With this in mind, a particularly significant goal is to transition from relying on a smartphone and a pre-existing database of ECG recordings to processing and classifying live ECG data from a patient, using a smaller device. This would allow us not only to test our classifier on new data, but also to test our implementation on actually constrained hardware, such as a microcontroller or a wearable watch, instead of simulating these constraints on a smartphone. Furthermore, this would shed insight into designing and improving hardware and/or physical experimental conditions to streamline and optimize the real-time sensing, processing, and classification of ECG data.

5.3 Battery Life

When the CPU usage was unconstrained, our system did not appear to have a significant effect on the battery usage of the phone. However, once the CPU power was limited to

[7] W. Bystricky and A. Safer. Identification of individual sleep apnea events from the ECG using neural networks and dynamic Markovian state model. Computers in Cardiology, 31:297–300, 2004.

[8] P. de Chazal, T. Penzel, and C. Heneghan. Automated detection of obstructive sleep apnoea at different time scales using the electrocardiogram. Physiol Meas, 25(4):967–983, 2004.

[9] V. Dubey and V. Richariya. A neural network approach for ECG classification. International Journal of Emerging Technology & Advanced Engineering, 3, October 2013.

[10] N. Eiseman, M. Westover, J. Mietus, R. Thomas, and M. Bianchi. Classification algorithms for predicting sleepiness and sleep apnea severity. J Sleep Res, 21(1):101–112, February 2012.

[11] S. El-Khafif and M. El-Brawany. A neural network approach for ECG classification. ISRN Biomedical Engineering.

[12] A. Goldberger, L. Amaral, L. Glass, J. Hausdorff, P. Ivanov, R. Mark, J. Mietus, G. Moody, C. Peng, and H. Stanley. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), June 2000.

[13] R. Grosse, R. Raina, H. Kwon, and A. Ng. Shift-invariant sparse coding for audio classification, 2012.

[14] G. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313:504–507, July 2006.

[15] R. Ishida, Y. Yonezawa, H. Maki, H. Ogawa, I. Ninomiya, K. Sada, S. Hamada, A. Hahn, and W. Caldwell. A wearable, mobile phone-based respiration monitoring system for sleep apnea syndrome detection. Biomed Sci Instrum, 41:289–293, 2005.

[16] T.-P. Kao, J.-S. Wang, C.-W. Lin, Y.-T. Yang, and F.-C. Juang. Using Bootstrap AdaBoost with KNN for ECG-based automated obstructive sleep apnea detection. IEEE World Congress on Computational Intelligence.

[17] Q. V. Le, M. Ranzato, R. Monga, M. Devin, K. Chen, G. S. Corrado, J. Dean, and A. Ng. Building high-level features using large scale unsupervised learning. Proceedings of the 29th International Conference on Machine Learning, 2012.

[18] H. Lee, A. Battle, R. Raina, and A. Ng. Efficient sparse coding algorithms. Advances in Neural Information Processing Systems, pages 801–808, 2006.

[19] M. Mendez, D. Ruini, O. Villantieri, M. Matteucci, T. Penzel, S. Cerutti, and A. Bianchi. Detection of sleep apnea from surface ECG based on features extracted by an autoregressive model. pages 6106–6109, 2007.

[20] K. P. Murphy. Machine learning: a probabilistic perspective. MIT Press, 2012.

[21] H. Nakano, K. Hirayama, Y. Sadamitsu, A. Toshimitsu, H. Fujita, S. Shin, and T. Tanigawa. Monitoring sound to quantify snoring and sleep apnea severity using a smartphone: proof of concept. J Clin Sleep Med, 10(1):73–78, 2014.

[22] N. Oliver and F. Flores-Mangas. HealthGear: automatic sleep apnea detection and monitoring with a mobile phone. Journal of Communications, 2(2):1–9, March 2007.

[23] J. Oster, J. Behar, R. Colloca, Q. Li, Q. Li, and G. Clifford. Open source Java-based ECG analysis software and Android app for atrial fibrillation screening. Computing in Cardiology, 40:731–734, 2013.

[24] A. Patel, P. Gakare, and A. Cheeran. Real time ECG feature extraction and arrhythmia detection on a mobile platform. International Journal of Computer Applications, 44(23), April 2012.

[25] S. Patil, P. Shewale, A. Agrawal, V. Choudhari, and B. Doke. Real time data processing for detection of apnea using Android phone. International Journal of Science and Modern Engineering, 2(2), January 2014.

[26] T. Penzel, G. Moody, R. Mark, A. Goldberger, and J. Peter. The Apnea-ECG database. Computers in Cardiology, 27:255–258, 2000.

[27] R. Rodrigues and P. Couto. A neural network approach to ECG denoising. arXiv:1212.5217v1, December 2012.

[28] N. Srinivas, A. Babu, and M. Rajak. ECG signal analysis using data clustering and artificial neural networks. American International Journal of Research in Science, Technology, Engineering & Mathematics, 2013.

[29] R. A. Thuraisingham. Preprocessing RR interval time series for heart rate variability analysis and estimates of standard deviation of RR intervals. Computer Methods and Programs in Biomedicine, 83(1):78–82, 2006.

[30] B. Yilmaz, M. Asyali, E. Arikan, S. Yetkin, and F. Ozgen. Sleep stage and obstructive apneaic epoch classification using single-lead ECG. BioMedical Engineering OnLine, 9(39), 2010.

[31] J. Zhang, Q. Zhang, Y. Wang, and C. Qiu. A real-time auto-adjustable smart pillow system for sleep apnea detection and treatment. pages 179–190, 2013.
