Optimized time and frequency domain methods for ECG signal

0 downloads 0 Views 519KB Size Report
ECG time domain compression framework, and the optimized design of lter banks for subband coders taking ECG signal properties into account.
Optimized time and frequency domain methods for ECG signal compression Sven O. Aase Ranveig Nygaard John H. Husøy Dag Haugland Høgskolen i Stavanger Department of Electrical and Computer Engineering 2557 Ullandhaug, 4004 Stavanger Norway No. of manuscript pages (incl. gures and tables): 31 No. of tables: 1 No. of gures: 14 

Correspondence should be directed to Sven O. Aase at address given above.

1

Abstract Storage and transmission of ElectroCardioGram (ECG) signals involve large amounts of data. Techniques for lossy data compression are useful in reducing the needed storage space or transmission time for such signals. Several dedicated time domain methods exist for this purpose, including the AZTEC, CORTES, Turning Point, and FAN algorithms. This article presents coding results for some modern as well as some novel time and frequency domain data compression algorithms. Key issues in our novel approaches are the way our algorithms are optimized with respect to minimum distortion, within a common ECG time domain compression framework, and the optimized design of lter banks for subband coders taking ECG signal properties into account. The performances of subband coders based on various lter bank types,  both optimized and more traditional types, are compared to that of an optimized as well as an established time domain coding algorithm. Our time domain algorithm is based on the idea of extracting signicant samples from the original signal in a manner that guarantees minimal reconstruction error when using linear interpolation of the retained samples. Thus, it can be used as a best case representative for the whole class of time domain algorithms using the sample reduction technique and linear interpolation. Using a varied signal test set, extensive coding experiments conclude that the frequencydomain method based on a short kernel lter bank tailored to the properties of the ECG signal, and having some other important properties, performs signicantly better than more traditional transform and subband coders. With respect to the performance of the time domain algorithms, our optimized solution exhibits a performance dramatically superior to the FAN algorithm. Comparing the optimized time domain solution to the subband coders, we observe better performance for the latter class of coders in terms the percentage root mean square dierence (PRD) performance measure. The observations referred to above is conrmed and complemented by visual inspection of the reconstructed signals.

2

List of Figures 1

Arc length in the graph. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

7

2

Algorithm for the CCSP problem : : : : : : : : : : : : : : : : : : : : : : : : :

9

3

Structure of encoding system. : : : : : : : : : : : : : : : : : : : : : : : : : : :

10

4

Uniform quantizer with thresholding. : : : : : : : : : : : : : : : : : : : : : : :

12

5

Parallell analysis lter bank. Subsampling by a factor of M is denoted by # M . 14

6

A two-band analysis lter bank employed in a uniform tree-structured lter bank with 8 subbands. Subsampling by a factor of 2 is denoted by # 2. : : : :

15

7

Perfect reconstruction two-band analysis/synthesis IIR lter bank system. : :

16

8

Basic shape of normal heartbeat. : : : : : : : : : : : : : : : : : : : : : : : : :

20

9

Estimated autocorrelation function for normal heartbeat. : : : : : : : : : : : :

21

10 Lowpass unit pulse responses for the ECG-optimized nonunitary lter bank, and the LOT. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

21

11 Original ECG's (the rst 15 seconds) used in the experiments. mitxxx yyyy denotes record no. xxx starting at time yy.yy. Each record length is in total 10 minutes, corresponding to 216000 samples. : : : : : : : : : : : : : : : : : :

23

12 Coding performance for varying input signals. The lter banks used are: Johnston's F16B [1] (..), IIR lter bank F 2 2 smpl [2] (-.), DCT (- -), 32-tap LOT (solid line with circles), and the optimized 32-tap FIR lter bank (solid line). The time domain algorithm are: CCSP (solid line with x-marks), and FAN (solid line with diamonds). : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

25

13 Reconstructed signal segment (taken from mit100 1000) at 1.0 bit per sample.

26

14 Reconstructed signal segment (taken from mit100 1000) at 0.5 bit per sample.

27

List of Tables 1

Filter coecients for F 2 2 smpl : : : : : : : : : : : : : : : : : : : : : : : : : : 3

17

1 Introduction Electrocardiogram (ECG) signals are recorded from patients for both monitoring and diagnostic purposes. If the signals are to be stored or transmitted in digital form, an ecient algorithm for data compression is appropriate. Many data compression techniques for ECG waveforms have been presented. Roughly, they can be classied into two categories: 1. Dedicated techniques: These are mainly time domain techniques dedicated to compression of ECG signal. They include the AZTEC [3], CORTES [4], Turning Point [5], and FAN [6] algorithms. The more recent CCSP technique [7], based on a Cardinality Constrained Shortest Path algorithm and guaranteeing a minimum distortion for a given sample reduction ratio, also ts into this framework. 2. General techniques: Techniques, which historically were developed for the compression of speech and image/video compression, having a sound mathematical foundation. They include Dierential Pulse Code Modulation (DPCM), Subband- and Transform Coding, and Vector Quantization (VQ). Recently there has been some activity using methods from the second category to compress ECG signals: Transform coding and vector quantization [8], and subband coding [2, 9, 10]. Our goal in this article is to present new results relating the performance of optimum time domain coding using linear interpolation, with subband coding using a variety of lter banks, including an ECG-optimized lter bank decomposition. Perhaps equally important, we present a fair and authoritative comparison of the coding capabilities of dedicated time domain ECG coders with a range of subband based frequency domain coders. The comparison is not straightforward, since the dedicated methods typically refer to the sample reduction ratio , i.e., the total number of original signal samples, divided by the number of signal samples kept. In contrast, general compression techniques refer to the average number of bits used for representation of each reconstructed signal sample. The paper is organized as follows: Section 2 presents current time domain techniques, with an emphasis on the CCSP algorithm and the bit ecient representation of retained samples. 4

Section 3 presents frequency domain algorithms exemplied by subband coding. Transform coding is also included in this category as a special case. A comparison of these compression methods is given in Section 4, where we evaluate coding results at comparable compression ratios. This is followed by summary and conclusions in Section 5.

2 Dedicated time domain methods Coding by time domain methods is based on the idea of extracting a subset of signicant signal samples to represent the signal. The key to a successful algorithm is a good rule for determining the most signicant samples. Decoding is based on interpolation in this set. We distinguish between traditional heuristic methods, of which several variants have been available for some time, and recently developed optimization approaches.

2.1 Heuristics Heuristic time domain algorithms are usually fast generators of compressed code satisfying certain reproduction requirements. Typically, the absolute error of the decoded signal is guaranteed to be below a prescribed bound. A frequently cited heuristic is the FAN algorithm [6]. The basic idea of this algorithm is to identify signal segments where a straight line serves as a close approximation, and to discard all but the terminal points along this line. When signicant deviations from linearity are detected, the corresponding samples are included in the extracted signal samples. The FAN algorithm accomplishes the above idea by initially accepting the very rst sample point. Next it computes a range within which succeeding samples must be found if they are to be t by a straight line. This range depends on the absolute error bound (input to FAN), but becomes more and more narrow as more samples are processed. Whenever a sample falls outside the range, its predecessor is accepted as a signicant sample, and the procedure above is repeated from this point on. This method oers control of the absolute error, while the number of extracted signal samples is beyond management. Numerous variations in how to make the eliminate-or-keep decision on signal samples, have been suggested in diverse time domain coders. These includes the original AZTEC [3]. 5

Others can be found in [4], [5], [11], [12], and [13].

2.2 Optimization methods Despite the incorporation of intelligent sample selection rules, all heuristics suer from lack of ability to extract signal samples in a manner that guarantees the smallest reconstruction error possible. However, by a rigorous mathematical model of the compression problem, and by a corresponding solution algorithm, the minimum set of samples may be achieved. In the following, we sketch a method guaranteeing the smallest possible distortion among all techniques using linear interpolation, given the number of retained samples. An Euclidean norm has been chosen as error measure. The algorithm is a revision of the one given in [7] and [14]. Denote the samples taken from an ECG signal at constant interval by x(1); x(2); :::; x(N ). Let M denote the bound on the number of extracted samples and S denote the sample set S = fx(1); x(2); :::; x(N )g. We seek an appropriate compression set C = fn1 ; n2 ; :::; nM g  f1; 2; :::; N g and the corresponding sample values. Assume n1 = 1 and nM = N . The approximation is then given by x(n) = x(n) if n 2 C and x(n) = x(nm ) + x(nnmm+1+1)??nx(mnm ) (n ? nm ) when nm < n < nm+1 . Hence we arrive at a continuous piecewise linear function which interpolates f(n; x(n)) jn 2 C g. Dene the directed graph G = (V; A) whose vertex set V = f1; 2; :::; N g and arc set A contains node pairs (i; j ) where i; j 2 V and i < j . If n1 ; nM 2 V , the set (n1 ; n2 ; :::; nM ) is said to be a path from n1 to nM in G if n1 ; :::; nM 2 V are distinct vertices and n1 < n2 <    < nM . Let Pn denote the path from node 1 up to node n. The length of each arc (i; j ) in A is given as the contribution to the total reconstruction error by eliminating all nodes P between i and j . This can be expressed as c2ij = nj ?=1i+1 (x(n) ? x(n))2 . The length of Pn will thus be the sum of the length of all arcs included in the path up to node n. Each arc (i; j ) in A represents the possibility of letting i and j be consecutive members of C . The arc length for a linear interpolation case is illustrated in Figure 1. Here "(n) = x(n) ? x(n) and P the length of the arc connecting nodes i and j are thus given by c2ij = nj ?=1i+1 "(n)2 . The P path length of a given path P is given by jjP jj2 = (i;j )2P c2ij 6

x(n ) Samples from an ECG signal

i

ε( n)

^ n) x(

j

Figure 1: Arc length in the graph. Hence we are faced with the following problem: Minimize the length of PN under the constraint that PN contains no more than M vertices. The problem is an instance of the resource-constrained shortest path problem [15, 16]. The resource in question is the number of vertices on the path. Unlike the general version of the problem, our model contains only one resource constraint [17, 18, 19, 20]. Omitting the resource constraint, we simply face the frequently studied shortest path problem [21]. Because of the particular choice of resource in our case, we term our problem the cardinality constrained shortest path problem (CCSP). In the next section, we derive a recursive formulation of CCSP, and based on this we give an algorithm providing the exact solution.

2.2.1 Optimization algorithm In order to establish an ecient solution scheme, we propose the following precise problem formulation. Dene Pj;m as the shortest path to j visiting exactly m vertices, and let f (j; m) denote the length of Pj;m . We actually search for a 1 < m  M and the corresponding PN;m for which f (N; m ) = min1 i. Furthermore, we disregard f (i; m) for all i < m since all paths terminating at i have at most i vertices. From the above formulation, the algorithm in Figure 2 suggests itself (p(j; m) signies the predecessor of j in Pj;m ). The compression set can thus be recorded by letting nm = N and nm?1 = p(nm ; m) (m = m ; m ? 1; : : : ; 2). This produces the interpolation points (n1 ; x(n1 )) ; (n2 ; x(n2 )) ; : : : ; (nm ; x(nm )). It is easily seen that when all arc lengths are available, the computations above involve O(MN 2 ) arithmetic operations. Computation of all cij -values can be shown to require no more than a total of O(N 2 ) operations [14].

2.2.2 Coding scheme The performance of time domain compression methods are often evaluated as a chosen distortion measure as a function of sample reduction ratio , dened as the number of samples in the original signal per retained sample. When it comes to other techniques, such as subbandand transform coding, the performance is often evaluated as a chosen distortion measure as a function of bit rate , i.e. the average number of bits necessary to represent one sample of 8

Algorithm CCSP for j = 2; : : : ; N

begin

f (j; 2) = c1j p(j; 2) = 1

// Length of two-vertex path // from 1 to j

end

m = 2 for m = 2; : : : ; M ? 1

begin for j = m + 1; : : : ; N begin

p(j; m + 1) = m f (j; m + 1) = f (m; m) + cm;j for i = m + 1; m + 2; : : : ; j ? 1

// Assume the two-vertex path to N is optimal // Find m + 1-vertex paths // Find the path to j // Assume Pj;m+1 = f1; 2; : : : ; m; j g // The length of this path // Pj;m+1 may equal Pi;m [ fj g

begin if f (i; m) + cij < f (j; m + 1) // Shorter! begin

f (j; m + 1) = f (i; m) + cij // Update the shortest length p(j; m + 1) = i // Record the last step

end end if f (N; m + 1) < f (N; m) begin m = m + 1

end end end

// Shortest path to N so far // Optimal number of vertices in path to N

Figure 2: Algorithm for the CCSP problem

9

run Entropy coder 1 x(n)

Extraction of MUX

samples by CCSP algorithm

channel

amplitude Entropy coder 2

Figure 3: Structure of encoding system. the signal. In order to be able to compare the results from time domain methods to other methods in a fully justied way, encoding of the extracted samples will have to take place. Reconstruction of a signal encoded by a linear interpolation time domain method requires one sample amplitude and the distance from the previous sample, referred to as run , for each segment of the signal to be reconstructed. Denote the signal samples extracted by the time domain coder by x(nk ), k = 1; : : : ; M , and denote the run associated with each extracted signal sample as rk = nk+1 ? nk ? 1. We thus have pairs of (rk ; x(nk )), k = 1; : : : ; M to be encoded. The symbols to be encoded are termed source symbols in this context. We have at least two possibilities when choosing between coding strategies: 1. Coding of amplitudes and runs by two separate coders. 2. Coding of the concatenated symbols (rk ; x(nk )) by one single coder. Alternative 2 implies a high number of possible source symbols. The original bit representation for the test signals in Section 4 is 12 bits for the amplitudes, and if we assume that no run is longer than 256, we arrive at 212  28 = 220 possible dierent source symbols. This will lead to a huge table of source words which is impractical to cope with [24]. For this reason we choose to use alternative 1 and encode the amplitudes and runs by two separate coders. Note however, that alternative 2 is viable in our subband coders, see section 3.1.1. The structure of the encoding system is as illustrated in Figure 3. An entropy coder is applied in compression of the extracted signal samples. We choose to use a record adaptive coder1 , and account for the overhead necessary due to side information. 1 in our case one record corresponds to 10 minutes of a single channel ECG

10

The results from the complete coding system are presented and discussed in Section 4.

3 Subband and Transform Coding Subband and transform coding are well established and widely used techniques for compression of sound and images. In this section we rst present traditional subband coding within an ECG coding context, before showing how to explicitly make use of the ECG signal properties in the coder design. In the subband coder construction our focus is primarily on the issue of lter bank selection and design. After addressing the issue of low computational complexity coders through the use of innite impulse response (IIR) lter banks, we present a technique for optimized lter bank design for ECG subband coders.

3.1 Introduction to ECG subband coding A subband coder splits the input signal into a collection of approximately disjoint frequency bands. If the resulting subbands have the same extent in the frequency domain, the subband decomposition is said to be uniform. Since the bandwidth of each subband is reduced by an amount corresponding to the number of subbands, - say M , each subband can be subsampled by a factor of M . Thus, the number of signal samples in the critically sampled subbands are the same as in the input signal. Since the importance of the various subbands is unequal,  compression is obtained by representing (quantizing) the less important subbands with a small number of bits. Normally, the signal energy is concentrated in the lower frequency subbands, implying that the higher frequency subbands can be represented with a small number of bits. The subband splitting is performed by an analysis lter bank, see Figures 5 and 6 for two alternative structures for doing this. Reconstruction of the decoded signal is performed by a synthesis lter bank operating on the signals derived from the bit ecient representation of the analysis lter bank outputs. If, in the absence of quantization, the output of a cascade of an analysis and synthesis lter bank is identical to the input, the lter bank, or more correctly the lter bank pair, is said to possess the perfect reconstruction property. Most subband coders for speech and image signals are based on perfect or almost perfect reconstruction lter banks. More details on the theory of lter bank construction can be found in [25, 26, 27]. 11

Figure 4: Uniform quantizer with thresholding.

3.1.1 Bit ecient representation of the subbands After the signal has been split into subbands it is in a form well suited for quantization and coding. In the system described in this paper a uniform quantizer is used together with runlength and Human coding. Figure 4 illustrates the quantization and thresholding operation performed on each sample. As the gure shows, the input samples are represented by an amplitude selected from a discrete set of levels, each separated by q. If the sample amplitude has absolute value below a selected threshold t it is set equal to zero. In the experiments discussed in the next section we have chosen t = 1:5q. As will be evident shortly, it is convenient to reorganize the subband samples in the following way: The subband samples are put into vectors such that each element is taken from a separate subband. The sequence is such that the rst entries in the vector are taken from the low frequency bands, whereas the latter entries are taken from the higher bands. We illustrate this with an example: Suppose we have a four subband split with 3 signal samples in each of the subbands. We can picture this as follows: abc def ghi jkl,

where abc are the 3 samples of subband no. 0, def are the 3 samples of subband no. 1 and so on. After reorganization,  or scanning, we get 3 vectors with elements given by: 12

adgj behk cfil

This results in a number of vectors equal to the total number of subband samples divided by the number of bands. Given that most of the energy in the signal is in the lower subbands, it is reasonable to assume,  under most circumstances, that after quantization and thresholding a substantial number of higher band subband samples will be set to zero. Since these zeros tend to occur in clusters,  as a direct consequence of the way in which the data are organized in vectors, run-length coding of these zeros makes sense. The run-length coding is done by representing the above mentioned vectors in the form (Run, Level), where Run is the number of zeros before each non-zero sample, and Level is the amplitude of the quantized subband sample following a number of zeros given by Run. The event that the last samples of the vector are all zeros is represented with the special code EOB (end of block). We illustrate by an example: before run length coding 01003200...0 after run length coding (1,1) (2,3) (0,2) EOB

After the signal has been run-length coded it is nally encoded with a Human coder [28]. Having the choice of either using a xed or adaptive Human code, tailored to the signal at hand, we use an adaptive code. The adaption of the Human code is done on a record by record basis2 .

3.1.2 Traditional lter banks There are many options in selecting both the structure and the constituent lters of a lter bank to be used in a subband coder. Both parallel and tree structured lter banks are commonly used, see Figures 5 and 6. In conjunction with the traditional lter banks presented in this subsection, we follow [2] and use tree structured lter banks. 2 A record can be from a few ten thousand samples and upwards. In the experiments in the next section,

one record corresponds to 10 minutes of a single channel ECG.

13

Figure 5: Parallell analysis lter bank. Subsampling by a factor of M is denoted by # M . In our subband coder we explore the use of either lter banks based on Johnston's almost perfect reconstruction FIR lters [1] or the perfect reconstruction IIR lters described in [29, 30]. In the IIR case the high pass and low pass channel lters of a two-band analysis lter bank can be written [30] as:

and

H0 (z) = 12 [A0 (z 2 ) + z ?1 A1 (z 2 )]

(3)

H1 (z) = 12 [A0 (z 2 ) ? z ?1 A1 (z 2 )];

(4)

?1 A0 (z) = 1a+0 +a zz?1 ;

(5)

?1 A1 (z) = 1a+1 +a zz?1 :

(6)

where A0;1 (z ) are all-pass lters. For subband coding of ECG we have found the use of rst order all-pass lters suitable. The lters A0;1 (z ) can then be written as 0

1

Introducing Equations 3 and 4 into the two channel structure shown in Figure 6, we get the top portion of Figure 7. Note that in arriving at this block diagram we have made use of the so-called noble identities' [25] making it possible to perform the ltering operations on the 14

Figure 6: A two-band analysis lter bank employed in a uniform tree-structured lter bank with 8 subbands. Subsampling by a factor of 2 is denoted by # 2.

15

low sampling rate. The corresponding synthesis two-band lter bank is shown in the bottom part of the same gure. Since an all-pass lter, A(z ), has the property A(z )A(z ?1 ) = 1, it is easy to see that the two band analysis/synthesis structure of Figure 7 has the perfect reconstruction property. Note that the branches with lters of the type A(z ?1 ) are non-causal. Since the ECG signals, in many important applications, can be processed in a batch mode or in blocks of a suitable size, these ltering operations are realized by ltering the time-reversed subband signals with lters of type A(z ).

Figure 7: Perfect reconstruction two-band analysis/synthesis IIR lter bank system. With respect to computational complexity, it has been found [29] that a 16 band lter bank employing 16 tap FIR lters (for example Johnston's F16B lter [1]) can be realized with 32 additions and multiplications per output sample if implemented eciently. For the allpass based IIR solutions the number of additions and multiplications for the same number of subbands is 12 and 4, respectively (we are assuming that both the all-pass lters in Equations 3 and 4 are of rst order). Good IIR lter banks are obtained with all-pass lters with lter coecients expressible as a sum of a few terms of type 2?k , where k is a positive integer. One such lter bank, denoted F 2 2 smpl in [29], has lter coecients as given in Table 1. In this case, when multiplications are implemented by a combination of shift and add operations, the number of multiplications per output sample is zero, whereas the number of 16

Branch Coe 0 0.125 (= 2?3 ) 1 0.625 (= 2?1 + 2?3 ) Table 1: Filter coecients for F 2 2 smpl additions3 is 20.

3.2 Optimized lter banks Traditional subband coding has relied on lter banks having long or innite unit pulse responses, giving rise to the notion of frequency band splitting. The performance of a subband coder having ideal frequency band partitioning approaches that of the ideal prediction gain when the number of subbands approaches innity [31]. Early lter bank designers were concerned with obtaining good stop band attenuation, little ripple in the pass band, and low aliasing between adjacent bands [1]. Therefore, the resulting lter bank unit pulse responses were long. In the following, we will look at how to design short-kernel FIR lter banks.

3.2.1 The lapped orthogonal transform (LOT) The optimal transform in terms of theoretical coding gain (see Section 3.2.2) is the KarhunenLo?eve transform [32]. Optimality is in the mean square sense. This transform is the eigenvector matrix of the signal autocorrelation matrix, and is therefore signal dependent. For image coding, a good, signal independent transform is the discrete cosine transform (DCT), which is very close to the Karhunen-Lo?eve transform for a rst order autoregressive (AR(1)) process with correlation coecient approaching unity [31]. The lapped orthogonal transform (LOT) is just a lter bank featuring perfect reconstruction as well as having orthogonal unit pulse responses. The new concept here is to design the lter bank with short unit pulse responses, using the AR(1) coding gain as criterion for 3 This is actually the number of additions and shift-by-n operations. We tacitly assume the same compu-

tational complexity for these two operations.

17

performance instead of the channel separation properties previously mentioned. Malvar and Staelin [33] developed this concept with a direct approach to LOT design using an eigenvalue formulation. The LOT in this article is an M = 16 channel decomposition using 32-tap lters. It was optimized with respect to an AR(1) process, with correlation factor  = 0:95, using the eigenvalue technique of [33]. The lowpass unit pulse response of the LOT is shown in Figure 10.

3.2.2 ECG-optimized lter banks In this subsection an attempt is made to exploit current knowledge of subband coding of images to design good lter banks for use with ECG signals. In [34] a exible lter bank design method was presented. Using gradient search techniques, almost any kind of mathematical optimization criteria can be utilized. When designing a lter bank for image compression at low bit rates, the following criteria are important [27]:

 Perfect, or near-perfect reconstruction in the absence of quantization noise.  High coding gain.  Zero DC-leakage.  Short lter length to minimize ringing noise.  Absence of blocking eects.  Nonunitarity. The analysis lters may dier from the corresponding synthesis lters. It is shown in [27] that nonunitary lter banks oer higher gain than unitary lter banks. The analysis and synthesis lters of each channel work as half-whitening lters, thus decreasing the reconstruction noise by acting as an open-loop DPCM [31] system. The majority of the given criteria are similarly important when compressing ECG signals. However, there are dierences: When evaluating 2-dimensional greytone (or color) images, 18

the human visual system responds dierently than when evaluating 1-dimensional curves. For images, blocking eects in smooth image areas, and ringing noise, can be annoying. Ringing noise is characterized by over- and undershoots in the reconstructed signal close to abrupt transitions. Blocking eects result from the upsampling procedure in the synthesis lter bank. Evaluation of reconstructed images and ECG signals is also dierent: In image compression the purpose often is that the decompressed images should look good, i.e., it does not matter if there are minor degradations as long as they look natural. This is not always the case for ECG signals: It is important that no artifacts arise during compression. A common optimization criteria for transform and subband coding is the so-called coding gain, dened by [31] 2 x2 = (7) GSBC = PCM 1 ;  2 Q

SBC

M ?1 2 M k=0 yk

2 2 where PCM and SBC are the variances of the reconstruction error associated with basic PCM and subband coding, respectively, and M is the total number of subband channels. x2 and y2k denote the input variance and the variance of subband channel no. k, respectively.

It is well known that images, as a rule, are of lowpass character. A common signal model is the autoregressive (AR) model of order one, with correlation factor  = 0:95 [32]. The associated power spectral density has lowpass shape. Using this model, the coding gain of Equation 7 can be maximized by adapting the subband lter coecients [34].

3.2.3 Decorrelation of a heartbeat signal The subband variances can be found statistically as4 :

y2k (n) = E [yk (n)2 ] =

LX ?1 LX ?1 j =0 i=0

hk (j )hk (i)E [x(n ? j )x(n ? i)]; (8)

where hk () is the analysis lter in channel no. k, and L is the lter length. Due to the nonstationary nature of ECG signals, the variance will to a great extent depend on the time 4 We tacitly assume all signals to have zero mean value. Any DC-component in the lowpass subband can

be subtracted prior to coding.

19

index n. In the following we concentrate on the second order statistics of one normal heartbeat signal. A lter bank which performs good signal decorrelation in these regions will perform well. Figure 8 shows the basic shape of a heartbeat signal, where the QRS complex and also the P and T waves are indicated.

Figure 8: Basic shape of normal heartbeat. Substituting the expectation operator with time-averaging over K consecutive samples,

E [yk (n)2 ] ! K1

KX ?1 n=0

yk (n)2 ;

(9)

Equation 8 can be re-written as

y2k (n) =

LX ?1 LX ?1 j =0 i=0

hk (j )hk (i)Rxx (i ? j );

(10)

where Rxx () is the biased autocorrelation function (acf) estimate:

Rxx (k) = K1

K ?j kj?1 X n=0

x(n)x(n + k):

(11)

An acf estimate was obtained in [35] using ensemble averaging over 18 heartbeats taken from dierent patients. The resulting function is shown in Figure 9. In [2], it was concluded that 16 channels was suitable for compression of ECG signals, although the exact number of channels did not signicantly alter the coding results. As in Section 3.1.2 we consider lter banks with 16 channels only. 20

Normalized autocorrelation

1 0.8 0.6 0.4 0.2 0

−0.2 −0.4 0

10

20

30

40

50 60 Lag

70

80

90 100

Figure 9: Estimated autocorrelation function for normal heartbeat. Using gradient-search techniques, a parallel, uniform, nonunitary, 16-channel FIR lter bank was optimized using the criteria listed earlier. The coding gain was maximized assuming the signal acf of Figure 9, and the lter lengths were limited to 32 taps. The optimized lter bank is nonunitary. It is therefore interesting to highlight the dierences between corresponding analysis and synthesis responses. The lowpass analysis and synthesis unit pulse responses of the optimized lter bank are shown in Figure 10, and compared to the lowpass unit pulse response of the LOT decomposition of Section 3.2.1. The LOT, being a unitary lter bank, has identical shape for corresponding analysis-synthesis responses. ECG analysis

ECG synthesis LOT

Figure 10: Lowpass unit pulse responses for the ECG-optimized nonunitary lter bank, and the LOT. 21

There is a dramatic dierence between the analysis and synthesis lowpass responses of the optimized lter bank: The analysis response is bimodal, and while it tapers o to zero at both ends, the decay is not as smooth as that of the synthesis response. The smoothness of the synthesis response is related to the blocking free reconstruction properties of this lter bank, and is in accordance with Malvar's approach [36], in which he tried to design the basis vectors of the LOT to decay smoothly to zero. An additional advantage of the smooth, monotonously decaying synthesis lowpass response, is that it does not produce ringing artifacts in the reconstructed signal. A signal edge reconstructed with this response, will be smeared out, but without any over- or undershoots. The end taps of the lowpass LOT response have signicantly larger amplitude than the end taps of the ECG-optimized synthesis response.

4 Coding experiments and discussion For evaluation of the reconstructed signals a commonly used performance measure is the percentage root mean square dierence (PRD) [37]: v u PN u (x(n) ? x(n))2 t PRD = Pn=1 N (x(n) ? x)2 n=1

 100%;

(12)

where x(n) and x(n) denote the original and reconstructed signals, x is the mean value of x(n), and N is the original signal length. Although useful for testing the relative performance of coding techniques within a narrow family, the PRD hardly qualies as an authoritative yardstick for inter-method comparisons. Each compression method has its own distortion characteristic, and objective measures like the PRD, or the signal to noise ratio, should be supplemented by visual inspection of the reconstructed waveforms.

4.1 Coding evaluation All records used in this article are taken from the MIT/BIH Arrhythmia CD-ROM database, second edition [38]. The sampling frequency is 360 Hz with 12 bits per sample. In order to test the robustness of the coding systems presented earlier, a varied set of test signals is used, similar to [2]. The rst 15 seconds of each test signal are plotted in Figure 11. 22

Note that only the rst record, mit100 1000, is a normal heartbeat signal. The other records also contain various abnormal rhythms. mit100_1000

mit202_0800

mit203_0100

mit203_1100

mit207_1800

mit214_0300

Figure 11: Original ECG's (the rst 15 seconds) used in the experiments. mitxxx yyyy denotes record no. xxx starting at time yy.yy. Each record length is in total 10 minutes, corresponding to 216000 samples.

4.1.1 Comparison based on the PRD measure The test signals presented above are coded using in total 7 dierent compression algorithms: The time domain algorithms CCSP and FAN, and subband coding using 5 dierent signal 23

decompositions: The F16B FIR lter bank [1], the F 2 2 smpl IIR lter bank presented in Section 3.1.2, the Discrete Cosine Transform (DCT), the 32-tap LOT presented in Section 3.2.1, and the ECG-optimized, 32-tap, parallel FIR lter bank presented in Section 3.2.2. All decompositions use uniform frequency band splitting, with M = 16 channels. For the time domain coders the bit representation of the retained signal samples is as described in Section 2.2.2, and the bit representation of the subband signals/transform coecients is as described in Section 3.1.1. Figure 12 presents obtained PRDs for all combinations of coders and test signals for bit rates between 0.2 and 1.4 bits per sample (bps). We start the discussion by looking at the frequency domain coders. As a result of the sharp peaks in the rst signal, mit100 1000, the relatively short kernel of the DCT, the LOT, and the ECG-optimized lter bank results in a better performance than that of the more traditional lter banks. Especially the IIR lter bank has inferior performance in this case. For the 5 remaining test signals the DCT is outperformed by the other decompositions. This is due to the smoother nature of the test signals, thus rendering them more suitable to approximations by larger lter kernels. For all test signals the ECG-optimized, 32-tap, parallel FIR lter bank outperforms the other decompositions. The optimality of the CSSP algorithm is clearly demonstrated when compared to the heuristic FAN algorithm. For all test signals the CCSP outperforms the FAN algorithm with a large margin over the range of bit rates shown. Especially a low bit rates (around 0.6 bps) the dierence varies between 3 and 12 % PRD. At higher rates (around 1.0 bps) the dierence is smaller, but still signicant. The CCSP and FAN algorithms perform best on signals without too much oscillations. To illustrate this, compare the results for mit203 0100 and mit203 1100 with the results for the other test signals. This is reasonable as oscillating signals demand retaining more samples in order to achieve the same PRD as a slowly varying signal. In spite of being optimal, in the sense outlined in Section 2, the CCSP coder cannot compete, in terms of obtained PRD, with the majority of the subband coders. Only the IIR 24

mit100 1000

50

mit202 0800

35 30

40 25

PRD

PRD

30

20

20 15 10

10 5 0 0.2

0.4

0.6

0.8 1 Total bit rate

1.2

0 0.2

1.4

mit203 0100

25

0.4

0.6

0.8 1 Total bit rate

1.2

1.4

1.2

1.4

1.2

1.4

mit203 1100

30 25

20

20

PRD

PRD

15 15

10 10 5

0 0.2

5

0.4

0.6

0.8 1 Total bit rate

1.2

0 0.2

1.4

mit207 1800

35

0.4

0.6

0.8 1 Total bit rate

mit214 0300

25

30 20

15

20

PRD

PRD

25

15

10

10 5 5 0 0.2

0.4

0.6

0.8 1 Total bit rate

1.2

0 0.2

1.4

0.4

0.6

0.8 1 Total bit rate

Figure 12: Coding performance for varying input signals. The lter banks used are: Johnston's F16B [1] (..), IIR lter bank F 2 2 smpl [2] (-.), DCT (- -), 32-tap LOT (solid line with circles), and the optimized 32-tap FIR lter bank (solid line). The time domain algorithm are: CCSP (solid line with x-marks), and FAN (solid line with diamonds). 25

lter bank, as well as the DCT, do in some cases give rise to a PRD that is marginally larger.

4.1.2 Comparison based on visual inspection As mentioned in the beginning of this section, inter-method comparisons should also include visual inspection of the reconstructed signals. The scope of this investigation is to show coding artifacts as they appear using the presented compression methods for a typical ECG signal. We have chosen a short segment taken from mit100 1000 representing a regular heartbeat. The reconstructed signal segment is shown at two dierent bit rates: 1.0 bps and 0.5 bps, when coding the whole 10 minute record. Figure 13 shows the reconstructed segment at 1.0 bps. The original signal segment is also included. At 1.0 bps all coders smooth out some Original

DCT

LOT

ECG−optimized

F16B

F_2_2_smpl

CCSP

FAN

Figure 13: Reconstructed signal segment (taken from mit100 1000) at 1.0 bit per sample. of the details in the original signal. This is particularly evident for the time domain coders, 26

and for the DCT coder. The latter also produces some blocking artifacts. In contrast to the short-kernel LOT and ECG-optimized lter banks, the traditional F16B and F 2 2 smpl lter banks produce some ringing noise. This is evident near the R-wave of the QRS complex in both cases. At 0.5 bps the blocking artifacts of the DCT coder and the ringing noise of the Original

DCT

LOT

ECG−optimized

F16B

F_2_2_smpl

CCSP

FAN

Figure 14: Reconstructed signal segment (taken from mit100 1000) at 0.5 bit per sample. F16B and the F 2 2 smpl lter banks have seriously increased, whereas the reconstructions using the short kernel FIR lter banks have degraded in a more graceful manner. Some ringing noise can bee seen, though, for the LOT reconstruction near the R-wave. The ECG-optimized lter bank does not exhibit artifacts at this bit rate. The performance of the time domain algorithms suers at low bit rates due to the fact that when few samples are extracted, the number of possible runs increases. The block size of the encoding process should therefore be tuned together with the increase in the number 27

of runs. Here a xed block size of 500 samples is used.

5 Summary and conclusions We have presented an overview of current time- and frequency domain methods for compression of ECG signal. In time domain coding the compressed signal is represented by retained signal samples, whereas in frequency domain coding the compressed signal is represented by quantized subband samples. In both categories entropy coding was used for bit-ecient representation. Coding experiments demonstrates that time domain methods based on linear interpolation of retained samples cannot compete with subband coders at low bit rates, i.e., below 1.0 bps. This was also veried by visual inspection of the reconstructed waveforms. The loss of detail was more prevalent in the time domain coding cases. However, the optimal time domain algorithm performed dramatically better than the well known FAN algorithm. A collection of lter bank decompositions were tested in a complete subband coding setup. The short kernel lter banks, the parallel, nonunitary, ECG-optimized lter bank and the LOT, had the best overall coding performance, both with respect to obtained PRD and visual evaluation, with the ECG-optimized lter bank being the better of the two.

References [1] J. D. Johnston, A lter family designed for use in quadrature mirror lter banks, in Proc. Int. Conf. Acoust. Speech, Signal Proc., (Denver, CO), pp. 291294, IEEE, 1980. [2] J. H. Husøy and T. Gjerde, Computationally ecient subband coding of ECG signals, Medical Engineering and Physics, vol. 18, pp. 132142, Mar. 1996. [3] J. Cox, F. Noelle, H. Fozzard, and G. Oliver, AZTEC: A preprocessing program for real-time ECG rhythm analysis, IEEE Trans. Biomed. Eng., vol. BME-15, pp. 128129, 1968. [4] J. Abenstein and W. Tompkins, New data-reduction algorithm for real-time ECG analysis, IEEE Trans. Biomed. Eng., vol. BME-29, pp. 4348, 1982. 28

[5] W. Mueller, Arrhythmia dectection program for an ambulatory ECG monitor, Biomed. Sci. Instrument, vol. 14, pp. 8185, 1978. [6] D. A. Dipersio and R. C. Barr, Evaluation of the fan method of adaptive sampling on human electrocardiograms, Medical & Biological Engineering & Computing, pp. 401 410, September 1985. [7] D. Haugland, J. Heber, and J. Husøy, Optimisation algorithms for ECG data compression, Medical & Biological Engineering & Computing, vol. 35, pp. 420424, July 1997. [8] C. P. Mammen and B. Ramamurthi, Vector quantization for compression of multichannel ECG, IEEE Trancations on Biomedical Engineering, vol. 37, pp. 821825, September 1990. [9] M. C. Aydin, A. E. C etin, and H. Koymen, ECG data compression by subband coding, Electronics Letter, vol. 27, pp. 359360, February 1991. [10] S. C. Tai, Six-band sub-band coder on ECG waveforms, Medical & Biological Engineering & Computing, vol. 30, pp. 187192, March 1992. [11] J. Ishijima, S. B. Shin, G. H. Hostetter, and J. Skalansky, Scan along polyon approximation for data compression of electrocardiagrams., IEEE Trans. Biomed. Eng., vol. 30, pp. 723729, 1983. [12] S. C. Tai, AZTDIS  a two phase real-time ECG data compressor, Journal of Biomedical Engineering, vol. 15, pp. 510515, Nov. 1993. [13] S. C. Tai, Slope  a real-time ECG data compressor, Medical & Biological Engineering & Computing, vol. 29, pp. 175179, March, 1991. [14] D. Haugland, J. G. Heber, and J. H. Husøy, Compressing data by shortest path methods, in Operations Resarch proc., (Springer, Berlin, Germany), 1997. [15] Y. Aneja, V. Aggarwal, and K. Nair, Shortest chain subject to side conditions, Networks, vol. 13, pp. 295302, 1983. 29

[16] J. Beasley and N. Christodes, An algorithm for the resource constrained shortest path problem, Networks, vol. 19, pp. 379394, 1989. [17] Y. Aneja and K. Nair, The constrained shortest path problem, Nav. Res. Log. Q, vol. 25, pp. 549553, 1978. [18] G. Handler and I. Zang, A dual algorithm for the constrained shortest path problem, Networks, vol. 10, pp. 293310, 1980. [19] H. Joksch, The shortest route problem with constraints, J. Math. Anal. Appl., vol. 14, pp. 191197, 1966. [20] C. Ribeiro and M. Minoux, A heuristic approach to hard constrained shortest path problems, Discrete Appl. Math., vol. 10, pp. 125137, 1985. [21] E. Dijkstra, A note on two problems in connection with graphs, Numerische Math., vol. 1, pp. 269271, 1959. [22] M. Rosseel, Comments on a paper by R. Saigal: A constrained shortest route problem, Opns. Res., vol. 16, pp. 12321234, 1968. [23] R. Saigal, A constrained shortest route problem, Opns. Res., vol. 16, pp. 205209, 1968. [24] M. Otterå, Compressing ECG signals by entropy minimized path, Master's thesis, Høgskolen i Stavanger, 1997 (in Norwegian). [25] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Clis: Prentice Hall, 1993. [26] A. N. Akansu and R. A. Haddad, Multiresolution Signal Decomposition. San Diego: Academic Press, 1992. [27] T. A. Ramstad, S. O. Aase, and J. H. Husøy, Subband Compression of Images  Principles and Examples. North Holland: ELSEVIER Science Publishers BV, 1995. [28] T. Cover and J. Joy, Elements of Information Theory. New York: Wiley, 1991.

30

[29] J. H. Husøy, Subband Coding of Still Images and Video. PhD thesis, Norwegian Institute of Technology, Jan. 1991. [30] J. H. Husøy, Low complexity subband coding of still images and video, Optical Engineering, vol. 30, pp. 904911, July 1991. [31] N. S. Jayant and P. Noll, Digital Coding of Waveforms. Englewood Clis: Prentice-Hall, 1984. [32] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Clis: Prentice-Hall, 1989. [33] H. S. Malvar and D. H. Staelin, The LOT: Transform coding of images without blocking eects, IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 553559, Apr. 1989. [34] S. O. Aase, Image Subband Coding Artifacts: Analysis and Remedies. PhD thesis, The Norwegian Institute of Technology, Mar. 1993. [35] T. Haugan, Compression of ECG signals using optimized FIR lter banks and entropy allocation, Master's thesis, Rogaland University Center, 1995 (in Norwegian). [36] H. Malvar, Signal Processing with Lapped Transforms. Artech House, 1992. [37] W. J. Tompkins, ed., Biomedical Digital Signal Processing: C  Language Examples and Laboratory Experiments for the IBM PC. Prentice  Hall Inc., 1993. [38] Massachusetts Institute of Technology, The MIT-BIH Arrhythmia Database CD-ROM, 2nd ed., 1992.

31