Advanced Methods and Tools for ECG Data Analysis - Part 5


Figure 5.5 The effect of a selection of different wavelets for filtering a section of ECG (using the first approximation only) contaminated by Gaussian pink noise (SNR = 20 dB). From top to bottom: original (clean) ECG, noisy ECG, biorthogonal (8,4) filtered, discrete Meyer filtered, Coiflet filtered, symlet (6,6) filtered, symlet (4,4) filtered, Daubechies (4,4) filtered, reverse biorthogonal (3,5), reverse biorthogonal (4,8), Haar filtered, and biorthogonal (6,2) filtered. The zero-noise clean ECG is created by averaging 1,228 R-peak-aligned, 1-second-long segments of the author's ECG. The RMS error performance of each filter is listed in Table 5.1.

... to the length of the highpass filter. Therefore, Matlab's bior4.4 has four vanishing moments³ with 9 LP and 7 HP coefficients (or taps) in each of the filters.

Figure 5.5 illustrates the effect of using different mother wavelets to filter a section of clean (zero-noise) ECG, using only the first approximation of each wavelet decomposition. The clean (upper) ECG is created by averaging 1,228 R-peak-aligned, 1-second-long segments of the author's ECG. Gaussian pink noise is then added with a signal-to-noise ratio (SNR) of 20 dB. The root mean square (RMS) error between the filtered waveform and the original clean ECG for each wavelet is given in Table 5.1. Note that the biorthogonal wavelets with J,K ≥ 8,4, the discrete Meyer wavelet, and the Coiflets appear to produce the best filtering performance in this circumstance. The RMS results agree with visual inspection, where significant morphological distortions can be seen for the other filtered signals. In general, increasing the number of taps in the filter produces a lower-error filter. The wavelet transform can be considered either as a spectral filter applied over many time scales, or as a linear time filter $\psi[(t-\tau)/a]$ centered at a time $\tau$ with scale $a$ that is convolved with the time series $x(t)$. Therefore, convolving the filters with a shape more commensurate with that of the ECG produces a better filter. Figure 5.4 illustrates this point: as we increase the number of taps in the filter, the mother wavelet begins to resemble the ECG's P-QRS-T morphology more closely.

Table 5.1 Signals Displayed in Figure 5.5 (from Top to Bottom) with RMS Error Between the Clean ECG and the Wavelet-Filtered ECG with 20-dB Additive Gaussian Pink Noise

    Wavelet Family               Family Member   RMS Error
    Original ECG                 N/A             0
    ECG with pink noise          N/A             0.3190
    Biorthogonal 'bior'          bior3.3         0.0296
    Discrete Meyer 'dmey'        dmey            0.0296
    Coiflets 'coif'              coif2           0.0297
    Symlets 'sym'                sym3            0.0312
    Symlets 'sym'                sym2            0.0312
    Daubechies 'db'              db2             0.0312
    Reverse biorthogonal 'rbio'  rbio3.3         0.0322
    Reverse biorthogonal 'rbio'  rbio2.2         0.0356
    Haar 'haar'                  haar            0.0462
    Biorthogonal 'bior'          bior1.3         0.0472

    N/A indicates not applicable.

3. If the Fourier transform of the wavelet is J times continuously differentiable, then the wavelet has J vanishing moments. Type waveinfo('bior') at the Matlab prompt for more information. Viewing the filters using [lp_decon, hp_decon, lp_recon, hp_recon] = wfilters('bior4.4') in Matlab reveals one zero coefficient in each of the LP decomposition and HP reconstruction filters, and three zeros in the LP reconstruction and HP decomposition filters. Note that these zeros are simply padded and do not count when calculating the filter size.
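To make the comparison in Figure 5.5 and Table 5.1 concrete, the following Matlab sketch filters a noisy beat sequence with several mother wavelets and reports the RMS error. It is a minimal sketch, not the authors' code: the clean template ecg_clean, the decomposition depth, and the pink-noise generator pinknoise (Audio Toolbox, or any substitute) are all assumptions.

```matlab
% Minimal sketch (assumes the Wavelet Toolbox): filter a noisy ECG with the
% first approximation of several wavelet decompositions and compare RMS errors.
% ecg_clean is an assumed zero-noise template (e.g., averaged aligned beats).
snr_db    = 20;
noise     = pinknoise(size(ecg_clean));   % hypothetical pink-noise source
noise     = noise * norm(ecg_clean) / (norm(noise) * 10^(snr_db/20));
ecg_noisy = ecg_clean + noise;

wavelets = {'bior3.3', 'dmey', 'coif2', 'sym3', 'db2', 'rbio3.3', 'haar', 'bior1.3'};
for w = 1:numel(wavelets)
    [C, L]  = wavedec(ecg_noisy, 1, wavelets{w});   % one-level decomposition
    approx  = wrcoef('a', C, L, wavelets{w}, 1);    % first approximation only
    rms_err = sqrt(mean((approx - ecg_clean).^2));
    fprintf('%-8s RMS error = %.4f\n', wavelets{w}, rms_err);
end
```

A deeper (smoother) approximation can be obtained by increasing the level passed to wavedec and wrcoef; the level used for the figure is not stated, so the single-level choice here is illustrative.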
The biorthogonal wavelet family members are FIR filters and, therefore, possess a linear phase response, which is an important characteristic for signal and image reconstruction. In general, biorthogonal spline wavelets allow exact reconstruction of the decomposed signal. This is not possible using orthogonal wavelets (except for the Haar wavelet). Therefore, bior3.3 is a good choice for a general ECG filter. It should be noted that the filtering performance of each wavelet will be different for different types of noise, and an adaptive wavelet-switching procedure may be appropriate. As with all filters, the wavelet performance may also be application-specific, and a sensitivity analysis on the ECG feature of interest (e.g., the QT interval or the ST level) is appropriate before selecting a particular wavelet.

As a practical example comparing different common filtering types on the ECG, observe Figure 5.6. The upper trace illustrates an unfiltered recording of a V5 ECG lead from a 30-year-old healthy adult male undergoing an exercise test. Note the presence of high-amplitude 50-Hz (mains) noise. The second subplot illustrates the action of applying a 3-tap IIR notch filter centered on 50 Hz to reveal the underlying ECG. Note the presence of baseline wander disturbance from electrode motion around t = 467 seconds, and the difficulty in discerning the P wave (indicated by a large arrow at the far left). The third trace is a band-pass (0.1- to 45-Hz) FIR filtered version of the upper trace. Note that the baseline wander is reduced significantly, but a Gibbs⁴ ringing phenomenon is introduced into the Q and S waves (illustrated by the small arrows), which manifests as distortions with an amplitude larger than the P wave itself. A good demonstration of the Gibbs phenomenon can be found in [9, 10]. This ringing can lead to significant problems for a QRS detector (looking for Q wave onset) or for any technique analyzing QT intervals or ST changes. The lower trace is the first approximation of a biorthogonal wavelet decomposition (bior3.3) of the notch-filtered ECG. Note that the P wave is now discernible from the background noise, and the Gibbs oscillations are not present.

Figure 5.6 Raw ECG with 50-Hz mains noise, IIR 50-Hz notch filtered ECG, 0.1- to 45-Hz band-pass filtered ECG, and bior3.3 wavelet filtered ECG. The left-most arrow indicates the low-amplitude P wave. Central arrows indicate Gibbs oscillations in the FIR filter causing a distortion larger than the P wave.

As mentioned at the start of this section, the number of articles on ECG analysis that employ wavelets is enormous, and an excellent overview of many of the key publications in this arena can be found in Addison [5]. Wavelet filtering is a lossless supervised filtering method where the basis functions are chosen a priori, much like the case of a Fourier-based filter (although some of the wavelets do not have orthogonal basis functions). Unfortunately, it is difficult to remove in-band noise, because the CWT and DWT are signal separation methods that effectively occur in the frequency domain⁵ (the ECG signal and noise often have a significant overlap in the frequency domain).

4. Gibbs ringing refers to the existence of ripples with amplitudes independent of the filter length. Increasing the filter length narrows the transition width but does not affect the ripple. One technique to reduce the ripples is to multiply the impulse response of an ideal filter by a tapered window.
5. The wavelet is convolved with the signal.
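The three filtering stages compared in Figure 5.6 can be prototyped in a few lines of Matlab. This is a hedged sketch rather than the authors' implementation: the raw ECG vector ecg_raw, the sampling frequency, the FIR order, and the notch bandwidth are all assumed; iirnotch requires the DSP System Toolbox, and fir1/filtfilt the Signal Processing Toolbox.

```matlab
% Sketch of the three filters compared in Figure 5.6, applied to an assumed
% raw ECG vector ecg_raw sampled at fs Hz.
fs = 256;                                      % assumed sampling frequency
w0 = 50 / (fs/2);                              % normalized 50-Hz mains frequency
[bn, an]  = iirnotch(w0, w0/35);               % second-order IIR notch filter
ecg_notch = filtfilt(bn, an, ecg_raw);         % notch-filtered trace

bf      = fir1(500, [0.1 45] / (fs/2), 'bandpass');  % 0.1- to 45-Hz FIR band-pass
ecg_fir = filtfilt(bf, 1, ecg_raw);            % band-pass trace

[C, L]  = wavedec(ecg_notch, 1, 'bior3.3');    % bior3.3 decomposition of notch output
ecg_wav = wrcoef('a', C, L, 'bior3.3', 1);     % first approximation (lower trace)
```

Plotting ecg_fir around the QRS complexes should reveal the ringing discussed above, while ecg_wav should not exhibit it.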
In the next section we will look at techniques that discover the basis functions within data, based either on the statistics of the signal's distributions or with reference to a known signal model. The basis functions may overlap in the frequency domain, and therefore we may separate out in-band noise.

As a postscript to this section, it should be noted that there has been much discussion of the use of wavelets in HRV analysis (see Chapter 3), since long-range beat-to-beat fluctuations are obviously nonstationary. Unfortunately, very little attention has been paid to the unevenly sampled nature of the RR interval time series, and this can lead to serious errors (see Chapter 3). Techniques for wavelet analysis of unevenly sampled data do exist [11, 12], but it is not clear how a discrete filter bank formulation with up-down sampling could avoid the inherent problems of resampling an unevenly sampled signal. A recently proposed alternative JTFA technique known as the Hilbert-Huang transform (HHT) [13, 14], which is based upon empirical mode decomposition (EMD), has shown promise in the area of nonstationary and nonlinear JTFA (since both the amplitude and frequency terms are a function of time⁶). Furthermore, there is a striking similarity between EMD and the least-squares estimation technique used in calculating the Lomb-Scargle periodogram (LSP) for power spectral density estimation of unevenly sampled signals (see Chapter 3). EMD attempts to find basis functions (such as the sines and cosines in the LSP) by fitting them to the signal and then subtracting them, in much the same manner as in the calculation of the LSP (with the difference being that EMD analyzes the envelope of the signal and does not restrict the basis functions to being sinusoidal). It is therefore logical to extend the HHT technique to fit empirical modes to an unevenly sampled time series such as the RR tachogram. If the fit is optimal in a least-squares sense, then the basis functions will remain orthogonal (as we shall discover in the next section). Of course, the basis functions may not be orthogonal, and other measures for optimal fits may be employed. This concept is explored further in Section 5.4.3.2.

6. Interestingly, the empirical modes of the HHT are also determined by the data and are therefore a special case where a JTFA technique (the Hilbert transform) is combined with a data-determined empirical mode decomposition to derive orthogonal basis functions that may overlap in the frequency domain in a nonlinear manner.
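As an illustration of the EMD/HHT idea discussed above (and with the caveat, noted in the text, that standard EMD assumes even sampling), a basic analysis might look as follows. The emd and hht functions assume a recent Matlab Signal Processing Toolbox (R2018a or later); rr_resampled and fs are assumed inputs.

```matlab
% Hedged sketch: empirical mode decomposition and Hilbert spectrum of an RR
% tachogram. rr_resampled is an assumed evenly resampled RR series at fs Hz;
% resampling an unevenly sampled series is itself problematic, so treat this
% as illustrative only.
[imf, residual] = emd(rr_resampled);   % intrinsic mode functions (data-determined)
hht(imf, fs);                          % Hilbert spectrum: instantaneous frequency vs. time
```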
5.4 Data-Determined Basis Functions

Sections 5.4.1 to 5.4.3 present a set of transformation techniques for filtering or separating signals without using any prior knowledge of the spectral components of the signals. These techniques are based upon a statistical analysis that discovers the underlying basis functions of a set of signals; they are principal component analysis⁷ (PCA), artificial neural networks (ANNs), and independent component analysis (ICA).

7. This is also known as singular value decomposition (SVD), the Hotelling transform, or the Karhunen-Loève transform (KLT).

Both PCA and ICA attempt to find an independent set of vectors onto which we can transform data. Those data that are projected (or mapped) onto each vector are the independent sources. The basic goal in PCA is to decorrelate the signal by projecting data onto orthogonal axes. However, ICA results in a transformation of data onto a set of axes which are not necessarily orthogonal. Both PCA and ICA can be used to perform lossy or lossless transformations by multiplying the recorded (observation) data by a separation or demixing matrix. Lossless PCA and ICA both involve projecting data onto a set of axes which are determined by the nature of those data, and are therefore methods of blind source separation (BSS). (Blind because the axes of projection, and therefore the sources, are determined through the application of an internal measure and without the use of any prior knowledge of a signal's structure.) Once we have discovered the axes of the independent components in a data set and have separated them out by projecting the data set onto these axes, we can then use these techniques to filter the data set.

5.4.1 Principal Component Analysis

To determine the principal components (PCs) of a multidimensional signal, we can use the method of singular value decomposition. Consider a real N × M matrix X of observations, which may be decomposed as follows:

$X = U S V^T$    (5.8)

where S is an N × M nonsquare matrix with zero entries everywhere except on the leading diagonal, whose elements $s_i$ ($= S_{nm}$, $n = m$) are arranged in descending order of magnitude. Each $s_i$ is equal to $\sqrt{\lambda_i}$, the square root of the corresponding eigenvalue of $C = X^T X$. A stem plot of these values against their index i is known as the singular spectrum. The smaller an eigenvalue, the less energy there is along the corresponding eigenvector; therefore, the smallest eigenvalues are often considered to be associated with the noise in the signal. V is an M × M matrix of column vectors which are the eigenvectors of C. U is an N × N matrix of projections of X onto the eigenvectors of C [15]. If a truncated SVD of X is performed (i.e., we retain only the p most significant eigenvectors),⁸ then the truncated SVD is given by $Y = U S_p V^T$, and the columns of the N × M matrix Y are the noise-reduced signal (see Figure 5.7).

8. In practice, choosing the value of p depends on the nature of the data set, but it is often taken to be the knee in the eigenspectrum, or the value where $\sum_{i=1}^{p} s_i > \alpha \sum_{i=1}^{M} s_i$ with α some fraction ≈ 0.95.

SVD is a commonly employed technique to compress and/or filter the ECG. In particular, if we align M heartbeats, each N samples long, in a matrix (of size N × M), we can compress it down (into an N × p matrix) using only the first p << M PCs. If we then reconstruct the set of heartbeats by inverting the reduced-rank matrix, we effectively filter the original ECG. Figure 5.7(a) illustrates a set of 20 heartbeat waveforms which have been cut into 1-second segments (with a sampling frequency Fs = 256 Hz), aligned by their R peaks, and placed side by side to form a 256 × 20 matrix. The data set is therefore 20-dimensional, and an SVD will lead to 20 eigenvectors.
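As a minimal Matlab sketch of (5.8) and the truncated reconstruction, assuming a beat matrix X of size N × M (N samples per beat, M aligned beats):

```matlab
% Truncated SVD filtering of an R-peak-aligned beat matrix X (N x M).
[U, S, V] = svd(X);                        % X = U*S*V'
p = 5;                                     % number of PCs retained (see footnote 8)
Y = U(:, 1:p) * S(1:p, 1:p) * V(:, 1:p)';  % rank-p, noise-reduced reconstruction
```

Retaining all the singular values (p = min(N, M)) recovers X exactly, which is the lossless case discussed below.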
Figure 5.7 SVD of 20 R-peak-aligned P-QRS-T complexes: (a) in the original form with in-band Gaussian pink noise (SNR = 14 dB), (b) the eigenspectrum of the decomposition (with the knee indicated by an arrow), (c) reconstruction using only the first principal component, and (d) reconstruction using only the first five principal components.

Figure 5.7(b) is the eigenspectrum obtained from the SVD.⁹ Note that the signal/noise boundary is generally taken to be the knee of the eigenspectrum, which is indicated by an arrow in Figure 5.7(b). Since the eigenvalues are related to the power, most of the power is contained in the first five eigenvectors (in this example). Figure 5.7(c) is a plot of the reconstruction (filtering) of the data set using just the first eigenvector. Figure 5.7(d) is the same as Figure 5.7(c), but with the first five eigenvectors used to reconstruct the data set.¹⁰ The data set in Figure 5.7(d) is therefore noisier than that in Figure 5.7(c), but cleaner than that in Figure 5.7(a). Note that although Figure 5.7(c) appears to be extremely clean, this is at the cost of removing some beat-to-beat morphological changes, since only one PC was used.

Note that S derived from a full SVD is an invertible matrix, and no information is lost if we retain all the PCs; in other words, we recover the original data by performing the multiplication $U S V^T$. However, if we perform a truncated SVD, then the inverse of S does not exist. The transformation that performs the filtering is noninvertible, and information is lost because S is singular.

From a data compression point of view, SVD is an excellent tool. If the eigenspace is known (or previously determined from experiments), then the M dimensions of the data can in general be encoded in only p dimensions. So for N sample points in each signal, an N × M matrix is reduced to an N × p matrix. In the above example, retaining only the first principal component achieves a compression ratio of 20:1. Note that the data set is encoded in the U matrix, so we are interested only in its first p columns. The eigenvalues and eigenvectors are encoded in the S and V matrices, and thus an additional p scalar values are required to encode the relative energies in each column (or signal source) in U. Furthermore, if we wish to encode the eigenspace onto which the data set in U is projected, we require an additional p² scalar values (the elements of V). Therefore, SVD compression only becomes of significant value when a large number of beats are analyzed. It should be noted that the eigenvectors will change over time, since they are based upon the morphology of the beats. Morphology changes both subtly, with heart rate-related cardiac conduction velocity changes, and with conduction path abnormalities that produce abnormal beats. Furthermore, the basis functions are lead dependent, unless a multidimensional basis function set is derived and the leads are mapped onto this set. In order to find the global eigenspace for all beats, we need to take a large, representative set of heartbeats¹¹ and perform SVD upon this training set [16, 17].

9. In Matlab: [U,S,V] = svd(data); stem(diag(S).^2).
10. In Matlab: [U,S,V] = svds(data,5); waterfall(U*S*V').
11. That is, N >> 20.
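Footnote 8's energy-fraction rule for choosing p can be written directly in Matlab, continuing from footnote 9's [U,S,V] = svd(data); the value of alpha and the variable names are illustrative.

```matlab
% Choose p as the smallest number of singular values capturing a fraction
% alpha of the total (footnote 8), then keep the compressed representation.
s     = diag(S);                              % singular values, descending order
alpha = 0.95;                                 % assumed energy fraction
p     = find(cumsum(s) / sum(s) > alpha, 1);  % smallest p exceeding the threshold
Up    = U(:, 1:p);                            % N x p encoded data
sp    = s(1:p);                               % p scalars: relative energies
Vp    = V(:, 1:p);                            % eigenspace, if it must be stored too
```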
Projecting each new beat onto these globally derived basis vectors leads to a filtering of the signal that is essentially equivalent to passing the P-QRS-T complex through a set of trained weights of a multilayer perceptron (MLP) neural network (see [18] and the following section). Abnormal beats or artifacts erroneously detected as normal beats will have abnormal eigenvalues (or a highly irregular structure when reconstructed by the MLP). In this way, beat classification can be performed. However, in order to retain all the subtleties of the QRS complex, at least p = 5 eigenvalues and eigenvectors are required (and another five for the rest of the beat). At a sampling frequency of $F_s$ Hz and an average beat-to-beat interval of $RR_{av}$ (or heart rate of $60/RR_{av}$), the compression ratio is $F_s \cdot RR_{av} \cdot \left(\frac{N-p}{p}\right) : 1$, where N is the number of samples in each segmented heartbeat. Other studies have used between 10 [19] and 16 [18] free parameters (neurons) to encode (or model) each beat, but these methods necessarily model some noise also.

In Chapter 9 we will see how we can derive a global set of principal eigenvectors V (or KL basis functions) onto which we can project each beat. The strength of the projection along each eigenvector¹² allows us to classify the beat type. In the next section, we will look at an online adaptive implementation of this technique for patient-specific learning, using the framework of artificial neural networks.

5.4.2 Neural Network Filtering

PCA can be reformulated as a neural network problem, and, in fact, an MLP with linear activation functions can be shown to perform singular value decomposition [18, 20]. Consider an auto-associative multilayered perceptron (AAMLP) neural network, which has as many output nodes as input nodes, as illustrated in Figure 5.8. The AAMLP can be trained using an objective cost function measured between the inputs and outputs; the target data vector is simply the input data vector. Therefore, no labeling of training data is required. An auto-associative neural network performs dimensionality reduction from D to p dimensions (D > p) and then projects back up to D dimensions. PCA, a standard linear dimensionality reduction procedure, is also a form of unsupervised learning [20]. In fact, the number of hidden-layer nodes, dim($y_j$), is usually chosen to be the same as the number of PCs, p, in the data set (see Section 5.4.1), since (as we shall see later) the first layer of weights performs PCA if trained with a linear activation function. The full derivation of PCA shows that PCA is based on minimizing a sum-of-squares error cost function, as is the case for the AAMLP [20].

Figure 5.8 Layout of a D-p-D auto-associative neural network.

The input data used to train the network is now defined as $y_i$ for consistency of notation. The $y_i$ are fed into the network and propagated through to give an output $y_k$ given by

$y_k = f_a\left(\sum_j w_{jk} \, f_a\left(\sum_i w_{ij} y_i\right)\right)$    (5.9)

where $f_a$ is the activation function,¹³ $a_j = \sum_{i=0}^{N} w_{ij} y_i$, and D = N is the number of input nodes. Note that the x's from the previous section are now the $y_i$, our sources are the $y_j$, and our filtered data (after training) are the $y_k$. During training, the target data vector or desired output, $t_k$, which is associated with the training data vector, is compared to the actual output $y_k$.

12. Derived from a database of test signals.
13. Often taken to be a sigmoid, $f_a(a) = \frac{1}{1+e^{-a}}$, a tanh, or a softmax function.
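In matrix form, the forward pass of (5.9) is two small multiplications. This sketch assumes hypothetical weight matrices W1 (p × D) and W2 (D × p) and a single beat y of length D.

```matlab
% Forward pass of the D-p-D auto-associative MLP in (5.9).
fa = @(a) a;            % linear activation; replace with @(a) 1./(1+exp(-a)) etc.
yj = fa(W1 * y);        % hidden-layer values: the p projected "sources"
yk = fa(W2 * yj);       % network output: the filtered beat
```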
The weights, $w_{jk}$ and $w_{ij}$, are then adjusted in order to minimize the difference between the propagated output and the target value. This error is defined over all M training patterns in the training set as

$\xi = \frac{1}{2} \sum_{n=1}^{M} \sum_{k} \left[ f_a\left(\sum_j w_{jk} \, f_a\left(\sum_i w_{ij} y_i^n\right)\right) - t_k^n \right]^2$    (5.10)

where p is the number of hidden units and ξ is the error to be backpropagated at each learning cycle. Note that the $y_j$ are the values of the data set after projection onto the p-dimensional (p < N, D) hidden layer (the PCs). This is the point at which the dimensionality reduction (and hence the filtering) really occurs, since the input dimensionality equals the output dimensionality (N = D).

The squared error, ξ, can be minimized using the method of gradient descent [20]. This requires the gradient to be calculated with respect to each weight, $w_{ij}$ and $w_{jk}$. The weight update equations for the hidden and output layers are given as follows:

$w_{jk}^{(\tau+1)} = w_{jk}^{(\tau)} - \eta \frac{\partial \xi}{\partial w_{jk}}$    (5.11)

$w_{ij}^{(\tau+1)} = w_{ij}^{(\tau)} - \eta \frac{\partial \xi}{\partial w_{ij}}$    (5.12)

where τ represents the iteration step and η is a small (<< 1) learning term. In general, the weights are updated until ξ reaches some minimum. Training is an iterative process [repeated application of (5.11) and (5.12)],¹⁴ but, if continued for too long, the network starts to fit the noise in the training set, and that will have a negative effect on the performance of the trained network on test data. The decision on when to stop training is of vital importance, but is often defined as the point when the error function (or its gradient) drops below some predefined level. The use of an independent validation set is often the best way to decide when to terminate training (see Bishop [20, p. 262] for more details). However, in the case of an auto-associative network, no validation set is required, and the training can be terminated when the ratio of the variance of the input and output data reaches a plateau. (See [21, 22].)

If $f_a$ is set to be linear, $y_k = a_k$ and $\frac{\partial y_k}{\partial a_k} = 1$, then the expression for $\delta_k$ reduces to

$\delta_k = \frac{\partial \xi}{\partial a_k} = \frac{\partial \xi}{\partial y_k} \cdot \frac{\partial y_k}{\partial a_k} = (y_k - t_k)$    (5.13)

If the hidden layer also contains linear units, further changes must be made to the weight update equations:

$\delta_j = \frac{\partial \xi}{\partial a_j} = \sum_k \frac{\partial \xi}{\partial a_k} \cdot \frac{\partial a_k}{\partial y_j} \cdot \frac{\partial y_j}{\partial a_j} = \sum_k \delta_k w_{jk}$    (5.14)

If $f_a$ is linearized (set to unity), this expression is differentiated with respect to $w_{ij}$, and the derivative is set to zero, then the usual equations for least-squares optimization can be given in the form

$\sum_{m=1}^{M} \left( \sum_{i'=0}^{D} y_{i'}^m w_{i'j} - t_j^m \right) y_i^m = 0$    (5.15)

which is written in matrix notation as

$(Y^T Y) W^T = Y^T T$    (5.16)

Y has dimensions M × D with elements $y_i^m$, where M is the number of training patterns and D is the number of input nodes to the network (the length of each ECG complex in our examples). W has dimensions p × D and elements $w_{ij}$, and T has dimensions M × p and elements $t_j^m$.

14. Note that a momentum term can be inserted into (5.11) and (5.12) to premultiply the weights and increase the speed of convergence of the network.
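A minimal gradient-descent training loop for the linear D-p-D network, following (5.10) through (5.14), might look as below. This is a sketch under stated assumptions: Y is the M × D matrix of training beats (targets T = Y for an auto-associative network), and eta, p, and nEpochs are hypothetical settings rather than values from the text.

```matlab
% Stochastic gradient-descent training of a linear auto-associative MLP.
[M, D] = size(Y);
p  = 5;  eta = 1e-3;  nEpochs = 500;   % assumed hyperparameters
W1 = 0.01 * randn(p, D);               % input-to-hidden weights (w_ij)
W2 = 0.01 * randn(D, p);               % hidden-to-output weights (w_jk)
for epoch = 1:nEpochs
    for m = 1:M
        y  = Y(m, :)';                 % one training pattern (D x 1)
        h  = W1 * y;                   % linear hidden layer, (5.9) with fa(a) = a
        yk = W2 * h;                   % network output
        dk = yk - y;                   % delta_k = (y_k - t_k), as in (5.13)
        dj = W2' * dk;                 % delta_j = sum_k delta_k w_jk, as in (5.14)
        W2 = W2 - eta * dk * h';       % output-layer update, (5.11)
        W1 = W1 - eta * dj * y';       % hidden-layer update, (5.12)
    end
end
```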
The matrix $(Y^T Y)$ is a square matrix which may be inverted to obtain the solution

$W^T = Y^{\dagger} T$    (5.17)

where $Y^{\dagger}$ is the pseudo-inverse of Y, given by

$Y^{\dagger} = (Y^T Y)^{-1} Y^T$    (5.18)

Note that in practice $(Y^T Y)$ usually turns out to be near-singular, and SVD is used to avoid problems caused by the accumulation of numerical roundoff errors.

Consider M training patterns, each i = N samples long, presented to the auto-associative MLP with i input and k output nodes (i = k) and j ≤ i hidden nodes. For the mth (m = 1, ..., M) input vector $x_i$ of the i × M (M ≥ i) real input matrix X, formed by the M (i-dimensional) training vectors, the hidden unit output values are

$h_j = f_a(W_1 x_i + w_{1b})$    (5.19)

where $W_1$ is the input-to-hidden layer i × j weight matrix, $w_{1b}$ is a rank-j vector of biases, and $f_a$ is an activation function. The output of the auto-associative MLP can then be written as

$y_k = W_2 h_j + w_{2b}$    (5.20)

where $W_2$ is the hidden-to-output layer j × k weight matrix and $w_{2b}$ is a rank-k vector of biases. Now consider the singular value decomposition of X, such that $X_i = U_i S_i V_i^T$, where U is an i × i column-orthogonal matrix, S is an i × N diagonal matrix with positive or zero elements (the singular values), and $V^T$ is the transpose of an N × N orthogonal matrix [15]. The best rank-j approximation of X is $W_2 h_j = U_j S_j V_j^T$ [23], where

$h_j = F S_j V_j^T$    (5.21)

$W_2 = U_j F^{-1}$    (5.22)

with F being an arbitrary nonsingular j × j scaling matrix. $U_j$ has i × j elements, $S_j$ has j × j elements, and $V^T$ has j × M elements. It can be shown that [24]

$W_1 = a_1^{-1} F U_j^T$    (5.23)

where $W_1$ are the input-to-hidden layer weights and a is derived from a power series expansion of the activation function, $f_a(x) \approx a_0 + a_1 x$ for small x. For a linear activation function, as in this application, $a_0 = 0$ and $a_1 = 1$. The bias weights given in [24] reduce to

$w_{1b} = -a_1^{-1} F U_j^T \mu_X = -U_j^T \mu_X, \qquad w_{2b} = \mu_X - a_0 U_j F^{-1} = \mu_X$    (5.24)
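For completeness, the least-squares solution of (5.16) through (5.18) is a one-liner in Matlab, with Y and T as defined above:

```matlab
% Least-squares weight solution (5.17): W' = pinv(Y) * T. Matlab's pinv is
% SVD-based, which sidesteps the near-singularity of (Y'*Y) noted in the text.
Wt = pinv(Y) * T;
% The explicit normal-equation form of (5.18) is equivalent but numerically
% fragile when (Y'*Y) is close to singular:
% Wt = (Y' * Y) \ (Y' * T);
```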
[...]
