Fetal ECG Extraction Using Independent Component Analysis

German Borda
Department of Electrical Engineering, George Mason University, Fairfax, VA, 23

Abstract: An electrocardiogram (ECG) signal records the electrical activity generated by the contraction of heart muscles, and it can provide a great deal of information about an individual's heart condition and health. The principles of ECG analysis can also be applied to the electrocardiogram of a fetus (FECG) in order to monitor heart development and health during the different stages of pregnancy and even during labor. The biggest constraint, however, is that accurate and non-invasive methods for detecting FECG signals have not been developed and standardized. Therefore, in this paper we propose the use of Independent Component Analysis (ICA) as a method to perform blind source separation and retrieve the fetal ECG signal from the maternal ECG.

Keywords: Electrocardiogram, Independent Component Analysis, Blind Source Separation.

Introduction

An electrocardiogram (ECG) is a periodic signal that records the electric potentials that heart tissue generates in order to produce the contractions needed to supply oxygenated blood to the peripheral circulatory system of the human body [1]. With all the improvements in electrocardiography technology and electrical engineering over the years, it is very easy to obtain a clean ECG signal from a patient without the presence of pain and/or additional electric current [2]. A filtered ECG signal (Figure 1) from a healthy patient contains important pieces of information about the function of the heart. The P-wave represents atrial contraction, the QRS complex gives information about ventricular contraction, the T-wave represents ventricular relaxation, and the interval between R peaks is the time separation between successive heartbeats (ventricular contractions) [6].

Figure 1: One period of a typical clean ECG signal (Adapted from [6]).

Since important information about a patient's heart condition and function can be obtained by looking at specific characteristics of the waveform, the ECG has become the gold standard in diagnostic cardiology. In fact, ECG analysis can readily identify a patient's heart rate, irregular palpitations, heart hypertrophy, arrhythmias, and heart failure, among other conditions [2,3]. That being said, principles of ECG analysis can
also be used to evaluate the heart performance and development of a fetus in the womb during the different stages of pregnancy, as well as during delivery, in order to assess the health of the fetus and potentially detect heart diseases. From a clinical standpoint, an early diagnosis is essential in order to choose appropriate treatment and prevent further damage and complications [8].

Even though methods of ECG wave analysis have been standardized and optimized over the years, accurate and non-invasive fetal ECG extraction is still the biggest roadblock in this area. The most suitable method to extract a fetal electrocardiogram (FECG) is to take measurements from the mother's abdomen. However, this signal is heavily mixed with the maternal ECG (MECG), electric potentials from surrounding abdominal muscles, and 50 or 60 Hz noise from nearby power lines [5,6]. Therefore, we propose the use of Independent Component Analysis (ICA) for accurate FECG filtering and separation.

ICA has proven to be a very powerful machine learning algorithm for signal separation. Essentially, this method tries to find the weighted linear combination that best represents the multivariate (mixed) data [7]. Given that we observe several different mixed signals, we can formulate the following equation:

$x_j = \sum_{n} w_{jn} s_n$ (1)

where $x_j$ represents each observed mixed signal, $s_n$ represents each independent component, and $w_{jn}$ is the respective weight. When performing blind source separation via ICA, the only quantity we know is $x_j$. Therefore, the algorithm tries to estimate the values of $s_n$ and $w_{jn}$ using only the multivariate signal [7]. That being said, we can turn the summation in equation (1) into a more practical form:

$x = Ws$ (2)

where $x$ is a vector containing every mixed value, $s$ is a vector containing each independent component, and $W$, also called the mixing matrix, contains the value of each weight [7]. After performing ICA to estimate the mixing matrix $W$, one can calculate the independent components by taking the inverse of $W$ [7]:

$s = W^{-1}x$ (3)

In order to use ICA successfully, a few important guidelines and restrictions should be noted:
1) each signal $x_j$ and $s_n$ is treated as a random variable rather than an actual time signal;
2) we assume the $s_n$ are independent of each other;
3) we must make sure that the input mixed signals $x_j$ do not possess Gaussian distributions [7].

One of the reasons we think ICA is suitable for this problem is the well-known model of the electric activity of the heart created by Burger and Van Milaan, the lead concept model. Essentially, this concept states that each signal recorded from a particular electrode is a linear combination of the bio-electrical signals running through the human body [8].

Methods and Data

1. Data: For our experiments and analysis we used the DaISy database [4], which provides real data collected from a pregnant woman. The recordings contain eight channels. Five of these signals (Figure 2) are abdominal signals (AECG) recorded from different places of the patient's
abdomen at the same time; as previously mentioned, they contain both the maternal ECG (MECG) and fetal ECG (FECG) signals. The last three signals (Figure 3) were measured from the mother's thorax and consequently contain only MECG information. The data was sampled at 250 Hz over a 10-second time interval. There is no further information regarding the placement of the electrodes or the gain amplification used.

Out of concern that the five measured AECG signals might not be sufficient to carry out the analysis successfully, we decided to model three more AECG signals using the three measured thoracic ECG waveforms. According to Kam et al. [5] and Ananthanag et al. [9], we can generate simulated FECG signals by first taking each measured MECG signal, reducing its amplitude by 1 and increasing the number of periods by 2. Afterwards, the AECG can be constructed by passing the corresponding MECG vector through an ARMA filter and adding the output to the simulated FECG along with modeled noise. A graphical representation of the simulation system can be found below.

Figure 4: Simulation system (Adapted from [5]).

Where the transfer function H(z) equals to:

Figure 2: Abdominal signals 1 to 5, measured from different places on the mother's abdomen (amplitude in mV versus time in s).

Figure 3: Thoracic signals 1 to 3, measured from different places on the mother's chest (amplitude in mV versus time in s).
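The simulation procedure described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the exact procedure of the cited works: the function names, the ARMA coefficients `b` and `a`, the amplitude and period factors, and the white Gaussian noise model are all hypothetical placeholders, because the transfer function H(z) and the noise model are not recoverable from the text.

```python
import numpy as np

def simulate_fecg(mecg, amp_scale, period_factor):
    """Build a surrogate FECG from a measured MECG (illustrative only)."""
    # Keep every period_factor-th sample to shorten each beat, then repeat
    # the shortened trace so the record keeps its original length.
    compressed = mecg[::period_factor]
    repeated = np.tile(compressed, period_factor)[: len(mecg)]
    # Attenuate the amplitude by the (hypothetical) factor amp_scale.
    return repeated / amp_scale

def arma_filter(x, b, a):
    """Apply y[n] = (sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k]) / a[0]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(b)):          # moving-average part
            if n - k >= 0:
                acc += b[k] * x[n - k]
        for k in range(1, len(a)):       # autoregressive part
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc / a[0]
    return y

def simulate_aecg(mecg, b, a, amp_scale, period_factor, noise_std, seed=0):
    """Simulated AECG = ARMA-filtered MECG + surrogate FECG + noise."""
    rng = np.random.default_rng(seed)
    fecg = simulate_fecg(mecg, amp_scale, period_factor)
    # Assumption: additive white Gaussian noise with standard deviation noise_std.
    noise = noise_std * rng.standard_normal(len(mecg))
    return arma_filter(mecg, b, a) + fecg + noise
```

With a thoracic channel stored in a NumPy array `mecg`, a call such as `simulate_aecg(mecg, b=[1.0], a=[1.0, -0.5], amp_scale=2.0, period_factor=2, noise_std=0.1)` would return one simulated abdominal channel; every numeric value in that call is illustrative.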
2. Methods: To obtain the best possible results, we apply a series of time-domain preprocessing steps to the data and then pass the processed signals through the FastICA algorithm [7]. The processing chain is: centering the data, whitening the data, band-stop filtering, and FastICA.

2.1 Centering the data: The purpose of centering the data is to simplify the computation performed by the ICA algorithm. This step is achieved by taking the expected value of the data (that is, calculating the mean) and subtracting it from each element of the data vector [7].

2.2 Whitening the data: The purpose of this step is to linearly transform the data vector x so that its components are uncorrelated and have variance equal to one. A widely used method for whitening data is eigenvalue decomposition (EVD). One way to check whether the whitening was successful is to verify that the autocorrelation matrix of the whitened data equals the identity matrix [7]:

$E\{xx^T\} = I$ (4)

2.3 Band-stop filtering: We created a Parks-McClellan band-stop filter in order to filter out 50 Hz power line noise caused by poorly grounded recording equipment [6,7].

[Figure: Magnitude response of the FIR filter (magnitude versus frequency in Hz).]

2.4 FastICA algorithm: This algorithm updates the weights in the direction of maximum non-Gaussianity. It does not require a step size and, like many neural algorithms, it is parallel, distributed, easy to use, and does not require a lot of memory [7]. The algorithm uses an approximation of negentropy as a measure of non-Gaussianity:

$J(y) \propto [E\{G(y)\} - E\{G(v)\}]^2$ (5)

where the function $G$ is non-quadratic and $v$ is a Gaussian variable of zero mean and unit variance [7]. The basic idea of training this particular algorithm, as proposed in [7], is the following:
1. Randomly initialize the weight vector $w(i)$.
2. $w(i+1) = E\{x\,g(w(i)^T x)\} - E\{g'(w(i)^T x)\}\,w(i)$, where the function $g$ is the derivative of $G$ from equation (5).
3. $w(i+1) = w(i+1) / \|w(i+1)\|$.
4. If convergence is not achieved, go back to step 2.
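The centering, whitening, and FastICA steps above can be sketched in Python with NumPy. This is a hedged illustration rather than the implementation used for the results: the choice g = tanh (that is, G = log cosh) and the Gram-Schmidt deflation used to extract several components one after another are assumptions, since the text does not state which nonlinearity or multi-component scheme was used. All function names are hypothetical, and the data is assumed to be arranged as channels by samples.

```python
import numpy as np

def center(x):
    """Subtract the mean of each channel (rows are channels)."""
    return x - x.mean(axis=1, keepdims=True)

def whiten(x):
    """Whiten centered data via eigenvalue decomposition of its covariance."""
    cov = x @ x.T / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # V = E D^(-1/2) E^T maps the data to unit variance, uncorrelated channels.
    v = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return v @ x

def fastica(z, n_components, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA with g = tanh, repeated with deflation.

    z must already be centered and whitened (channels x samples).
    Returns the estimated components and the unmixing rows.
    """
    rng = np.random.default_rng(seed)
    n_channels = z.shape[0]
    w_all = np.zeros((n_components, n_channels))
    for p in range(n_components):
        # Step 1: random initial weight vector of unit norm.
        w = rng.standard_normal(n_channels)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            y = w @ z
            g = np.tanh(y)
            g_prime = 1.0 - g ** 2
            # Step 2: w+ = E{x g(w'x)} - E{g'(w'x)} w
            w_new = (z * g).mean(axis=1) - g_prime.mean() * w
            # Deflation (assumption): remove directions already found.
            w_new -= w_all[:p].T @ (w_all[:p] @ w_new)
            # Step 3: normalize to unit length.
            w_new /= np.linalg.norm(w_new)
            # Step 4: converged when the direction stops changing (up to sign).
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        w_all[p] = w
    return w_all @ z, w_all
```

For an array `x` of mixed signals, `z = whiten(center(x))` followed by `components, w = fastica(z, n_components=x.shape[0])` would request as many components as there are input channels; equation (4) can be checked with `np.allclose(z @ z.T / z.shape[1], np.eye(z.shape[0]))`.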
Experiments and Results

We first verified whether each of our input signals, consisting of five measured and three simulated AECGs, possessed a non-Gaussian distribution. We plotted histograms of each mixed signal in order to get a visual feel for the distributions. Looking at Figure 5, it is not very clear whether the distributions are Gaussian or not. Therefore, we used the chi-square goodness-of-fit test to obtain a metric that tells us whether each distribution is indeed non-Gaussian. We used the Matlab function chi2gof() [10] with each of our AECG signals as input. In all eight cases the null hypothesis was rejected, meaning that our eight input signals do not possess Gaussian distributions, thus allowing us to perform ICA.
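The idea behind the normality check can be illustrated in Python. The sketch below is not a port of Matlab's chi2gof(): it is a simplified chi-square goodness-of-fit test that assumes ten equal-probability bins under a normal distribution fitted to the data, and it hard-codes the tabulated 5% critical value for 7 degrees of freedom (10 bins minus 1, minus 2 estimated parameters). The function and constant names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

# Upper 5% critical value of the chi-square distribution with 7 degrees of
# freedom (10 bins - 1 - 2 estimated parameters), from standard tables.
CHI2_CRIT_DF7_5PCT = 14.067

def chi_square_normality(x, n_bins=10):
    """Chi-square statistic of x against a normal fitted to its mean and std."""
    x = np.asarray(x, dtype=float)
    fitted = NormalDist(mu=float(x.mean()), sigma=float(x.std(ddof=1)))
    # Inner bin edges chosen so that each bin has probability 1/n_bins
    # under the fitted normal distribution.
    inner_edges = [fitted.inv_cdf(k / n_bins) for k in range(1, n_bins)]
    bin_index = np.searchsorted(inner_edges, x)
    observed = np.bincount(bin_index, minlength=n_bins)
    expected = len(x) / n_bins
    return float(np.sum((observed - expected) ** 2 / expected))

def rejects_gaussian(x):
    """True if the null hypothesis 'x is Gaussian' is rejected at the 5% level."""
    return chi_square_normality(x) > CHI2_CRIT_DF7_5PCT
```

Calling `rejects_gaussian(channel)` on each AECG channel would mirror the check described above, with the caveat that the binning differs from Matlab's defaults, so the statistic values are not directly comparable.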
Figure 5: Distributions (histograms) of the eight AECG signals, abdominal signals 1 to 8.
After making sure we could use the data provided to us, we centered each AECG signal to have zero mean. Subsequently, we whitened each centered input signal via EVD. To verify that the data was whitened correctly, we calculated the autocorrelation matrix of the centered and whitened inputs. Using Matlab, we obtained an 8 × 8 matrix whose diagonal entries equal 4 and whose off-diagonal entries are approximately zero:

$E\{xx^T\} \approx 4I$

We can easily observe that the autocorrelation matrix above is an identity matrix with gain 4. Consequently, we can assume that EVD was successful in whitening the data.

Afterwards, we tested the band-stop filter we created to see whether it attenuates the harmonics from the 50 Hz power line noise. By plotting the magnitude response of each AECG (Figure 6), we can observe that the frequencies around 50 Hz have been attenuated.

Upon finishing the pre-processing stage, we fed all the processed AECG mixed signals to the FastICA algorithm at once. We set the algorithm to return the maximum number of independent components it could, which in our case is eight, since that is the number of inputs we entered into the system.

Figure 6: Magnitude response of each AECG signal, superimposed on each other (frequency in Hz; the 50 Hz noise is attenuated).

Figure 7: Independent component outputs 1 to 8 from the FastICA algorithm (amplitude in au versus time in s).
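The band-stop check described above can be illustrated with a small NumPy sketch. Note the swap in technique: this is a Hamming-windowed-sinc FIR design, not the Parks-McClellan design used in the paper, chosen only because it needs nothing beyond NumPy. The 250 Hz sampling rate, the 45 to 55 Hz stop band, the 201 taps, and the function names are all assumptions made for illustration.

```python
import numpy as np

def bandstop_fir(fs, f_lo, f_hi, numtaps=201):
    """Linear-phase band-stop FIR filter (windowed-sinc design, odd length)."""
    if numtaps % 2 == 0:
        raise ValueError("numtaps must be odd")
    centre = (numtaps - 1) // 2
    n = np.arange(numtaps) - centre

    def ideal_lowpass(fc):
        # Impulse response of an ideal low-pass filter with cutoff fc (Hz).
        return 2.0 * fc / fs * np.sinc(2.0 * fc / fs * n)

    # Band-pass = difference of two low-pass filters, tapered by a Hamming window.
    bandpass = (ideal_lowpass(f_hi) - ideal_lowpass(f_lo)) * np.hamming(numtaps)
    # Band-stop = (delayed) unit impulse minus the band-pass response.
    h = -bandpass
    h[centre] += 1.0
    return h

def gain_at(h, fs, freq):
    """Magnitude of the filter's frequency response at freq (Hz)."""
    k = np.arange(len(h))
    return float(np.abs(np.sum(h * np.exp(-2j * np.pi * freq * k / fs))))
```

A channel `x` could then be filtered with `np.convolve(x, bandstop_fir(250.0, 45.0, 55.0), mode="same")`, and `gain_at` evaluated at 50 Hz and at a lower frequency such as 10 Hz gives a quick numerical counterpart to inspecting the magnitude response plot.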
Plotting the output of the FastICA algorithm (Figure 7), we can observe that the system might have been able to retrieve the FECG signal successfully. More specifically, we can deduce that there is a good chance independent component 2 is the FECG signal, while components 1, 3, and 4 represent MECGs. Components 5 through 8 appear to consist mainly of different noise sources.

Even though we do not have a test that tells us that independent component (IC) 2 is indeed the FECG, we can look at certain characteristics of the signal. First of all, we can observe that IC2 is a periodic signal, just like the maternal ECG. Secondly, if we zoom into the signal (Figure 8), we can see that IC2 contains the P-wave segment, the QRS complex, and the T peak. Finally, comparing IC2 with the AECG input signals we used, we can clearly observe that IC2 has a higher frequency (more cycles during the 10-second recording) and a much more attenuated amplitude than each of the AECG signals (Figure 9). More specifically, IC2 contains 21 R peaks per 10 seconds, while the AECG signals contain about 13 R peaks in the same 10 seconds.

Figure 8: Zoomed-in version of IC2 showing 3 periods of the signal, with the P, Q, R, S, and T points labeled.

Figure 9: Comparison of the frequency and amplitude of the eight AECG signals with IC2, plotted with the same axis limits.
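The R-peak comparison above amounts to a heart-rate estimate, which can be sketched as follows. This is a minimal illustration, assuming a 10-second record sampled at 250 Hz; the simple threshold-plus-refractory-period peak counter, its parameter values, and the function names are hypothetical and are not the method used to obtain the counts reported above.

```python
import numpy as np

def count_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Count local maxima above a threshold, at least refractory_s apart."""
    threshold = threshold_ratio * np.max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if signal[i] >= threshold and is_local_max:
            # Ignore candidates that follow the previous peak too closely.
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return len(peaks)

def heart_rate_bpm(n_peaks, duration_s):
    """Convert a number of beats in duration_s seconds to beats per minute."""
    return 60.0 * n_peaks / duration_s
```

Under the 10-second assumption, 21 R peaks correspond to `heart_rate_bpm(21, 10.0)`, that is 126 beats per minute, while 13 R peaks correspond to 78 beats per minute, so the candidate component beats markedly faster than the maternal signal.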
Discussion

In summary, the use of ICA for blind source separation of multivariate signals yields promising results. However, there is a lot more work and research to be done. To begin with, the system presented in this paper should be tested with ECG data from different pregnant women at different stages of pregnancy, because the fetal ECG might not be so easily detectable during the first months of gestation. Furthermore, the algorithm presented here used eight AECG signals, of which five were measured and three simulated. It would be wise to test the algorithm with eight or more real (measured) AECGs from the same patient. Finally, the development of a metric or test that verifies that one of the independent components is indeed a FECG signal would further improve the credibility of the algorithm.

References

[1] http://www.patient.co.uk/health/electrocardiogram-ecg
[2] http://house.wikia.com/wiki/Electrocardiogram
[3] http://www.akwmedical.com/blog/whatekg-machine-and-how-does-it-work
[4] De Moor B., "Database for the Identification of Systems (DaISy)," 1997. [Online]. Available: http://homes.esat.kuleuven.be/~smc/daisy/
[5] A. Kam, A. Cohen, "Detection of Fetal ECG with IIR Adaptive Filtering and Genetic Algorithms," IEEE, 1999, pg. 2337.
[6] Bokde, Pramod R., and Nitin K. Choudhari, "Implementation of Adaptive Filtering Algorithms for Removal of Noise From ECG Signal," Int. J. Computer Technology & Applications, 6.1, pg. 51-56, 2015.
[7] A. Hyvarinen, E. Oja, "Independent Component Analysis: Algorithms and Applications," Neural Networks, 13(4-5), pg. 411-430, 2000.
[8] Zarzoso, Vicente, and Nandi, Asoke K., "Noninvasive Fetal Electrocardiogram Extraction: Blind Separation Versus Adaptive Noise Cancellation," IEEE Transactions on Biomedical Engineering, vol. 48 (1), pg. 12-18, 2001.
[9] Ananthanag K.V.K., Sahambi J.S., "Investigation of Blind Source Separation Methods for Extraction of Fetal ECG," pg. 2021-2024, 2003.
[10] http://www.mathworks.com/help/stats/chi2gof.html