MATLAB for time series analysis
e.g. M/EEG, ERP, ECG, EMG, fMRI or anything else that shows variation over time

Written by Joe Bathelt, MSc
PhD candidate
Developmental Cognitive Neuroscience Unit
UCL Institute of Child Health
Introduction

Cognitive neuroscience data often consist of time-varying signals. These might be changes in amplitude over time, like the BOLD signal, or changes in voltage or magnetic field strength for other imaging modalities. The analysis of these signals is very similar. We are often interested in the difference in peak or mean amplitude between conditions or participant groups. Fortunately, it is quite easy to extract these measures with MATLAB, which also allows the user to produce beautiful figures to visualise the data. The added benefit is that we can write scripts that automate a lot of the work and reduce the influence of user errors.

This tutorial will explain how to do some of the analysis in MATLAB itself. There are toolboxes, with and without graphical user interfaces, for many aspects of preprocessing and visualisation of data that are specific to each method. A list of toolboxes can be found in the appendix. Most of these toolboxes offer extensive documentation and often online tutorials or even video instructions. Therefore, this tutorial will mostly be concerned with the analysis and visualisation of data that is already preprocessed. We will learn how to plot data, extract measures and perform simple statistical tests in both the time and frequency domain.

This handout is structured in two parts: in the first section, you will find descriptions of the analyses with practice tasks. In the second section, you can find the solutions to all tasks. Please make use of the MATLAB documentation with the help function. Also refer to the function reference in the Introduction to MATLAB handout.

If you have any questions about the content of this tutorial, don't hesitate to contact me: johannes.bathelt.10@ucl.ac.uk
1 Analysis in the time domain

1.1 Plotting time series

For the analysis of time series, it is often a good idea to plot the signal first to get an impression of the data. Figures are also often needed for reports or other publications. MATLAB offers many tools that are useful for visualising time-varying signals.

Task 1.1.1: Load the sample data called Condition1_ERPs. It contains the ERPs for a single channel for all participants in the study. Each row represents the data from one participant and each column is the amplitude of the signal (in µV) at one sampling point. The sampling rate is 250 Hz. The time ranges from 200 ms before stimulus onset to 600 ms after stimulus onset.

- Plot the individual ERP of the first participant in the sample. Adjust the x-axis so that it displays time in ms.
- Plot the average ERP of the whole sample.
- Subtract the mean activity in the first 200 ms (baseline) from each individual ERP before averaging (baseline correction).

Fig. 1: ERP of Participant 1
Fig. 2: Grand-average ERP
Fig. 3: Baseline-corrected ERP

Task 1.1.2: Open the file Condition2_ERPs. This file contains a matrix with individual ERPs from the same participants and the same channel, but for a different condition.

Fig. 4: Both conditions
Fig. 5: Standard error
Fig. 6: ERP with SE
- Plot the average ERP of condition 1 and condition 2 in the same figure. Also, try to add a vertical line that marks the stimulus onset at 0 ms.
- This figure shows the difference between the conditions, but we would also like to see the variation of the ERPs to get an idea of whether the differences between the grand-mean waveforms are significant. Calculate the standard error of the grand-mean ERP for both conditions. Hint: The standard error is the standard deviation divided by the square root of the number of observations (SE = std/sqrt(n)).
- Plot both signals with their standard error (positive and negative) in the same figure. Bonus: Use the function jbfill to make the plot look nicer.
- For publication, we often apply filters to make the ERP look smoother in the figures. Filtering is also used during preprocessing to reduce high-frequency noise or slow baseline shifts. Apply a 6th-order low-pass Butterworth filter with a 15 Hz cut-off to the grand-mean ERP using the function butter and plot the result.

Fig. 7: Filtered ERP

1.2 Time domain measures

We often derive measures to characterise a waveform and perform statistical tests to compare them. The most commonly used measures are maxima and minima, area measures and mean amplitudes. Often we only want to find these within a specific time window. In the following section, we will use MATLAB to calculate these measures in the example data.

Task 1.2.1: The figures that we obtained in the last exercises suggest that the waveform has at least two clearly identifiable peaks. Given the variability of the waveform, the conditions might be characterised by different peak amplitudes.

- Obtain the peak amplitude between 80 and 120 ms (P1 window) and the peak amplitude between 130 and 200 ms (N170 window) after stimulus onset for both conditions for all participants. What is the mean peak amplitude across participants in each condition? What is the standard error? NB: Use the baseline-corrected data.
- Perform a two-sample t-test to see if the difference in peak amplitude between the conditions is significant.
- Obtain the latencies of the peak in both conditions for both components.
- Bonus: If you are super-eager or bored, create a bar graph with error bars to visualise the data.

Task 1.2.2: Peak measures are often misleading, because the amplitude and latency of the peak can be influenced by local maxima that are independent of the underlying brain activity. Therefore, area measures are often used to characterise the activity within a given time window.

- Calculate the mean ERP amplitude between 80 and 120 ms and between 130 and 200 ms.
- Calculate the area under the curve between 80 and 120 ms and between 130 and 200 ms. Hint: a good approximation of the area under the curve is the amplitude multiplied by the number of time points, i.e. there is no need to calculate the mathematical integral.

2 Frequency domain analysis

In addition to information in the time domain, we are often interested in the frequency content of a signal. We use the Fourier transform to calculate the power at each frequency within a signal. MATLAB has a function to calculate the Fast Fourier Transform of a signal (fft), but it can be quite difficult to use. Therefore, we use functions of the EEGLAB toolbox. Note, however, that the same functions can be applied to any type of time series signal, whether EEG or not.

Task 2.1: Use the function spectopo to calculate the power spectrum (frequency vs. power) of the ERPs in both conditions. Also, produce a plot of the mean power spectrum in each condition. Note that the signal was low-pass filtered during preprocessing with a cut-off at 30 Hz. Plot the spectrum between 0 and 30 Hz.

Fig. 8: Power spectrum

Task 2.2: Extract the power between 8 and 12 Hz (adult alpha range).

Fig. 9: Zoomed spectrum
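For readers who want to see what happens behind a power spectrum function, the following is a minimal sketch of a one-sided power spectrum computed directly with fft. It is not part of the tasks and does not use EEGLAB. The test signal (a 10 Hz sine wave), the two-second duration and all variable names are assumptions chosen for illustration; only the 250 Hz sampling rate is taken from the sample data.

```matlab
% Sketch: one-sided power spectrum of a synthetic signal using fft.
srate = 250;                        % sampling rate in Hz (as in the sample data)
n = 500;                            % number of samples (2 s of data, assumed)
t = (0:n-1)/srate;                  % time axis in seconds
signal = sin(2*pi*10*t);            % hypothetical test signal: 10 Hz sine wave

spectrum = fft(signal);             % complex Fourier coefficients
power = abs(spectrum/n).^2;         % squared magnitude, normalised by length
power = power(1:n/2+1);             % keep the non-negative frequencies only
power(2:end-1) = 2*power(2:end-1);  % fold in the matching negative frequencies
freqs = srate*(0:n/2)/n;            % frequency axis in Hz (0.5 Hz resolution)

[~,peak_index] = max(power);        % position of the largest spectral peak
peak_freq = freqs(peak_index);      % frequency of that peak in Hz

plot(freqs,power)
xlim([0 30])
xlabel('frequency [Hz]')
ylabel('power')
```

The frequency resolution is the sampling rate divided by the number of samples, so longer segments give finer frequency steps. Functions such as spectopo add windowing and averaging on top of this basic calculation, which is one reason their output can differ from a plain fft.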
Solutions

Task 1.1.1:

    srate = 250; % sampling rate in Hz
    % time axis in ms, one value per sampling point (columns of the data matrix)
    time = linspace(-200,1000*(size(condition1_erps,2)/srate)-200,size(condition1_erps,2));

    % ERP of the first participant
    plot(time,condition1_erps(1,:))
    xlabel('time [ms]')
    ylabel('ERP amplitude [\muV]')

    % grand-average ERP (mean over participants)
    figure
    plot(time,mean(condition1_erps))

    % baseline correction: subtract each participant's mean over the first 200 ms
    baseline_means = mean(condition1_erps(:,1:round(0.2*srate)),2);
    baseline_means = repmat(baseline_means,1,size(condition1_erps,2));
    condition1_erps_blcorrected = condition1_erps - baseline_means;
    figure
    plot(time,mean(condition1_erps_blcorrected))

Task 1.1.2:

    % baseline-correct condition 2 in the same way as condition 1
    baseline_means = mean(condition2_erps(:,1:round(0.2*srate)),2);
    baseline_means = repmat(baseline_means,1,size(condition2_erps,2));
    condition2_erps_blcorrected = condition2_erps - baseline_means;

    % both conditions in one figure with a dashed line at stimulus onset
    figure
    plot(time,mean(condition1_erps_blcorrected),'b',time,mean(condition2_erps_blcorrected),'r')
    legend('condition 1','condition 2')
    xlabel('time [ms]')
    ylabel('ERP amplitude [\muV]')
    ylim([-6 6])
    yl = get(gca,'ylim');
    h = line([0 0],yl,'Color','k');
    set(h,'linestyle','--')

    % standard error at each time point; n is the number of participants (rows)
    condition1_se = std(condition1_erps_blcorrected)/sqrt(size(condition1_erps_blcorrected,1));
    condition2_se = std(condition2_erps_blcorrected)/sqrt(size(condition2_erps_blcorrected,1));

    % grand means with dashed lines at +/- 1 SE
    figure
    plot(time,mean(condition1_erps_blcorrected),'b',time,mean(condition2_erps_blcorrected),'r')
    hold on
    plot(time,mean(condition1_erps_blcorrected)+condition1_se,'--b',time,mean(condition1_erps_blcorrected)-condition1_se,'--b')
    plot(time,mean(condition2_erps_blcorrected)+condition2_se,'--r',time,mean(condition2_erps_blcorrected)-condition2_se,'--r')

    % bonus: shaded SE bands with jbfill
    figure
    plot(time,mean(condition1_erps_blcorrected),'b',time,mean(condition2_erps_blcorrected),'r')
    hold on
    jbfill(time,mean(condition1_erps_blcorrected)+condition1_se,mean(condition1_erps_blcorrected)-condition1_se,'b','b',0.5);
    jbfill(time,mean(condition2_erps_blcorrected)+condition2_se,mean(condition2_erps_blcorrected)-condition2_se,'r','r',0.5);

    % filtered grand mean
    condition1_grand_mean = mean(condition1_erps);
    condition2_grand_mean = mean(condition2_erps);
    % 6th-order low-pass Butterworth filter; the 15 Hz cut-off is expressed
    % as a fraction of the Nyquist frequency (srate/2)
    [b,a] = butter(6,2*15/srate,'low');
    condition1_grand_mean_filtered = filter(b,a,condition1_grand_mean);
    condition2_grand_mean_filtered = filter(b,a,condition2_grand_mean);
    % standard error across participants (rows), filtered in the same way
    condition1_se = std(condition1_erps)/sqrt(size(condition1_erps,1));
    condition2_se = std(condition2_erps)/sqrt(size(condition2_erps,1));
    condition1_se_filtered = filter(b,a,condition1_se);
    condition2_se_filtered = filter(b,a,condition2_se);
    figure
    plot(time,condition1_grand_mean_filtered,'b',time,condition2_grand_mean_filtered,'r')
    hold on
    jbfill(time,condition1_grand_mean_filtered+condition1_se_filtered,condition1_grand_mean_filtered-condition1_se_filtered,'b','b',0.5);
    jbfill(time,condition2_grand_mean_filtered+condition2_se_filtered,condition2_grand_mean_filtered-condition2_se_filtered,'r','r',0.5);

Task 1.2.1:

    % P1 window: 80 to 120 ms after onset; the 0.2*srate term skips the 200 ms baseline
    p1_window = round(0.08*srate+0.2*srate):round(0.12*srate+0.2*srate);
    condition1_p1_peak = max(condition1_erps_blcorrected(:,p1_window),[],2);
    mean(condition1_p1_peak)

    ans = 2.0910

    condition1_p1_se = std(condition1_p1_peak)/sqrt(length(condition1_p1_peak))

    condition1_p1_se = 0.4698

The other solutions work analogously to the code above.

Solutions:

                             Condition 1              Condition 2
                             Mean        SE           Mean        SE
    P1 - peak                2.091       0.4698       1.6305      0.6047
    N170 - peak              -3.3948     0.5879       -3.8664     0.5992
    P1 - latency             114.8571    2.8571       116.1905    2.2505
    N170 - latency           147.7143    1.9629       152.0952    5.5592
    P1 - mean amplitude      -0.0581     0.2531       0.6585      0.2598
    N170 - mean amplitude    -1.0196     0.4826       -1.3959     0.543
    P1 - area [µV²]          249.71      50.72        322.05      62.99
    N170 - area [µV²]        657.52      138.16       831.99      146.44
    % two-sample t-test on the P1 peak amplitudes
    ttest2(condition1_p1_peak,condition2_p1_peak)

    pval: 0.550927
    ans = 0

    % the second output of max is the position of the peak within the window
    [condition1_p1_peak,condition1_p1_latency] = max(condition1_erps_blcorrected(:,round(0.08*srate+0.2*srate):round(0.12*srate+0.2*srate)),[],2);
    condition1_p1_latency = 1000*(condition1_p1_latency/srate + 0.08); % peak latencies from stimulus onset in ms

Task 1.2.2:

    % mean amplitude in the P1 window for each participant
    condition1_p1_average = mean(condition1_erps_blcorrected(:,round(0.08*srate+0.2*srate):round(0.12*srate+0.2*srate)),2);
    % area approximation following the hint in the task
    condition1_p1_area = sum(condition1_erps_blcorrected(:,round(0.08*srate+0.2*srate):round(0.12*srate+0.2*srate)).*length(condition1_erps_blcorrected(:,round(0.08*srate+0.2*srate):round(0.12*srate+0.2*srate))),2);

The other calculations work similarly to this one. You can find the solutions in the table above.

Task 2.1:

    % power spectra for each row (participant) with EEGLAB's spectopo
    [condition1_spectra,condition1_freqs] = spectopo(condition1_erps,0,srate);
    [condition2_spectra,condition2_freqs] = spectopo(condition2_erps,0,srate);

    % mean spectrum across participants
    figure
    condition1_mean_spectrum = mean(condition1_spectra);
    condition2_mean_spectrum = mean(condition2_spectra);
    plot(condition1_freqs,condition1_mean_spectrum,'b',condition2_freqs,condition2_mean_spectrum,'r')

    % restrict the plot to 0-30 Hz: find the frequency bin closest to 30 Hz
    [stopx,stopy] = find(abs(condition1_freqs - 30) == min(abs(condition1_freqs - 30)));
    plot(condition1_freqs(1:stopx),condition1_mean_spectrum(1:stopx),'b',condition2_freqs(1:stopx),condition2_mean_spectrum(1:stopx),'r')

Task 2.2:

    % frequency bins closest to 8 and 12 Hz
    [startx,starty] = find(abs(condition1_freqs - 8) == min(abs(condition1_freqs - 8)));
    [stopx,stopy] = find(abs(condition1_freqs - 12) == min(abs(condition1_freqs - 12)));
    condition1_alpha_power = sum(condition1_spectra(startx:stopx));
    condition2_alpha_power = sum(condition2_spectra(startx:stopx));

    condition1_alpha_power = -117.2340
    condition2_alpha_power = -135.2624
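As a supplement to Task 2.2, the sketch below shows one possible way to obtain a band power value per participant using logical indexing on the frequency axis. It is an illustration under stated assumptions, not the method used to produce the values reported above, and it will not reproduce them. The matrix spectra_db and the vector freqs are hypothetical stand-ins for the spectopo outputs, filled with synthetic values, and the sketch assumes that the spectra are expressed in dB with one row per participant and one column per frequency.

```matlab
% Sketch: alpha band power per participant from a participants x frequencies matrix.
freqs = (0:0.5:125)';                            % hypothetical frequency axis in Hz
spectra_db = repmat(-10*log10(1+freqs'),3,1);    % synthetic dB spectra, 3 participants

alpha_band = freqs >= 8 & freqs <= 12;           % logical index of the 8-12 Hz bins
power_linear = 10.^(spectra_db/10);              % convert dB back to linear power
alpha_power = mean(power_linear(:,alpha_band),2);% mean alpha power per participant
alpha_power_db = 10*log10(alpha_power);          % convert the band mean back to dB
```

Indexing the columns explicitly with (:,alpha_band) keeps one value per row, so the result can go straight into a group comparison. Averaging in linear units before converting back to dB is a design choice here; averaging the dB values directly gives a different number.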