b/2-Generating Synthetic ECG Data/README.md

# Generating Synthetic ECG Data

Each ECG recording consists of 12 leads, with a duration of 10 seconds and a sampling frequency of 250 Hz. Each sample is therefore a 12 by 2500 data matrix. We worked with two datasets: one synthetic and one of real clinical data. Although we cannot share the real clinical data, the simulation code is available.
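
The sample dimensions above follow directly from the number of leads, the duration, and the sampling frequency. A minimal sketch of that arithmetic is shown here; the helper names `sample_shape` and `dataset_shape` are hypothetical and are not part of the simulation code.

```python
def sample_shape(n_leads=12, duration_s=10, sampling_rate_hz=250):
    """Shape of one ECG sample as (leads, time points)."""
    return (n_leads, duration_s * sampling_rate_hz)


def dataset_shape(n_samples, **kwargs):
    """Shape of n_samples stacked samples as (samples, leads, time points)."""
    return (n_samples, *sample_shape(**kwargs))


print(sample_shape())      # (12, 2500)
print(dataset_shape(100))  # (100, 12, 2500)
```

Stacking samples this way matches the `(normal_N + abnormal_N, 12, sampling_rate*duration)` layout of the saved data file described in the instructions in this README.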

***Simulation Code***

The simulation part includes two files: **'ecg_simulation_multichannel.py'** and **'parameters.py'**.

**Instruction**:
- Install the **neurokit2** package.
- Open **sample_code_for_simulation.ipynb**, a short sample notebook that shows how to output synthetic data.
- The **simulation** function is all you need.
  1. Input:\
     **normal_N**: the number of normal ECG samples;\
     **abnormal_N**: the number of abnormal ECG samples;\
     **save_params**: whether to save the parameters for each ECG sample; the default value is *False*.
  2. Output:\
     **sim_ecg_data.npy**: output file, `shape = (normal_N + abnormal_N, 12, sampling_rate*duration)`;\
     **sim_ecg_labels.npy**: output file that stores the labels, `shape = (normal_N + abnormal_N, )`;\
     **sim_ecg_params.npy**: written only when **save_params** is *True*; stores the parameters for each ECG sample, `shape = (normal_N + abnormal_N, )`.\
     The saved data is already **shuffled**.
- For more customized ECG data, see the **parameters.py** file. The parameters are defined in the following table (the suffix `_1` stands for normal, `_2` for abnormal):

| |Parameter |Meaning |
|----------------|-------------------------------|-----------------------------|
|1 |sampling_rate |sampling rate, default 250 Hz |
|2 |duration |duration of each sample, default 10 s |
|3 |gamma |a (12, 5) matrix that modifies the amplitudes of the five spikes in each lead |
|4 |mu_hr_1 |the mean of the heart rate |
|5 |sigma_hr_1 |the variance of the heart rate |
|6 |min_noise_1, max_noise_1 |the minimum and maximum values of the noise |
|7 |t |the starting position of each interval along the circle, in radians |
|8 |a |the amplitude of each spike |
|9 |b |the width of each spike |
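
To make the heart-rate and noise parameters in the table more concrete, here is a rough sketch of how per-sample settings could be drawn from them. It assumes that the heart rate follows a normal distribution with mean `mu_hr_1` and variance `sigma_hr_1`, and that the noise level is drawn uniformly between `min_noise_1` and `max_noise_1`; the simulation code may do this differently. The numeric values and the helper `draw_sample_settings` are hypothetical, so check **parameters.py** for the real defaults.

```python
import math
import random

# Hypothetical values for illustration only; the real defaults live in parameters.py.
mu_hr_1 = 60.0      # mean of the heart rate
sigma_hr_1 = 25.0   # variance of the heart rate
min_noise_1 = 0.01  # lower bound of the noise level
max_noise_1 = 0.10  # upper bound of the noise level


def draw_sample_settings(mu_hr, sigma_hr, min_noise, max_noise, rng):
    """Draw one heart rate and one noise level (assumed distributions)."""
    # The table describes sigma_hr as a variance, so take the square root
    # to get the standard deviation expected by gauss().
    heart_rate = rng.gauss(mu_hr, math.sqrt(sigma_hr))
    noise = rng.uniform(min_noise, max_noise)
    return heart_rate, noise


rng = random.Random(0)  # fixed seed so the draw is reproducible
heart_rate, noise = draw_sample_settings(
    mu_hr_1, sigma_hr_1, min_noise_1, max_noise_1, rng
)
print(f"heart rate: {heart_rate:.1f}, noise level: {noise:.3f}")
```

The `_2` parameters would be used the same way for abnormal samples, under the same assumptions.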

For a better understanding of the above parameters, please read the following.

[<img src="./3D.png" width="500"/>]
[<img src="./table.png" width="500"/>](table)

We can see from the above table that each interval in the 3D trajectory is determined by 3 parameters: the starting position
<img src="https://render.githubusercontent.com/render/math?math=\theta_i/t_i "> of the interval along the circle in radians, the amplitude of each spike a, and the width of each wave b. By altering these 3 parameters, we can change the shape of the 3D trajectory and thus the waveform of the resulting ECG.
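
The effect of `t`, `a`, and `b` can be illustrated with a simplified stand-in for the simulation: instead of integrating the 3D trajectory, the sketch below directly sums one Gaussian bump per spike around one turn of the circle. The numeric values are illustrative and are not taken from **parameters.py**. The `gamma_row` scaling assumes that one row of `gamma` multiplies the five amplitudes of a lead, which is an assumption about how `gamma` is applied.

```python
import math

# Illustrative values for the five spikes (conventionally P, Q, R, S, T);
# not taken from parameters.py.
t = [-math.pi / 3, -math.pi / 12, 0.0, math.pi / 12, math.pi / 2]  # positions (radians)
a = [1.2, -5.0, 30.0, -7.5, 0.75]                                  # amplitudes
b = [0.25, 0.1, 0.1, 0.1, 0.4]                                     # widths


def wrap_angle(x):
    """Wrap an angle difference into [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def one_beat(t, a, b, n_points=250):
    """Sum of Gaussian bumps sampled at n_points angles around the circle."""
    beat = []
    for k in range(n_points):
        theta = -math.pi + 2 * math.pi * k / n_points
        z = sum(
            a_i * math.exp(-wrap_angle(theta - t_i) ** 2 / (2 * b_i ** 2))
            for t_i, a_i, b_i in zip(t, a, b)
        )
        beat.append(z)
    return beat


beat = one_beat(t, a, b)

# Hypothetical gamma row for one lead: halve the third spike, keep the others.
gamma_row = [1.0, 1.0, 0.5, 1.0, 1.0]
scaled_a = [g * a_i for g, a_i in zip(gamma_row, a)]
scaled_beat = one_beat(t, scaled_a, b)

print(max(beat) > max(scaled_beat))  # True: the scaled lead has a lower main peak
```

Shifting an entry of `t` moves the corresponding spike along the beat, changing `a` raises, lowers, or inverts it, and changing `b` makes it narrower or wider.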


**Prebuilt Synthetic Data**

Prebuilt synthetic data can be found at the following link:
https://drive.google.com/drive/folders/1iqyAlyHAvNWdOvjEGn8Y6C8kItsdZ_GC?usp=sharing