Diff of /datasets/README.md [000000] .. [d6904d]

# Dataset

## Supported datasets

- [x] Tongji Hospital (a.k.a. TJH dataset)
- [x] HM Hospital (a.k.a. CDSL dataset)

## Folder structure

```shell
datasets/
    tongji/
        preprocess.py
        raw_data/
            ...
        processed_data/
            fold_{i=0,1,2,...}/
                train/
                    x.pkl
                    y.pkl
                    statistics.csv
                val/
                    x.pkl
                    y.pkl
                test/
                    x.pkl
                    y.pkl
    hm/
        preprocess.py
        ...
```
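
As a rough illustration of how the layout above can be consumed, here is a minimal sketch that loads `x.pkl` and `y.pkl` for one fold and split. The helper name `load_split` and its parameters are hypothetical and not part of this repository. It assumes the fold directories are named `fold_0`, `fold_1`, and so on, and that the files are standard Python pickles. The structure of the loaded objects is not specified here, so inspect them before use.

```python
import pickle
from pathlib import Path


def load_split(root, dataset="tongji", fold=0, split="train"):
    """Load (x, y) for one fold and split (hypothetical helper).

    Assumes the layout <root>/<dataset>/processed_data/fold_<fold>/<split>/.
    """
    split_dir = Path(root) / dataset / "processed_data" / f"fold_{fold}" / split
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with open(split_dir / "x.pkl", "rb") as f:
        x = pickle.load(f)
    with open(split_dir / "y.pkl", "rb") as f:
        y = pickle.load(f)
    return x, y
```

For example, `x_train, y_train = load_split("datasets", fold=0, split="train")` would read the training split of the first TJH fold, provided the processed data has already been generated or downloaded.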

We also provide the preprocessing script as a Jupyter notebook (`preprocess.ipynb`).

## TJH Dataset

> Reference: [An interpretable mortality prediction model for COVID-19 patients](https://www.nature.com/articles/s42256-020-0180-7)

### How to Access Data

Download link: [Supplementary Data 1 Training and external test datasets](https://static-content.springer.com/esm/art%3A10.1038%2Fs42256-020-0180-7/MediaObjects/42256_2020_180_MOESM3_ESM.zip) (see the supplementary information of the paper above).

Alternatively, you can download the raw data (Excel format) and the processed data (Python pickle format, `.pkl`) from the GitHub Releases assets.
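
If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The function names, the local archive name `tjh_supplementary.zip`, and the default destination `datasets/tongji/raw_data` are assumptions made for illustration. Check `preprocess.py` for the file names and location it actually expects.

```python
import urllib.request
import zipfile
from pathlib import Path

# Supplementary Data 1 link from the paper (same URL as above).
TJH_URL = (
    "https://static-content.springer.com/esm/"
    "art%3A10.1038%2Fs42256-020-0180-7/MediaObjects/"
    "42256_2020_180_MOESM3_ESM.zip"
)


def extract_archive(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return zf.namelist()


def fetch_tjh_raw(dest_dir="datasets/tongji/raw_data", url=TJH_URL):
    """Download the supplementary archive and unpack it (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / "tjh_supplementary.zip"  # assumed local file name
    urllib.request.urlretrieve(url, str(zip_path))
    return extract_archive(zip_path, dest)
```

Calling `fetch_tjh_raw()` from the repository root would place the extracted files under `datasets/tongji/raw_data/` and return their names.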

## CDSL Dataset

To use the CDSL dataset, you need to apply for access. Link: [Covid Data Save Lives Dataset](https://www.hmhospitales.com/coronavirus/covid-data-save-lives/english-version)