--- a
+++ b/datasets/README.md
@@ -0,0 +1,47 @@
+# Dataset
+
+## Supported datasets
+
+- [x] Tongji Hospital (a.k.a. the TJH dataset)
+- [x] HM Hospital (a.k.a. the CDSL dataset)
+
+## Folder structure
+
+```text
+datasets/
+    tongji/
+        preprocess.py
+        raw_data/
+            ...
+        processed_data/
+            fold_{i=0,1,2,...}/
+                train/
+                    x.pkl
+                    y.pkl
+                    statistics.csv
+                val/
+                    x.pkl
+                    y.pkl
+                test/
+                    x.pkl
+                    y.pkl
+    hm/
+        preprocess.py
+        ...
+```
+
+A Jupyter notebook version of the preprocessing script (`preprocess.ipynb`) is also provided.
+
+## TJH Dataset
+
+> Reference: [An interpretable mortality prediction model for COVID-19 patients](https://www.nature.com/articles/s42256-020-0180-7)
+
+### How to Access Data
+
+Download link: [Supplementary Data 1 Training and external test datasets](https://static-content.springer.com/esm/art%3A10.1038%2Fs42256-020-0180-7/MediaObjects/42256_2020_180_MOESM3_ESM.zip) (see the supplementary information of the paper above).
+
+Alternatively, download the raw data (Excel format) and the processed data (Python pickle format, `.pkl`) from the GitHub Releases assets.
+
+## CDSL Dataset
+
+Access to the CDSL dataset requires an application: [Covid Data Save Lives Dataset](https://www.hmhospitales.com/coronavirus/covid-data-save-lives/english-version).