--- a
+++ b/README.md
@@ -0,0 +1,25 @@
+# MedHMP
+Source code for the paper "Hierarchical Pretraining on Multimodal Electronic Health Records" (https://aclanthology.org/2023.emnlp-main.171.pdf).
+
+## Datasets
+
+The original MIMIC-III and MIMIC-IV datasets are available at https://physionet.org/content/mimiciii/1.4/ and https://physionet.org/content/mimiciv/2.2/, respectively. The MIMIC-IV clinical notes are available at https://physionet.org/content/mimic-iv-note/2.2/. Permission from the data owner is required.
+
+## Data Preprocessing
+
+Code for data preprocessing is available at https://github.com/MLD3/FIDDLE-experiments/tree/master/mimic3_experiments.
+
+
+## Acknowledgement
+
+If you find the source code in this repo or our paper useful, please cite our paper using this BibTeX entry:
+
+```bibtex
+@inproceedings{wang2023hierarchical,
+  title={Hierarchical Pretraining on Multimodal Electronic Health Records},
+  author={Wang, Xiaochen and Luo, Junyu and Wang, Jiaqi and Yin, Ziyi and Cui, Suhan and Zhong, Yuan and Wang, Yaqing and Ma, Fenglong},
+  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
+  pages={2839--2852},
+  year={2023}
+}
+```