Results using the `archived` branch (as reported in the paper):
DDI: 0.0589 (0.0005) Ja: 0.5213 (0.0030) F1: 0.6768 (0.0027) PRAUC: 0.7647 (0.0025)
Results using this `master` branch (obtained by @J-Zhangg in issue #23, thanks):
```
# When the learning rate is set to 5e-4:
DDI: 0.0632 (0.0003) Ja: 0.5114 (0.0026) F1: 0.6676 (0.0023) PRAUC: 0.7649 (0.0028)
DDI: 0.0607 (0.0005) Ja: 0.5089 (0.0022) F1: 0.6659 (0.0019) PRAUC: 0.7632 (0.0022)
```
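For readers unfamiliar with the reported metrics, here is a minimal sketch of how per-visit scores of this kind can be computed. It assumes that `Ja` denotes Jaccard similarity between the true and predicted medication sets, that `F1` is the usual harmonic mean of precision and recall, and that `DDI` is the fraction of predicted drug pairs that are known interactions. The function names, drug codes, and DDI pair below are illustrative only and are not taken from the repository.

```python
from itertools import combinations


def jaccard(true_set, pred_set):
    """Jaccard similarity between the true and predicted medication sets."""
    union = true_set | pred_set
    return len(true_set & pred_set) / len(union) if union else 0.0


def f1(true_set, pred_set):
    """Harmonic mean of precision and recall over the two sets."""
    hit = len(true_set & pred_set)
    if hit == 0:
        return 0.0
    precision = hit / len(pred_set)
    recall = hit / len(true_set)
    return 2 * precision * recall / (precision + recall)


def ddi_rate(pred_set, ddi_pairs):
    """Fraction of predicted drug pairs found in a set of known DDI pairs."""
    pairs = list(combinations(sorted(pred_set), 2))
    if not pairs:
        return 0.0
    bad = sum(1 for pair in pairs if frozenset(pair) in ddi_pairs)
    return bad / len(pairs)


# Hypothetical example: codes and the interacting pair are made up.
truth = {"A01A", "B01A", "C07A"}
pred = {"A01A", "B01A", "N02B"}
known_ddi = {frozenset({"A01A", "N02B"})}

print(jaccard(truth, pred), f1(truth, pred), ddi_rate(pred, known_ddi))
```

Averaging these per-visit values over the test set would give numbers comparable in kind to the ones listed above; the exact aggregation used by the repository may differ.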
[Implementation difference] Here are two main differences:
[Which branch to use?] General guidance:
1. This `master` branch contains more documentation (to help you learn how to use our code), and its folder structure is very similar to that of the `archived` branch.
2. Use the `archived` branch to reproduce the results in the paper.
- `data/`
  - `Input/` (extracted from external resources)
  - `Output/`
- `src/`
Note that we previously used `./data/get_SMILES.py` to get SMILES strings from DrugBank. However, because DrugBank's web structure has changed, this crawler is no longer used in the current pipeline. We now use `drugbank_drugs_info.csv` to obtain the SMILES strings for each ATC3 code, so the data statistics differ slightly from those in the paper. The current statistics are shown below:
#patients 6350
#clinical events 15032
#diagnosis 1958
#med 112
#procedure 1430
#avg of diagnoses 10.5089143161256
#avg of medicines 11.647751463544438
#avg of procedures 3.8436668440659925
#avg of vists 2.367244094488189
#max of diagnoses 128
#max of medicines 64
#max of procedures 50
#max of visit 29
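As a rough illustration of where such statistics come from, the sketch below computes the same kind of counts from a nested record list. It assumes each patient is a list of visits and each visit is a `[diagnoses, procedures, medicines]` triple of code lists; this layout, the function name `summarize`, and the toy codes are assumptions made for the example. The reading is at least consistent with the numbers above, since 15032 clinical events / 6350 patients ≈ 2.3672 visits per patient.

```python
def summarize(records):
    """Compute cohort statistics from patient -> visit -> [diag, proc, med] lists."""
    visits = [visit for patient in records for visit in patient]
    n_visits = len(visits)

    def unique(idx):
        # Number of distinct codes of one type across all visits
        return len({code for visit in visits for code in visit[idx]})

    def average(idx):
        # Mean number of codes of one type per visit
        return sum(len(visit[idx]) for visit in visits) / n_visits

    def maximum(idx):
        # Largest number of codes of one type in a single visit
        return max(len(visit[idx]) for visit in visits)

    return {
        "patients": len(records),
        "clinical events": n_visits,
        "diagnosis": unique(0),
        "procedure": unique(1),
        "med": unique(2),
        "avg of diagnoses": average(0),
        "avg of procedures": average(1),
        "avg of medicines": average(2),
        "avg of visits": n_visits / len(records),
        "max of diagnoses": maximum(0),
        "max of procedures": maximum(1),
        "max of medicines": maximum(2),
        "max of visit": max(len(patient) for patient in records),
    }


# Toy data: two patients, three visits in total (codes are made up).
records = [
    [[["d1", "d2"], ["p1"], ["m1", "m2"]], [["d2"], ["p2"], ["m1"]]],
    [[["d3"], ["p1"], ["m3"]]],
]

for key, value in summarize(records).items():
    print(f"#{key} {value}")
```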
- `ndc->drugname` mapping
- `ndc->RXCUI` mapping (now we have `RXCUI->drugname`)
- `RXCUI->atc4` mapping, then change `atc4` to `atc3` (now we have `atc3->drugname`)
- `drug->SMILES` mapping (now we have `atc3->SMILES`)

`atc3` is a coarse-grained drug classification; one `atc3` code contains multiple SMILES strings.

conda create -c conda-forge -n SafeDrug rdkit
conda activate SafeDrug
# can also use the following in your current env
pip install rdkit-pypi
pip install scikit-learn dill dnc
Note that the torch setup may vary depending on your GPU hardware. In general, run the following:
pip install torch
If you are using an RTX 3090, please use the following instead, which installs the CUDA 11.1 build of torch that works on this GPU:
python3 -m pip install --user torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install [xxx] # any other required package; preferably do not pin the version, but make sure the package is compatible with rdkit
Here is a list of reference versions for all packages:
pandas: 1.3.0
dill: 0.3.4
torch: 1.8.0+cu111
rdkit: 2021.03.4
scikit-learn: 0.24.2
numpy: 1.21.1
Let us know about any package dependency issues. Please pay special attention to pandas: some users report that a newer version of pandas raises an error when loading files with dill.
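To compare a local environment against the reference versions above, a small check like the following may help. It is only a sketch: it relies on the standard-library `importlib.metadata` (Python 3.8+), and the helper name `installed_versions` is made up for this example. A package may also be registered under a different distribution name (for instance `rdkit-pypi` when installed through pip), in which case it is reported as missing here.

```python
from importlib.metadata import PackageNotFoundError, version

# Reference versions copied from the list above
REFERENCE = {
    "pandas": "1.3.0",
    "dill": "0.3.4",
    "torch": "1.8.0+cu111",
    "rdkit": "2021.03.4",
    "scikit-learn": "0.24.2",
    "numpy": "1.21.1",
}


def installed_versions(names):
    """Return {distribution name: installed version, or None if not found}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, have in installed_versions(REFERENCE).items():
    want = REFERENCE[name]
    status = "ok" if have == want else "differs"
    print(f"{name}: installed={have} reference={want} ({status})")
```

A mismatch is not necessarily a problem; the list above gives reference versions, not strict requirements.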
cd ./data
wget -r -N -c -np --user [account] --ask-password https://physionet.org/files/mimiciii/1.4/
cd ./physionet.org/files/mimiciii/1.4
gzip -d PROCEDURES_ICD.csv.gz # procedure information
gzip -d PRESCRIPTIONS.csv.gz # prescription information
gzip -d DIAGNOSES_ICD.csv.gz # diagnosis information
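If the `gzip` command-line tool is not available (for example on Windows), the same decompression can be done from Python. The sketch below uses only the standard library; the helper name `gunzip` is made up for this example, and the loop assumes it is run from the `./data` folder, as in the commands above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, keep=False):
    """Decompress a .gz file next to itself and return the output path."""
    src = Path(src)
    dst = src.with_suffix("")  # "X.csv.gz" -> "X.csv"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if not keep:
        src.unlink()  # mirror `gzip -d`, which removes the archive
    return dst


base = Path("./physionet.org/files/mimiciii/1.4")
for name in ("PROCEDURES_ICD", "PRESCRIPTIONS", "DIAGNOSES_ICD"):
    archive = base / f"{name}.csv.gz"
    if archive.exists():  # skip silently if the file was already extracted
        print("decompressed", gunzip(archive))
```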
Download the DDI file from https://drive.google.com/file/d/1mnPc0O0ztz0fkv3HF-dpmBb8PLWsEoDz/view?usp=sharing and move it to the data folder:
mv drug-DDI.csv ./data
Process the data to get a complete `records_final.pkl`:
```python
cd ./data
vim processing.py
# line 323-325
# med_file = './physionet.org/files/mimiciii/1.4/PRESCRIPTIONS.csv'
# diag_file = './physionet.org/files/mimiciii/1.4/DIAGNOSES_ICD.csv'
# procedure_file = './physionet.org/files/mimiciii/1.4/PROCEDURES_ICD.csv'
python processing.py
```
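Before running `processing.py`, it can be useful to confirm that the three input files are where the script expects them. The following is a small pre-flight sketch, assuming it is run from the `./data` folder and that the paths match the comments shown above; the dictionary keys and the helper `missing_inputs` are hypothetical names chosen for the example.

```python
from pathlib import Path

# Paths as they appear in the processing.py comments above
INPUTS = {
    "prescriptions": "physionet.org/files/mimiciii/1.4/PRESCRIPTIONS.csv",
    "diagnoses": "physionet.org/files/mimiciii/1.4/DIAGNOSES_ICD.csv",
    "procedures": "physionet.org/files/mimiciii/1.4/PROCEDURES_ICD.csv",
}


def missing_inputs(paths, root="."):
    """Return the names of the inputs that do not exist under `root`."""
    return [name for name, rel in paths.items()
            if not (Path(root) / rel).is_file()]


missing = missing_inputs(INPUTS)
if missing:
    print("missing input files:", ", ".join(missing))
else:
    print("all input files found")
```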
python SafeDrug.py
Here are the available arguments:
usage: SafeDrug.py [-h] [--Test] [--model_name MODEL_NAME]
                   [--resume_path RESUME_PATH] [--lr LR]
                   [--target_ddi TARGET_DDI] [--kp KP] [--dim DIM]

optional arguments:
  -h, --help            show this help message and exit
  --Test                test mode
  --model_name MODEL_NAME
                        model name
  --resume_path RESUME_PATH
                        resume path
  --lr LR               learning rate
  --target_ddi TARGET_DDI
                        target ddi
  --kp KP               coefficient of P signal
  --dim DIM             dimension
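For reference, an `argparse` parser that would accept the same flags could look roughly like the sketch below. The flag names and help strings are taken from the usage text above; the argument types (`float` for `--lr`, `--target_ddi`, and `--kp`, `int` for `--dim`) are assumptions, and the defaults are deliberately left unset because they are not shown here. The actual definition in `SafeDrug.py` may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags listed in the usage text."""
    parser = argparse.ArgumentParser(prog="SafeDrug.py")
    parser.add_argument("--Test", action="store_true", help="test mode")
    parser.add_argument("--model_name", type=str, help="model name")
    parser.add_argument("--resume_path", type=str, help="resume path")
    # The numeric types below are assumptions, not taken from the source
    parser.add_argument("--lr", type=float, help="learning rate")
    parser.add_argument("--target_ddi", type=float, help="target ddi")
    parser.add_argument("--kp", type=float, help="coefficient of P signal")
    parser.add_argument("--dim", type=int, help="dimension")
    return parser


# Example: test mode with the 5e-4 learning rate and a made-up dimension
args = build_parser().parse_args(["--Test", "--lr", "5e-4", "--dim", "64"])
print(args.Test, args.lr, args.dim)
```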
@inproceedings{yang2021safedrug,
title = {SafeDrug: Dual Molecular Graph Encoders for Safe Drug Recommendations},
author = {Yang, Chaoqi and Xiao, Cao and Ma, Fenglong and Glass, Lucas and Sun, Jimeng},
booktitle = {Proceedings of the Thirtieth International Joint Conference on
Artificial Intelligence, {IJCAI} 2021},
year = {2021}
}
Feel free to contact me at chaoqiy2@illinois.edu with any questions. Partial credit goes to https://github.com/sjy1203/GAMENet.