# HINT: Learning and Inference

After processing the data, we train the Hierarchical Interaction Network (HINT) on four tasks: phase I, phase II, phase III, and indication-level (phase I-III) prediction. The figure below illustrates the HINT pipeline. All scripts are in the `HINT` folder.

<p align="center"><img src="./hint.png" alt="logo" width="810px" /></p>

### Tutorial (Jupyter Notebook)

[`tutorial_HINT.ipynb`](https://github.com/futianfan/clinical-trial-outcome-prediction/blob/main/tutorial_HINT.ipynb) walks through training and evaluating HINT step by step.

### Phase I/II/III Prediction (Bash command line)

Phase-level prediction estimates the probability that a single-phase study succeeds. Run one script per phase:

```bash
python HINT/learn_phaseI.py
```

```bash
python HINT/learn_phaseII.py
```

```bash
python HINT/learn_phaseIII.py
```
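
To run the three phases back to back, the commands above can be wrapped in a short Python driver. This is a minimal sketch and not part of the repository: `phase_commands` and `run_phases` are hypothetical helper names, and it assumes the scripts take no required arguments and are launched from the repository root, as in the commands above.

```python
import subprocess
import sys

# Script paths copied from the commands above.
PHASE_SCRIPTS = [
    "HINT/learn_phaseI.py",
    "HINT/learn_phaseII.py",
    "HINT/learn_phaseIII.py",
]


def phase_commands(python=sys.executable):
    """Return one argv list per phase script (hypothetical helper)."""
    return [[python, script] for script in PHASE_SCRIPTS]


def run_phases(dry_run=True):
    """Run the phase scripts in order; with dry_run=True only print the commands."""
    for argv in phase_commands():
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is trained unless dry_run=False is passed.
    run_phases(dry_run=True)
```

Calling `run_phases(dry_run=False)` launches the scripts one after another and stops at the first failure.
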

### Metrics

- **PR-AUC** (Area Under the Precision-Recall Curve). The precision-recall curve summarizes the trade-off between recall (the true positive rate) and precision (the positive predictive value) across probability thresholds.
- **F1**. The F1 score is the harmonic mean of precision and recall.
- **ROC-AUC** (Area Under the Receiver Operating Characteristic Curve). The ROC curve summarizes the trade-off between the true positive rate and the false positive rate across probability thresholds.
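
The three metrics can be illustrated with a dependency-free sketch. It is not taken from the HINT scripts: the function names, the toy labels and scores, and the 0.5 threshold for F1 are assumptions for illustration, and PR-AUC is estimated here as average precision, which is one common estimator among several. In practice, scikit-learn's `average_precision_score`, `f1_score`, and `roc_auc_score` cover the same ground.

```python
def f1_at_threshold(y_true, y_prob, threshold=0.5):
    """F1: harmonic mean of precision and recall at a fixed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, y_prob):
    """ROC-AUC as the chance that a random positive outscores a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_auc(y_true, y_prob):
    """PR-AUC estimated as average precision over distinct score thresholds."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("PR-AUC needs at least one positive")
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_recall = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Consume every example tied at this score before measuring.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    # Toy data: 1 = trial succeeded, 0 = trial failed (made up for illustration).
    y_true = [1, 0, 1, 1, 0]
    y_prob = [0.9, 0.8, 0.7, 0.4, 0.2]
    print("PR-AUC :", round(pr_auc(y_true, y_prob), 3))
    print("F1     :", round(f1_at_threshold(y_true, y_prob), 3))
    print("ROC-AUC:", round(roc_auc(y_true, y_prob), 3))
```

Unlike PR-AUC and ROC-AUC, F1 depends on the chosen threshold, so the threshold should be reported alongside the score.
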

<!-- ### Result

The empirical results are given for reference. The mean and standard deviation of 5 independent runs are reported.

| Dataset | PR-AUC | F1 | ROC-AUC |
|-----------|---------------|---------------|---------------|
| Phase I | 0.745 (0.009) | 0.820 (0.007) | 0.726 (0.009) |
| Phase II | 0.685 (0.011) | 0.754 (0.010) | 0.698 (0.008) |
| Phase III | 0.709 (0.009) | 0.757 (0.008) | 0.784 (0.009) | -->

## Contact

Please contact futianfan@gmail.com for help or submit an issue. This is joint work with [Kexin Huang](https://www.kexinhuang.com/), [Cao (Danica) Xiao](https://sites.google.com/view/danicaxiao/), Lucas M. Glass, and [Jimeng Sun](http://sunlab.org/).

## Code Architecture

- Learning and inference on various tasks
  - `learn_phaseI.py`: predict whether the trial can pass phase I.
  - `learn_phaseII.py`: predict whether the trial can pass phase II.
  - `learn_phaseIII.py`: predict whether the trial can pass phase III.
  - `learn_indication.py`: predict whether the trial can pass the indication (phase I-III).
- Model architecture
  - `model.py`
    - Three model classes (`Interaction`, `HINT_nograph`, `HINTModel`) that build the model from simple to complex.
  - `icdcode_encode.py`
    - Preprocesses ICD-10 codes and builds the ICD-10 code ontology.
    - Uses GRAM to model the hierarchy of ICD-10 codes.
  - `molecule_encode.py`
    - Message passing network (MPN)
  - `protocol_encode.py`
    - Protocol embedder
  - `module.py`: standard implementations of existing neural modules
    - Highway Network
    - Graph Convolutional Network (GCN)
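
For orientation, a highway layer mixes a transformed input with the untouched input through a learned gate: `y = T(x) * H(x) + (1 - T(x)) * x`. The dependency-free sketch below illustrates that formula only. It is not the code in `module.py`; the function names, the ReLU nonlinearity, and the list-based weights are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(x, weight, bias):
    """Dense layer on plain lists: weight has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    Input and output sizes must match so the carry path can pass x through.
    """
    h = [max(0.0, v) for v in linear(x, w_h, b_h)]  # candidate transform H(x), ReLU assumed
    t = [sigmoid(v) for v in linear(x, w_t, b_t)]   # transform gate T(x) in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


if __name__ == "__main__":
    x = [1.0, -2.0]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # A strongly negative gate bias closes the gate, so the layer copies its input.
    print(highway_layer(x, zeros, [3.0, 4.0], zeros, [-50.0, -50.0]))
    # A strongly positive gate bias opens the gate, so the layer outputs H(x).
    print(highway_layer(x, zeros, [3.0, 4.0], zeros, [50.0, 50.0]))
```

The carry path `(1 - T(x)) * x` is what lets stacked layers fall back to the identity, which tends to ease optimization in deeper networks.
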