## Before cloning this repo

Make sure you have git-lfs installed:

```
sudo apt install git-lfs
git lfs install
```

## Start here

Directory tree:

```
.
├── data
│   ├── unzip
│   │   ├── stage_2_test_images
│   │   └── stage_2_train_images
│   └── predictions
├── env
└── models
```

Set up the conda environment with:

```
conda env create -n ihd -f env/tfgpu.yml
conda activate ihd
```

Then start Jupyter Notebook from the repo's root dir (the raised `iopub_msg_rate_limit` avoids "IOPub data rate exceeded" errors from cells that print a lot of output):

```
jupyter notebook --no-browser --NotebookApp.iopub_msg_rate_limit=10000000000
```

## Steps to reproduce the submission

1. Start with the notebooks:

  * `0-preprocess-generate_csvs.ipynb`
  * `1-preprocess-brain_norm.ipynb`
  * `2-preprocess-pickle.ipynb`

... to pregenerate the DICOM metadata, diagnosis pivot tables, and various pickles.

For convenience, these outputs are already included in the git repository, so you can alternatively skip to step 2.

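Conceptually, step 1 extracts per-slice metadata, pivots the long-format diagnosis labels into one row per image, and pickles the results for the later notebooks. A minimal stdlib-only sketch of that pivot-and-cache pattern (the real notebooks work from the competition CSVs and DICOM headers; the rows below are made up for illustration):

```python
import pickle

# Hypothetical long-format labels, one row per (image, hemorrhage subtype):
labels = [
    ('ID_0001', 'any', 1), ('ID_0001', 'subdural', 1),
    ('ID_0002', 'any', 0), ('ID_0002', 'subdural', 0),
]

# Pivot to one dict per image with a column per subtype.
pivot = {}
for image_id, subtype, value in labels:
    pivot.setdefault(image_id, {})[subtype] = value

# Cache the pivot table so later notebooks can reload it quickly.
blob = pickle.dumps(pivot)
assert pickle.loads(blob)['ID_0001'] == {'any': 1, 'subdural': 1}
```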
2. Train the level-1 (L1) models:

a. fastai v1 library: run `3a-L1-train-and-generate-predictions-fastai_v1.ipynb` to train the following architectures:

  * `resnet18`
  * `resnet50`
  * `resnet34`
  * `resnet101`
  * `densenet121`

For each architecture we need to train 5 models, one per each of the 5 folds.

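The v1 sweep above can be enumerated as a quick sketch (architecture names from the list above; the actual training happens by re-running the notebook with cell #4 edited for each run):

```python
# Sketch only: each (arch, fold) pair corresponds to one notebook run.
archs = ['resnet18', 'resnet50', 'resnet34', 'resnet101', 'densenet121']
n_folds = 5
runs = [(arch, fold) for arch in archs for fold in range(n_folds)]
print(len(runs))  # 25 runs: 5 architectures x 5 folds
```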
All the variables must be set in cell #4, e.g.:

```
model_fn = None   # set to a checkpoint name to fine-tune an existing model
SZ = 512          # image size
arch = 'resnet34'
fold = 0          # which of the n_folds folds to train (0-based)
n_folds = 5
n_epochs = 4
lr = 1e-3
n_tta = 10        # number of test-time augmentations

# model_fn = 'resnet34_sz512_cv0.0821_weighted_loss_fold1_of_5'

if model_fn is not None:
    # the checkpoint name ends in 'fold<N>_of_5'; check it matches `fold`
    model_fn_fold = int(model_fn[-6]) - 1
    assert model_fn_fold == fold
```

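The fold check in the cell above leans on the checkpoint naming convention: the character six positions from the end of the filename is the 1-based fold number. A quick illustration using the example filename from the cell:

```python
model_fn = 'resnet34_sz512_cv0.0821_weighted_loss_fold1_of_5'

# '...fold1_of_5' -> the char 6 positions from the end is the 1-based fold
assert model_fn[-6] == '1'

model_fn_fold = int(model_fn[-6]) - 1  # convert to the 0-based `fold` variable
assert model_fn_fold == 0
```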
b. fastai v2 library, to train the subdural-focused models: same instructions as in a., but use `3b-L1-train-and-generate-predictions-fastai_v2.ipynb` to train:

  * `resnet18`
  * `resnet34`
  * `resnet101`
  * `resnext50_32x4d`
  * `densenet121`

To train the models from scratch and generate the test and OOF (out-of-fold) predictions, you need to:

- Set `arch` to each of the architectures above and train a model for each fold (set the `FOLD` variable to each value from 0 to 4; all 5 folds must be trained).

- Comment out the second `model_fn` assignment (it is only used to fine-tune an existing model).

- Execute all the code except for the final section, which builds a CSV for submitting single-model predictions to Kaggle (which we do NOT want to do at this stage).

The fastai v1 code allocates the batch size and finds the learning rate dynamically, but in the fastai v2 version you also need to specify your GPU memory in cell #4.

For convenience, since training the models takes a long time, we provide the trained models along with the test and OOF predictions, so you can alternatively skip to step 3.

3. Train the level-2 (L2) models and generate the submission: with all the models trained (5 models per architecture) and their predictions in `./data/predictions`, run `4-L2-train-and-submit.ipynb` to generate the final predictions/submission.

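The L2 stage is a standard stacking setup: the OOF predictions from each L1 model become input features for a second-level model fitted against the true labels. A toy sketch of the idea, reduced to a grid-searched blend of two models' OOF probabilities (this illustrates stacking, not the notebook's actual code; all numbers are made up):

```python
import math

def log_loss(y_true, y_pred, eps=1e-7):
    """Binary log loss, clipped away from 0 and 1 for numerical safety."""
    return -sum(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
                for y, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0, 1, 1, 0, 1]
oof_a  = [0.2, 0.7, 0.6, 0.3, 0.8]   # hypothetical OOF preds from L1 model A
oof_b  = [0.1, 0.9, 0.8, 0.4, 0.6]   # hypothetical OOF preds from L1 model B

# Grid-search the blend weight that minimizes OOF log loss.
best_w, best_loss = None, float('inf')
for w in [i / 10 for i in range(11)]:
    blend = [w * a + (1 - w) * b for a, b in zip(oof_a, oof_b)]
    loss = log_loss(y_true, blend)
    if loss < best_loss:
        best_w, best_loss = w, loss
```

The chosen `best_w` would then be applied to the corresponding test-set predictions to build the submission.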
## Resources

* Dataset visualizer: https://rsna.md.ai/annotator/project/G9qOnN0m/workspace