https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection
A video of our solution can be found here: https://www.youtube.com/watch?v=1zLBxwTAcAs
Modify the data paths at the top of `data_prep.py`, `datasets.py` & `model.py`.
Run `data_prep.py`. This takes around 12-15 hours for each set of images and will create:
* `train_metadata.parquet.gzip`
* `stage_1_test_metadata.parquet.gzip`
* `stage_2_test_metadata.parquet.gzip`
* `train_triplets.csv`
* `stage_1_test_triplets.csv`
* `stage_2_test_triplets.csv`
* `png` folders for all 3 stages containing the preprocessed & cropped images

Run `batch_run.sh`. This will train (using 5-fold CV) and make submission files (as well as out-of-fold predictions).
This will create a timestamped folder in the `OUTPUT_DIR` containing:
* A submission file
* Out-of-fold (OOF) predictions (for later stacking models)
* Model checkpoints for each fold
* QC plots (e.g. ROC curves, train/validation loss curves)
To infer on different datasets, in `config.yml`:
* set `stage` to either `test1` or `test2`
* set `checkpoint` to the name of the timestamped folder containing the model checkpoints

Then run `batch_run.sh`. A new output directory will be created. For other datasets, the `ICHDataset` class found in `datasets.py` will need to be modified.

First the metadata is collected from the individual DICOM images. This allows the studies to be grouped by `PatientID`, which is important for a stable cross-validation due to overlapping patients.
Based on `StudyInstanceUID` and sorting on `ImagePositionPatient`, it is possible to reconstruct
it is possible to reconstruct
3D volumes for each study. However since each study contained a variable number of axial slices
(between 20-60), this makes it difficult to create an architecture that implements 3D convolutions.
Instead, triplets of images were created from the 3D volumes to represent the RGB channels of an
image, i.e. the green channel being the target image and the red & blue channels being the adjacent
images. If an image was at the edge of the volume, then the green channel was repeated. This is
essentially a 3D volume but only using 3 axial slices. At this stage no windowing was applied
and the image is retained in Hounsfield units.
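The triplet construction can be sketched as follows, given a volume already sorted on `ImagePositionPatient` (the function name is illustrative, not from the repo):

```python
import numpy as np

def make_triplets(volume):
    """Stack each axial slice with its neighbours into a 3-channel image.

    volume: (n_slices, H, W) array sorted by ImagePositionPatient.
    Returns (n_slices, H, W, 3) where channel 1 (green) is the target
    slice and channels 0/2 (red/blue) are the slices either side.
    At the edges of the volume the target slice is repeated, as in
    the text.
    """
    n = volume.shape[0]
    triplets = []
    for i in range(n):
        red = volume[i - 1] if i > 0 else volume[i]
        green = volume[i]
        blue = volume[i + 1] if i < n - 1 else volume[i]
        triplets.append(np.stack([red, green, blue], axis=-1))
    return np.asarray(triplets)
```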
The images then had their objects labeled using `scipy.ndimage.label`, which looks for groups of
connected pixels. The group with the second largest number of pixels was assumed to be the head
(the largest group is the background). This removes most of the dead space and the headrest
of the CT scanner. A 10 pixel border was retained to keep some space for rotation augmentations.
The images were clipped between 0-255 Hounsfield units and saved as 8-bit PNG files to pass to
the PyTorch dataset object. The reasons for this were a) most of the interesting features are
between 0-255 HU, so we shouldn't be losing too much detail, and b) this makes it easier to try
different windows without recreating the images. I also found that processing DICOM images on the
fly was too slow to keep 2 GPUs busy when small images/large batch sizes were used (224x224,
batch size 256), which is why I went down the PNG route.
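The cropping and 8-bit conversion described above might be sketched like this (the HU foreground threshold passed to `scipy.ndimage.label` is an assumption; the text only says connected pixel groups are found):

```python
import numpy as np
from scipy import ndimage

def crop_and_quantise(img_hu, hu_threshold=-100, border=10):
    """Crop a slice to the connected component assumed to be the head,
    then clip to 0-255 HU and convert to 8 bits for saving as PNG.

    The largest pixel group is taken as the background and the second
    largest as the head; a `border`-pixel margin is kept to leave room
    for rotation augmentations. hu_threshold is an assumed value.
    """
    labelled, _ = ndimage.label(img_hu > hu_threshold)
    counts = np.bincount(labelled.ravel())
    if len(counts) > 1:
        head = np.argsort(counts)[-2]  # 2nd largest group = head
        ys, xs = np.where(labelled == head)
        y0 = max(ys.min() - border, 0)
        y1 = min(ys.max() + border + 1, img_hu.shape[0])
        x0 = max(xs.min() - border, 0)
        x1 = min(xs.max() + border + 1, img_hu.shape[1])
        img_hu = img_hu[y0:y1, x0:x1]
    return np.clip(img_hu, 0, 255).astype(np.uint8)  # raw HU, 8-bit
```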
The images are then windowed once loaded into the dataset. A subdural window with
`window_width, window_length = 200, 80` was used on all 3 channels.
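At load time this windowing amounts to a clip-and-rescale; a sketch (the rescale to [0, 1] is an assumed convention, not stated in the text):

```python
import numpy as np

def window(img_hu, width=200, length=80):
    """Apply a CT window: clip Hounsfield units to
    [length - width/2, length + width/2] and rescale to [0, 1].
    Defaults are the subdural window from the text
    (window_width=200, window_length=80)."""
    low, high = length - width / 2, length + width / 2
    return (np.clip(img_hu, low, high) - low) / (high - low)
```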
An alternative model using different CT windowing for each channel is also used. Here the windows
are applied when the PNG images are made, according to the channels in `prepare_png` found in
`data_prep.py`. The images are cropped in the same way as above. The windows used were:
* Brain - `window_width, window_length = 80, 40`
* Subdural - `window_width, window_length = 200, 80`
* Bone - `window_width, window_length = 2000, 600`
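As a sketch, the three windows above could be applied channel-wise when the PNGs are made (the brain/subdural/bone channel order and the 8-bit scaling are assumptions):

```python
import numpy as np

# Window settings from the list above: (window_width, window_length)
WINDOWS = {"brain": (80, 40), "subdural": (200, 80), "bone": (2000, 600)}

def window_channels(img_hu):
    """Build a 3-channel image, one CT window per channel, scaled to
    0-255 for saving as an 8-bit PNG."""
    channels = []
    for width, length in WINDOWS.values():
        low, high = length - width / 2, length + width / 2
        ch = (np.clip(img_hu, low, high) - low) / (high - low)
        channels.append((ch * 255).astype(np.uint8))
    return np.stack(channels, axis=-1)
```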
Image augmentations:
* `RandomHorizontalFlip(p=0.5)`
* `RandomRotation(degrees=15)`
* `RandomResizedCrop(scale=(0.85, 1.0), ratio=(0.8, 1.2))`
The models were then trained as follows:
* Cross-validation folds grouped by `PatientID`. See the section below on the CV scheme.
* A class-weighted loss with `weights=[2, 1, 1, 1, 1, 1]`.
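One plausible reading of the `weights=[2, 1, 1, 1, 1, 1]` setting is a per-class weighted BCE loss with the `any` label counted double, matching the competition metric; a sketch, assuming the `any` column comes first:

```python
import torch

# Assumed interpretation: "any" in position 0, weighted 2x, as in the
# competition's weighted log loss.
class_weights = torch.tensor([2.0, 1.0, 1.0, 1.0, 1.0, 1.0])

def weighted_bce(logits, targets):
    """Per-class weighted binary cross-entropy over (batch, 6) logits."""
    loss = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    return (loss * class_weights).sum(dim=1).div(class_weights.sum()).mean()
```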
Postprocessing:
The CV scheme was fixed using 5 pairs of CSV files agreed by the team, in the format `train_n.csv`
& `valid_n.csv`. The first version of these files was designed to prevent the same patient appearing
in the train & validation sets. A second version was made removing patients that were present in both
the train & stage 1 test sets, to prevent fitting to the overlapping patients on the stage 1 public LB.
Some of these models are trained with the V1 scheme and others with the V2 scheme (most of the team used
the latter).

These CSV files are included in a file called `team_folds.zip` and should be in the same folder
as the rest of the input data. The scheme is selected using the `cv_scheme` value in the config file.