*To run this notebook, please provide the following four file paths:*

In [0]:
path_to_train = '/path/to/train/images'
path_to_test = '/path/to/test/images'
path_to_labels = '/path/to/labels.csv'

path_to_save_model = '/path/to/save/model/to/cnn-checkpoint-{epoch:02d}-{val_accuracy:.2f}.hdf5'
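
*The `{epoch:02d}` and `{val_accuracy:.2f}` placeholders in `path_to_save_model` are filled in with Python string formatting when a checkpoint is written. The sketch below shows that substitution by hand; the epoch number and accuracy value are invented for illustration.*

```python
# Same template as path_to_save_model above.
template = '/path/to/save/model/to/cnn-checkpoint-{epoch:02d}-{val_accuracy:.2f}.hdf5'

# Hypothetical values: epoch 3 with a validation accuracy of 0.9137.
example_path = template.format(epoch=3, val_accuracy=0.9137)
print(example_path)  # /path/to/save/model/to/cnn-checkpoint-03-0.91.hdf5
```

*Every name used in the template has to be available when the path is formatted; `str.format` raises a `KeyError` for a name it is not given.*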

## **Installing & Importing Dependencies**

In [0]:
!pip install keras

In [0]:
import tensorflow as tf
import pandas as pd

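# Import everything from tensorflow.keras so that the model and the
# tf.keras.metrics objects used at compile time come from the same implementation
from tensorflow.keras import applications
from tensorflow.keras import optimizers
from tensorflow.keras import backend as k

from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from tensorflow.keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator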

In [0]:
# Make sure that GPU is available on the machine
assert tf.test.is_gpu_available()
assert tf.test.is_built_with_cuda()

## **Reading in Data Labels**

In [0]:
labels_df = pd.read_csv(path_to_labels)

# flow_from_dataframe with class_mode='binary' expects the label column to hold strings,
# so we cast the whole DataFrame to str
labels_df = labels_df.astype(str)

In [0]:
# Inspect the DataFrame containing the labels
labels_df.head(3)

| Unnamed: 0.1 | Unnamed: 0 | ID | any |
|---|---|---|---|
| 0 | 0 | ID_000039fa0.png | 0 |
| 1 | 1 | ID_00005679d.png | 0 |
| 2 | 2 | ID_00008ce3c.png | 0 |


## **Building Transfer-Learning Model**

*As our pretrained model, we choose VGG19 with ImageNet weights. We set `include_top=False` to drop VGG19's fully connected classifier head, which ends in a 1000-node softmax layer for the ImageNet classes.*

In [0]:
model = applications.VGG19(weights = "imagenet", include_top=False, input_shape = (128, 128, 3))

*We freeze the first three convolutional blocks and leave the last two trainable. This transfers the model's knowledge of low-level features (like edges and angles) while allowing it to learn new high-level features (like hemorrhages).*

In [0]:
for layer in model.layers[0:12]:
    layer.trainable = False
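
*To see why the slice `[0:12]` covers exactly the first three blocks, the sketch below rebuilds the layer order of VGG19 in plain Python. The layer names follow the Keras VGG19 naming convention and are written out by hand here as an assumption; they are not read from the loaded model.*

```python
# Number of convolutional layers in each of VGG19's five blocks.
convs_per_block = [2, 2, 4, 4, 4]

# Index 0 is the input layer; each block is its conv layers followed by one pooling layer.
layer_names = ['input']
for block, n_convs in enumerate(convs_per_block, start=1):
    layer_names += [f'block{block}_conv{i}' for i in range(1, n_convs + 1)]
    layer_names.append(f'block{block}_pool')

frozen = layer_names[0:12]    # same slice as in the cell above
trainable = layer_names[12:]

print(frozen[-1])     # block3_pool
print(trainable[0])   # block4_conv1
```

*On the real model, the same check can be done by printing `layer.name` and `layer.trainable` for each entry of `model.layers`.*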

*We append our own custom layers to the end of VGG19. Note that, for binary classification, a single sigmoid output node is equivalent to two softmax output nodes.*

In [0]:
x = model.output
x = Flatten()(x)

x = Dense(1000, activation = 'relu')(x)
x = Dropout(0.5)(x)
x = Dense(1000, activation = 'relu')(x)
output = Dense(1, activation = 'sigmoid')(x)

custom_model = Model(inputs = model.input, outputs = output)
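
*The sigmoid/softmax equivalence mentioned above can be checked numerically: a softmax over two logits assigns the positive class the same probability as a sigmoid applied to the difference of those logits. The sketch below uses arbitrary logit values and only the standard library.*

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax2(z0, z1):
    """Softmax over two logits, shifted by the max for numerical stability."""
    m = max(z0, z1)
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

z0, z1 = -0.4, 1.3                 # arbitrary logits for class 0 and class 1
p_softmax = softmax2(z0, z1)[1]    # probability of class 1 from two softmax nodes
p_sigmoid = sigmoid(z1 - z0)       # probability of class 1 from one sigmoid node

print(math.isclose(p_softmax, p_sigmoid))  # True
```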

*We compile the model. The primary metric we care about is recall, that is, the fraction of actual intracranial hemorrhages that the CNN detects.*

In [0]:
custom_model.compile(loss = 'binary_crossentropy',
                     # `learning_rate` replaces the deprecated `lr` argument
                     optimizer = optimizers.Adam(learning_rate=0.0001),
                     metrics=['accuracy',
                              tf.keras.metrics.Recall(),
                              tf.keras.metrics.AUC(),
                              tf.keras.metrics.Precision()])
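
*As a reminder of what the recall and precision metrics measure, the sketch below computes both from confusion-matrix counts. The counts are invented for illustration and do not come from this model.*

```python
def recall(tp, fn):
    """Fraction of actual positives (hemorrhages) that were detected."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp)

# Hypothetical counts: 80 hemorrhages found, 20 missed, 30 false alarms.
tp, fn, fp = 80, 20, 30

print(recall(tp, fn))       # 0.8
print(precision(tp, fp))
```

*A missed hemorrhage lowers recall, while a false alarm lowers precision, which is why recall is the metric to watch when misses are the costlier error.*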

## **Creating Train & Test Generators**

In [0]:
# Initializing train & test generators to flow train and test images straight from the folders
# that they are stored in
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.3,
                                   height_shift_range=0.3,
                                   rotation_range=30)

test_datagen = ImageDataGenerator(rescale = 1./255)

In [0]:
# We flow from a DataFrame: the images are not stored in class-specific folders.
# Instead, the filenames ('ID') and labels ('any') come from labels_df
train_generator = train_datagen.flow_from_dataframe(dataframe=labels_df,
                                                    directory= path_to_train,
                                                    x_col='ID',
                                                    y_col='any',
                                                    target_size=(128, 128),
                                                    class_mode='binary')

test_generator = test_datagen.flow_from_dataframe(dataframe=labels_df,
                                                  directory=path_to_test,
                                                  x_col='ID',
                                                  y_col='any',
                                                  target_size=(128, 128),
                                                  class_mode='binary')
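
*Neither generator sets `batch_size`, so both fall back to the `flow_from_dataframe` default of 32 images per batch. The sketch below shows how the number of batches per epoch follows from the number of images found; the image count is a made-up example.*

```python
import math

def batches_per_epoch(n_images, batch_size=32):
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(n_images / batch_size)

# Hypothetical dataset size.
print(batches_per_epoch(1000))  # 32
```

*On the real generators, `len(train_generator)` reports this value.*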

## **Fitting the Model**

In [0]:
# Callbacks
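# The monitored name must match the metric logged during training: with metrics=['accuracy'],
# the validation key is 'val_accuracy' (the same name used in path_to_save_model)
checkpoint = ModelCheckpoint(path_to_save_model,
                             monitor='val_accuracy',
                             verbose=1,
                             save_best_only=False,
                             save_weights_only=False,
                             mode='auto',
                             period=1)
early_stopper = EarlyStopping(monitor='val_accuracy', min_delta=0, patience=3, verbose=1, mode='auto')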
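
*The early-stopping rule configured above (stop once the monitored metric has not improved for `patience` consecutive epochs) can be sketched in plain Python. This is a simplified reimplementation for illustration, not Keras's own code, and the accuracy history is invented.*

```python
def early_stop_epoch(val_accuracies, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float('-inf')
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc - min_delta > best:
            # Improvement: remember the new best value and reset the counter.
            best = acc
            wait = 0
        else:
            # No improvement: count the epoch and stop once patience is exhausted.
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation accuracies per epoch.
history = [0.70, 0.75, 0.74, 0.75, 0.73, 0.76]
print(early_stop_epoch(history))  # 5
```

*In this history, the best value of 0.75 is reached at epoch 2 and is not exceeded in epochs 3 to 5, so training stops at epoch 5 and never reaches the 0.76 of epoch 6. A larger `patience` trades longer training for robustness against such plateaus.*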

In [0]:
# Fitting the model 
custom_model.fit_generator(train_generator,
                           epochs = 50,
                           validation_data = test_generator,
                           callbacks = [checkpoint, early_stopper])