# Semantic Segmentation Using U-Net for Pneumonia Detection on the RSNA Dataset

## INTRODUCTION

Medical imaging and the application of artificial intelligence to diagnosis have been hot topics since the onset of deep learning, and many datasets have appeared in this field.

This project is an attempt at using the U-Net architecture to diagnose pneumonia in chest X-ray images. The [RSNA dataset](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/) from Kaggle was used to train the model. The project taught me how to structure a deep learning project, follow coding standards and, more importantly, gave me insights into image segmentation.

Even though this was a region-localization dataset, the results were promising. The dataset provides bounding boxes for the pneumonia regions, so masks generated from the bounding-box coordinates inevitably include some class overlap; even so, the model did a decent job of predicting the class maps. After drawing bounding boxes around the predicted pneumonia regions and computing the metrics on them, the model achieves good results.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

## ABOUT THE DATASET

### 2.1 Dataset

The dataset was imported directly from the Kaggle pneumonia detection challenge page onto a Google Colab instance. It consists of close to 30,000 images in .dcm format, with separate train and validation sets of roughly 27,000 and 4,000 images. The images have a resolution of 1024x1024. A .csv file was also provided as the train labels; it contains the bounding-box coordinates of the pneumonia region in each image.

### 2.2 Data Generator class for efficient data flow.

The primary step of the data generator class is to convert the files from .dcm format to .jpg, followed by resizing the image to 256x256 pixels. Along with each image, a mask is prepared using the bounding-box coordinates from the .csv file.
The mask is binary, with value 1 inside the bounding box and 0 outside it. The mask rectangle is first drawn on a 1024x1024x1 NumPy array, then resized to 256x256, and finally converted to one-hot encodings using to_categorical(). Binary value 1 represents pneumonia-affected regions and 0 represents normal regions.
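The mask-generation step can be sketched as follows. This is a minimal NumPy illustration, assuming boxes are given as (x, y, width, height) tuples; it uses nearest-neighbour slicing in place of a full image resize and `np.eye` in place of to_categorical(), so it only mirrors the idea, not the actual code:

```python
import numpy as np

def make_mask(boxes, src=1024, dst=256):
    """Draw bounding boxes as a binary mask, downsample, then one-hot encode.

    `boxes` holds (x, y, width, height) tuples in source-pixel coordinates
    (an illustrative layout; the real .csv columns may differ).
    """
    mask = np.zeros((src, src), dtype=np.uint8)
    for x, y, w, h in boxes:
        mask[y:y + h, x:x + w] = 1          # 1 inside the box, 0 outside
    scale = src // dst
    small = mask[::scale, ::scale]          # nearest-neighbour downsample to dst x dst
    return np.eye(2, dtype=np.float32)[small]   # one-hot, shape (dst, dst, 2)
```

Indexing `np.eye(2)` with the binary mask maps pixel value 0 to [1, 0] and value 1 to [0, 1], which is exactly the two-channel encoding the generator returns.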

The data generator yields one batch of images at a time, matching the specified batch size. The returned arrays have the shapes

```python
images : (self.batch_size, self.img_height, self.img_width, 1)
masks  : (self.batch_size, self.img_height, self.img_width, 2)
```
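A minimal sketch of such a generator class, assuming illustrative attribute names (`image_ids`, `batch_size`, `img_height`, `img_width`) and omitting the actual .dcm loading and mask building:

```python
import numpy as np

class DataGenerator:
    """Batch generator sketch; the real class also converts .dcm files
    to .jpg and builds masks from the .csv bounding boxes."""

    def __init__(self, image_ids, batch_size=8, img_height=256, img_width=256):
        self.image_ids = image_ids
        self.batch_size = batch_size
        self.img_height = img_height
        self.img_width = img_width

    def __len__(self):
        # Number of full batches per epoch
        return len(self.image_ids) // self.batch_size

    def __getitem__(self, idx):
        batch_ids = self.image_ids[idx * self.batch_size:(idx + 1) * self.batch_size]
        images = np.zeros((self.batch_size, self.img_height, self.img_width, 1), np.float32)
        masks = np.zeros((self.batch_size, self.img_height, self.img_width, 2), np.float32)
        for i, img_id in enumerate(batch_ids):
            # Placeholder: load and preprocess the image and mask for img_id here
            pass
        return images, masks
```

With `keras.utils.Sequence` as the base class, Keras can consume such a generator directly in `model.fit`.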
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

## METHOD

### 3.1 Model build

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/unet.png?raw=true" height="300" align = "right">

The model is similar to a standard U-Net architecture, consisting of an encoder path and a decoder path. The encoder is a stack of convolution and max-pooling layers, which captures the context of the image.
The decoder path is symmetric to the encoder: it is a stack of upsampling and convolution layers.
<br />
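The encoder/decoder structure can be sketched in Keras as below. This is a reduced illustration with fewer filters and levels than the full model, not the repository's actual architecture:

```python
from tensorflow.keras import layers, Model

def build_unet(input_shape=(256, 256, 1), n_classes=2):
    """Compact U-Net sketch: encoder, bottleneck, decoder with skip connections."""
    inputs = layers.Input(input_shape)
    # Encoder: convolutions + max pooling capture context
    c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D()(c2)
    # Bottleneck
    b = layers.Conv2D(64, 3, activation="relu", padding="same")(p2)
    # Decoder: upsampling + convolutions, concatenated with encoder features
    u2 = layers.concatenate([layers.UpSampling2D()(b), c2])
    c3 = layers.Conv2D(32, 3, activation="relu", padding="same")(u2)
    u1 = layers.concatenate([layers.UpSampling2D()(c3), c1])
    c4 = layers.Conv2D(16, 3, activation="relu", padding="same")(u1)
    # Softmax over the two classes per pixel
    outputs = layers.Conv2D(n_classes, 1, activation="softmax")(c4)
    return Model(inputs, outputs)
```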

### Activation functions.

ReLU activation was used in the convolution layers. The output layer uses a softmax function, which gives the predicted class label for each pixel.

### Loss function.
Dice loss was used as the loss function, with Mean IoU as the evaluation metric.

Dice loss is calculated by:

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/diceloss.png?raw=true" align = "center">

Mean IoU is calculated by:

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/Iou.jpg?raw=true" align = "center" width="200">

where TP is the true positives (correct class predictions), FP is the false positives (class 0 predicted as class 1), and FN is the false negatives (class 1 predicted as class 0).
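Written out with these definitions, the two measures reduce to the standard forms:

```latex
\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN},
\qquad
\mathrm{IoU} = \frac{TP}{TP + FP + FN}
```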


**Function for implementing dice loss:**

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/Loss function.jpg?raw=true" align = "center">
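The idea behind the function can be sketched in NumPy (the actual implementation shown above operates on Keras tensors; the smoothing constant here is an assumption, added to avoid division by zero on empty masks):

```python
import numpy as np

def dice_loss(y_true, y_pred, smooth=1.0):
    """Dice loss = 1 - Dice coefficient, computed over flattened arrays."""
    y_true_f = y_true.reshape(-1).astype(np.float64)
    y_pred_f = y_pred.reshape(-1).astype(np.float64)
    intersection = np.sum(y_true_f * y_pred_f)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true_f) + np.sum(y_pred_f) + smooth)
    return 1.0 - dice
```

A perfect prediction gives a loss of 0, and a completely wrong one approaches 1, which is why minimizing this loss maximizes the overlap between predicted and true masks.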


## Training the model.

The model was trained on a Google Colab instance for 50 epochs. The callbacks used were a checkpoint (to save the model at every epoch) and early stopping with min_delta = 0.0001 on the validation accuracy.
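An illustrative sketch of this training setup with Keras callbacks; the checkpoint file name and the early-stopping patience are assumptions, since neither is stated above:

```python
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [
    # Hypothetical file pattern: one saved model per epoch
    ModelCheckpoint("unet_epoch_{epoch:02d}.h5", save_freq="epoch"),
    # Stop when val_accuracy improves by less than min_delta
    # (patience value here is an assumption)
    EarlyStopping(monitor="val_accuracy", min_delta=0.0001, patience=5),
]
# model.fit(train_gen, validation_data=val_gen, epochs=50, callbacks=callbacks)
```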

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

## RESULTS

The model achieved a Dice coefficient of 0.65 and a Mean IoU of 0.5 on the validation set.
These measures describe how well the model predicts the segmentation map. However, since the dataset provides bounding boxes rather than a true segmentation mask, the intended end result is bounding boxes for the pneumonia regions.
Drawing bounding boxes around the predicted pneumonia regions and comparing them with the ground truth gives the results below.

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/Validation score.jpg?raw=true" width="900">

Dice coefficient of **0.97** and Mean IoU of **0.95**


### Learning curves:

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/learning curve.jpg?raw=true" width="400" height="400"> |<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/Dice curve.jpg?raw=true" width="400" height="400">
------------ | -------------

-------------------------------
### Some predicted x-rays.

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/R1.jpg?raw=true" width="900">

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/R2.jpg?raw=true" width="900">

---------------------

### Results with bounding boxes:

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/R3.jpg?raw=true" width="800">

<img src="https://github.com/harikishorep122/UNET/blob/main/Final_results/R4.jpg?raw=true" width="800">

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


## CONCLUSION

This project was a great learning experience for me. I learned how to preprocess a dataset and how to code a data generator to feed data while training the model.

After this project I could better appreciate how the U-Net architecture works for image segmentation. I understood the encoder and decoder branches of the network and how they first capture the context of the image and then predict a class for each pixel.

Finally, it was a great experience to code the whole workflow. I learned standard coding practices for data preprocessing, building the network architecture, training the model with proper callbacks and checkpoints, and finally testing the model.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

## BIBLIOGRAPHY

1. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation.
2. https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47
3. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/