---
languages:
- python
products:
- Azure Machine Learning Service
- Tensorflow 2.0
description: "Image segmentation TensorFlow model for Cardiac MRI's"
---

![sample_prediction](readme/sample_prediction.png)

# Image Segmentation of Cardiac MRIs with Azure Machine Learning Service and TensorFlow

Many important deep learning applications in image analysis need to go beyond detecting individual objects within an image and instead segment the image into spatial regions of interest. For example, in medical image analysis it is often important to separate the pixels corresponding to different types of tissue, blood, or abnormal cells so that we can isolate a particular organ. In this self-paced, hands-on lab we will use the [TensorFlow](https://www.tensorflow.org/) machine learning framework to train and evaluate an image segmentation network on a medical imaging dataset.

## Objectives

* Understand how neural networks can solve imaging problems
* Use transposed convolutional neural networks
* Use Keras and TensorFlow 2 to analyze image data
* Use Azure Machine Learning Service

|Technology|
|----------|
|Azure Machine Learning |
|TensorFlow 2.0 |
|U-Net architecture |

## Why Azure Machine Learning Service?

![mlops](readme/ml-lifecycle.png)

Azure Machine Learning Service lets you apply MLOps techniques, which help data scientists and app developers bring ML models to production.

These MLOps capabilities let you track, version, audit, certify, and reuse every asset in your ML lifecycle, and they provide orchestration services to streamline managing that lifecycle.

### What are the key challenges we wish to solve with MLOps?

![mlops_flow](readme/mlops.png)

**Model reproducibility & versioning**
- Track, snapshot & manage assets used to create the model
- Enable collaboration and sharing of ML pipelines

**Model packaging & validation**
- Support model portability across a variety of platforms
- Certify model performance meets functional and latency requirements

**[Not implemented] Model auditability & explainability**
- Maintain asset integrity & persist access control logs
- Certify model behavior meets regulatory & adversarial standards

**[Not implemented] Model deployment & monitoring**
- Release models with confidence
- Monitor & know when to retrain by analyzing signals such as data drift

## Deploy Infrastructure

If you don't have Azure Machine Learning up and running yet, follow the documentation below.

#### [Set up Azure Resources with Azure Resource Manager (ARM)](./infrastructure/README.md)

## Virtual environment for running the Azure Machine Learning notebook

### Anaconda and local Jupyter Notebook

To create the virtual environment, Anaconda must be installed on your computer. It can be downloaded from this [link](https://www.anaconda.com/download/).

- Installation: https://www.anaconda.com/distribution/
- Create the local conda environment from environment.yml: ```conda env create -f environment.yml```
- Register the conda environment as a Jupyter Notebook kernel: ```python -m ipykernel install --user --name <environment_name> --display-name "Python (<environment_name>)"```

Once the environment is created, activate it with:

`conda activate <environment_name>`

To deactivate the environment (no environment name is needed):

`conda deactivate`

### Dataset

Cardiac MRI short-axis (SAX) scans:

- Sunnybrook cardiac images from an earlier competition:
  http://smial.sri.utoronto.ca/LV_Challenge/Data.html
- The "Sunnybrook Cardiac MR Database" is made available under the CC0 1.0 Universal license, described in more detail here:
  http://creativecommons.org/publicdomain/zero/1.0/

The original images are in 256 x 256 grayscale DICOM format.

- The output is a tensor of size 256 x 256 x 2
- Each pixel belongs to one of two classes
- The training set consists of 234 images
- The validation set consists of 26 images

Download the TFRecords here: https://drive.google.com/drive/folders/1bYoJMycuKsTg2LUWRKircUzjtQYYGzNf?usp=sharing

#### Attribution

Radau P, Lu Y, Connelly K, Paul G, Dick AJ, Wright GA. "Evaluation Framework for Algorithms Segmenting Short Axis Cardiac MRI." The MIDAS Journal - Cardiac MR Left Ventricle Segmentation Challenge, http://hdl.handle.net/10380/3070

# Image Segmentation

In this repository you will work through a series of exercises performing image segmentation, also called semantic segmentation. Semantic segmentation is the task of assigning each pixel to a specific class. In a sense it's a classification problem where you classify each pixel rather than the entire image. In this lab the task is to classify each pixel in a cardiac MRI image according to whether or not it is part of the left ventricle (LV).
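
As a minimal sketch of what classifying on a pixel basis looks like, the snippet below takes per-pixel class scores, picks the highest-scoring class for each pixel, and measures how many pixels match a ground-truth mask. The tiny 4 x 4 image, the score values, and the helper names `predict_mask` and `pixel_accuracy` are assumptions made for illustration; they are not taken from the lab's notebooks.

```python
import numpy as np


def predict_mask(scores):
    """Return the highest-scoring class index for every pixel.

    scores has shape (height, width, num_classes).
    """
    return np.argmax(scores, axis=-1)


def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(predicted == truth))


# Hypothetical ground truth: class 1 (LV) in a 2 x 2 block, class 0 elsewhere.
truth = np.zeros((4, 4), dtype=np.int64)
truth[1:3, 1:3] = 1

# Hypothetical scores: channel 0 is "not LV", channel 1 is "LV".
scores = np.zeros((4, 4, 2))
scores[..., 0] = 0.9
scores[..., 1] = 0.1
scores[1:3, 1:3, 0] = 0.2
scores[1:3, 1:3, 1] = 0.8
scores[0, 0] = [0.3, 0.7]  # one deliberately misclassified background pixel

predicted = predict_mask(scores)
accuracy = pixel_accuracy(predicted, truth)
print(accuracy)  # 0.9375 (15 of 16 pixels correct)
```

With a trained network the scores would come from the model's output and the images would be 256 x 256, but the per-pixel argmax is the same idea.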

This lab is not an introduction to deep learning, nor is it intended to be a rigorous mathematical treatment of convolutional neural networks. We'll assume that you have at least a passing understanding of neural networks, including concepts like forward and backpropagation, activations, SGD, convolutions, pooling, bias, and the like. It is helpful if you've already encountered convolutional neural networks (CNNs) and you understand image recognition tasks. The lab uses Google's TensorFlow machine learning framework, so Python and TensorFlow experience is helpful but not required. Most of the work we'll do in this lab is not coding per se, but setting up and running training and evaluation tasks using TensorFlow.

## Input Data Set

![dataset](readme/dataset_sample.png)

The data set you'll be using is a series of cardiac images (specifically MRI short-axis (SAX) scans) that have been expertly labeled. See [References](#References) for full citation information.

Four representative examples of the data are shown above. Each row of images is an instance of the data. On the left are the MRI images and on the right are the expertly segmented regions (often called contours). The portions of the images that are part of the LV are shown in white. Note that the size of the LV varies from image to image, but the LV typically takes up a relatively small region of the entire image.

Extracting the data from the raw images and preparing those images for ingestion into TensorFlow are not showcased in this lab. Suffice it to say that data preparation is a non-trivial aspect of machine learning workflows and is outside the scope of this lab.

For those interested in the details, we obtained guidance and partial code from a prior [Kaggle competition](https://www.kaggle.com/c/second-annual-data-science-bowl/details/deep-learning-tutorial) on how to extract the images properly. We then converted the images to TensorFlow records (TFRecords) and stored them in files. [TFRecords](https://www.tensorflow.org/programmers_guide/reading_data) are a special file format provided by TensorFlow that lets you use built-in TensorFlow functions for data management, including multi-threaded data reading and sophisticated pre-processing such as randomizing and even augmenting the training data.
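
The TFRecord round trip described above can be sketched as follows, assuming TensorFlow 2.4 or later is installed. The snippet writes a few synthetic image/label pairs to a TFRecord file and reads them back through a `tf.data` pipeline with parallel parsing, shuffling, and batching. The feature names `image_raw` and `label_raw`, the dtypes, and the synthetic data are hypothetical; the TFRecords distributed with this lab may use a different layout.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

IMAGE_SHAPE = (256, 256, 1)  # grayscale image
LABEL_SHAPE = (256, 256)     # one class index per pixel


def _bytes_feature(array):
    """Wrap the raw bytes of a NumPy array in a tf.train.Feature."""
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[array.tobytes()]))


def serialize_example(image, label):
    """Pack one image/label pair into a serialized tf.train.Example."""
    # "image_raw" and "label_raw" are hypothetical feature names.
    feature = {"image_raw": _bytes_feature(image), "label_raw": _bytes_feature(label)}
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()


def parse_example(serialized):
    """Decode one serialized example back into image and label tensors."""
    spec = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "label_raw": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    image = tf.reshape(tf.io.decode_raw(parsed["image_raw"], tf.float32), IMAGE_SHAPE)
    label = tf.reshape(tf.io.decode_raw(parsed["label_raw"], tf.uint8), LABEL_SHAPE)
    return image, label


# Write four synthetic records to a temporary file.
record_path = os.path.join(tempfile.mkdtemp(), "synthetic.tfrecords")
rng = np.random.default_rng(0)
with tf.io.TFRecordWriter(record_path) as writer:
    for _ in range(4):
        image = rng.random(IMAGE_SHAPE, dtype=np.float32)
        label = (rng.random(LABEL_SHAPE) > 0.9).astype(np.uint8)
        writer.write(serialize_example(image, label))

# Read them back: parse in parallel, shuffle, and batch.
dataset = (
    tf.data.TFRecordDataset(record_path)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(buffer_size=4)
    .batch(2)
)

for images, labels in dataset.take(1):
    print(images.shape, labels.shape)
```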

The images themselves are originally in 256 x 256 grayscale [DICOM](https://en.wikipedia.org/wiki/DICOM) format, a common image format in medical imaging. The label is a tensor of size 256 x 256 x 2. The last dimension is 2 because each pixel belongs to one of two classes, so each pixel label is a vector of size 2. The training set contains 234 images and the validation set (data NOT used for training but used to test the accuracy of the model) contains 26 images.
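
To make the label shape concrete, here is a small NumPy sketch that turns a 256 x 256 mask of class indices into a 256 x 256 x 2 one-hot tensor. The rectangular region and the helper name `one_hot_labels` are assumptions for illustration, not the lab's actual preprocessing.

```python
import numpy as np

NUM_CLASSES = 2  # class 0: not LV, class 1: LV


def one_hot_labels(mask, num_classes=NUM_CLASSES):
    """Turn an (H, W) integer class mask into an (H, W, num_classes) tensor."""
    # Indexing the identity matrix with the mask selects one row per pixel.
    return np.eye(num_classes, dtype=np.float32)[mask]


# Hypothetical mask with a rectangular "LV" region.
mask = np.zeros((256, 256), dtype=np.int64)
mask[100:140, 110:150] = 1

labels = one_hot_labels(mask)
print(labels.shape)  # (256, 256, 2)
```

Each pixel's vector contains a single 1 in the position of its class, which is why the last dimension equals the number of classes.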

# TensorBoard

In this experiment we generate TensorFlow logs. TensorBoard lets you visualize many aspects of your program. In TensorBoard, you can see a visual representation of your computation graph and plot metrics such as loss, accuracy, and learning rate. Essentially any data generated during the execution of TensorFlow can be displayed by TensorBoard with the addition of a few extra API calls in your program.

The aml-pipeline.ipynb notebook contains the corresponding cells to run TensorBoard after model training.

![tensorboard](readme/tensorboard_execution.png)

TensorBoard hooks into Keras through a [training callback](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard).
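
A minimal sketch of that hook, assuming TensorFlow 2 is installed: the tiny dense model and random data are stand-ins for the segmentation network and the MRI data, and the log directory is a temporary folder chosen for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical log location; any writable directory works.
log_dir = os.path.join(tempfile.mkdtemp(), "logs")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)

# Stand-in model: a single dense layer classifying 8 features into 2 classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Random stand-in data.
features = np.random.rand(32, 8).astype("float32")
targets = np.random.randint(0, 2, size=(32,))

# The callback writes event files to log_dir while the model trains.
history = model.fit(
    features, targets, epochs=1, verbose=0, callbacks=[tensorboard_callback]
)
print(sorted(os.listdir(log_dir)))
```

Pointing TensorBoard at the directory with `tensorboard --logdir <log_dir>` then shows the logged loss and accuracy.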

<a id="References"></a>
# References

Thanks to **NVIDIA** and the **TensorFlow** community for making this dataset available and giving everyone the opportunity to train on, analyze, and test with this kind of image, which is usually private.

- [TensorFlow 2.0](https://www.tensorflow.org/learn)
- [Azure Machine Learning](https://azure.microsoft.com/es-es/services/machine-learning/)
- [U-Net](https://arxiv.org/abs/1505.04597)

<a id="1"></a>
[1] Sunnybrook cardiac images from an earlier competition: https://smial.sri.utoronto.ca/LV_Challenge/Data.html

<a id="2"></a>
[2] The "Sunnybrook Cardiac MR Database" is made available under the CC0 1.0 Universal license; more detail here: https://www.cardiacatlas.org/studies/sunnybrook-cardiac-data/

<a id="3"></a>
[3] Attribution:
Radau P, Lu Y, Connelly K, Paul G, Dick AJ, Wright GA. "Evaluation Framework for Algorithms Segmenting Short Axis Cardiac MRI." The MIDAS Journal - Cardiac MR Left Ventricle Segmentation Challenge, http://hdl.handle.net/10380/3070

<a id="4"></a>
[4] http://fcn.berkeleyvision.org/

<a id="5"></a>
[5] Long, Shelhamer, Darrell; "Fully Convolutional Networks for Semantic Segmentation", CVPR 2015.

<a id="6"></a>
[6] Zeiler, Krishnan, Taylor, Fergus; "Deconvolutional Networks", CVPR 2010.

<a id="7"></a>
[7] https://www.kaggle.com/c/second-annual-data-science-bowl/details/deep-learning-tutorial

<a id="8"></a>
[8] https://www.tensorflow.org/tutorials/images/segmentation