![Salmon-logo-1](images/salmon.JPG)
# SALMON v.2: Segmentation deep learning ALgorithm based on MONai toolbox
- SALMON is a computational toolbox for segmentation using neural networks (3D patch-based segmentation)
- SALMON is based on MONAI 0.7.0: a PyTorch-based, open-source framework for deep learning in healthcare imaging.
(https://github.com/Project-MONAI/MONAI)
(https://github.com/MIC-DKFZ/nnUNet)
(https://arxiv.org/abs/2103.10504)

This is my "open-box" version, for when I want to modify the parameters for a particular task, while the two frameworks linked above are hard-coded. The "monai 0.5.0" folder contains the previous versions, based on the older MONAI release.

*******************************************************************************
## Requirements
Follow the steps in "installation_commands.txt": install Anaconda, create a virtual environment, and install the Python libraries and PyTorch/CUDA there.
*******************************************************************************
## Python scripts and their function

- organize_folder_structure.py: Organizes the data into the folder structure (training, validation, testing) required by the network.
Labels are resampled and resized to the corresponding image, to avoid array size conflicts. You can also set a new image resolution for the dataset here.

- init.py: List of options used to train the network.

- check_loader_patches.py: Shows examples of the patches fed to the network during training.

- networks.py: The architectures available for segmentation are nn-UNet and UNETR (based on Vision Transformers).

- train.py: Runs the training.

- predict_single_image.py: Launches inference on a single input image chosen by the user.
*******************************************************************************
## Usage
### Folder structure:

First use "organize_folder_structure.py" to organize the data.
Modify the input parameters to select the two folders (images and labels) containing the dataset. Set the resolution of the images here before training. A sketch of the label-resampling step is shown after the folder trees below.

    .
    ├── Data_folder
    |   ├── CT
    |   |   ├── 1.nii
    |   |   ├── 2.nii
    |   |   └── 3.nii
    |   ├── CT_labels
    |   |   ├── 1.nii
    |   |   ├── 2.nii
    |   |   └── 3.nii

Data structure after running it:

    .
    ├── Data_folder
    |   ├── CT
    |   ├── CT_labels
    |   ├── images
    |   |   ├── train
    |   |   |   ├── image1.nii
    |   |   |   └── image2.nii
    |   |   ├── val
    |   |   |   ├── image3.nii
    |   |   |   └── image4.nii
    |   |   └── test
    |   |   |   ├── image5.nii
    |   |   |   └── image6.nii
    |   ├── labels
    |   |   ├── train
    |   |   |   ├── label1.nii
    |   |   |   └── label2.nii
    |   |   ├── val
    |   |   |   ├── label3.nii
    |   |   |   └── label4.nii
    |   |   └── test
    |   |   |   ├── label5.nii
    |   |   |   └── label6.nii

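For reference, the resampling step that keeps each label aligned with its image can be pictured with SimpleITK as in the sketch below. This is only an illustration of the idea, not the exact code of "organize_folder_structure.py": the file paths and the nearest-neighbour choice are assumptions.

```python
import SimpleITK as sitk

# Minimal sketch (assumption): resample a label onto its image grid with
# nearest-neighbour interpolation so the two arrays always match in size.
image = sitk.ReadImage("./Data_folder/CT/1.nii")
label = sitk.ReadImage("./Data_folder/CT_labels/1.nii")

resampled_label = sitk.Resample(
    label,                     # volume to resample
    image,                     # reference grid (size, spacing, origin, direction)
    sitk.Transform(),          # identity transform
    sitk.sitkNearestNeighbor,  # keep label values discrete
    0,                         # default (background) value
    label.GetPixelID(),
)

sitk.WriteImage(resampled_label, "./Data_folder/CT_labels/1_resampled.nii")
```
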
*******************************************************************************
### Training:
- Modify "init.py" to set the parameters and start the training/testing on the data. Read the description of each parameter.
- Then launch "train.py" for training. Tensorboard is available to monitor the training (a "runs" folder is created).
- Check and modify the train_transforms applied to the images in "train.py" for your specific case (e.g. the last update adds HU windowing for CT images; see the sketch below).

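As an orientation for the previous point, a CT-oriented train_transforms chain in MONAI could look like the following sketch. The dictionary keys, HU window, spacing and patch size are placeholders, so compare them with the actual pipeline in "train.py".

```python
from monai.transforms import (
    Compose, LoadImaged, EnsureChannelFirstd, Spacingd,
    ScaleIntensityRanged, RandCropByPosNegLabeld, ToTensord,
)

# Minimal sketch (assumption): dictionary-based transforms with a CT HU window.
# The window (-1000..1000 HU), spacing and patch size are placeholders.
train_transforms = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
    Spacingd(keys=["image", "label"], pixdim=(1.0, 1.0, 1.0), mode=("bilinear", "nearest")),
    ScaleIntensityRanged(keys=["image"], a_min=-1000, a_max=1000, b_min=0.0, b_max=1.0, clip=True),
    RandCropByPosNegLabeld(
        keys=["image", "label"], label_key="label",
        spatial_size=(96, 96, 96), pos=1, neg=1, num_samples=4, image_key="image",
    ),
    ToTensord(keys=["image", "label"]),
])
```
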
Sample images: the following images show the segmentation of the carotid artery from an MRI sequence

![Image](images/image.gif)![result](images/result.gif)

Sample images: the following images show the multi-label segmentation of the prostate transition zone and peripheral zone from an MRI sequence

![Image1](images/prostate.gif)![result1](images/prostate_inf.gif)

*******************************************************************************
### Inference:
- Launch "predict_single_image.py" to test the network. Modify the parameters in the parse section to select the paths of the weights, the image to infer, and the result.
- You can test the model on a new image with a different size and resolution from the training data. The script will resample it before inference and return a mask with the same size and resolution as the source image (a rough sketch of the patch-wise prediction follows below).
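
To picture what happens during inference, patch-based prediction over a whole volume is typically done with MONAI's sliding-window inferer, roughly as sketched below. The network and volume here are stand-ins, not the actual objects built by "predict_single_image.py".

```python
import torch
from monai.inferers import sliding_window_inference

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real model from networks.py (assumption): any callable that
# maps (B, 1, d, h, w) patches to (B, num_classes, d, h, w) logits works here.
net = torch.nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3, padding=1).to(device).eval()

# Stand-in for the pre-processed volume that predict_single_image.py builds
# from the NIfTI file chosen by the user.
volume = torch.zeros(1, 1, 160, 160, 96, device=device)

with torch.no_grad():
    logits = sliding_window_inference(
        inputs=volume,
        roi_size=(96, 96, 96),   # placeholder patch size
        sw_batch_size=4,
        predictor=net,
        overlap=0.5,
    )
    # Discrete mask on the same grid as the (resampled) input volume.
    mask = logits.argmax(dim=1, keepdim=True)
```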
*******************************************************************************
### Tips:
- Use and modify "check_loader_patches.py" to check the patches fed during training.
- "networks.py" calls the nn-UNet, which adapts itself to the input data (resolution and patch size). The script also saves the graph of your network, so you can visualize it (a rough configuration sketch follows this list).
- "networks.py" also includes UNETR (based on Vision Transformers). This is experimental. For more info check (https://arxiv.org/abs/2103.10504) and https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/unetr_btcv_segmentation_3d.ipynb
- It is possible to add other networks, but for segmentation the U-Net architecture is the state of the art.

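For orientation, MONAI exposes an nn-UNet-style architecture as DynUNet; a configuration in the spirit of what "networks.py" builds might look like the sketch below. The kernels and strides are hard-coded placeholders here, whereas the real script derives them from your data.

```python
from monai.networks.nets import DynUNet

# Illustrative values only (assumption): a 5-level isotropic configuration,
# e.g. for ~1 mm spacing and 96^3 patches.
kernels = [[3, 3, 3]] * 5
strides = [[1, 1, 1], [2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]]

net = DynUNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,
    kernel_size=kernels,
    strides=strides,
    upsample_kernel_size=strides[1:],
    norm_name="instance",
    deep_supervision=False,
    res_block=True,
)
```
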
### Sample script inference
- The label can be omitted (None) if you segment an unknown image. You have to add --resolution if you resampled the data during training (look at the argparse section in the code).
```console
python predict_single_image.py --image './Data_folder/image.nii' --label './Data_folder/label.nii' --result './Data_folder/prova.nii' --weights './best_metric_model.pth'
```
*******************************************************************************

### Some notes:
- Tensorboard can show you all the segmented channels, but for now the metric is the mean Dice (averaged over all channels). If you want to evaluate the Dice score for each channel you
  have to modify the plot_dice function a bit. I will do it...one day...who knows...maybe not.
- The loss is DiceLoss + CrossEntropy (a minimal sketch of the combined loss follows these notes). You can modify it if you want to try others (https://docs.monai.io/en/latest/losses.html#diceloss)

Check more examples at https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/.
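
If you want to see what the combined objective looks like in MONAI terms, the sketch below uses DiceCELoss, which bundles DiceLoss and CrossEntropy in one call; the arguments and the toy tensors are assumptions, not the exact setup of "train.py".

```python
import torch
from monai.losses import DiceCELoss

# Minimal sketch (assumption): Dice + CrossEntropy with one-hot targets.
loss_function = DiceCELoss(to_onehot_y=True, softmax=True)

# Toy example: batch of 2, two output channels, 32^3 patches, integer labels.
logits = torch.randn(2, 2, 32, 32, 32)
labels = torch.randint(0, 2, (2, 1, 32, 32, 32))
loss = loss_function(logits, labels)
```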

### UNETR Notes from the authors:

Feature_size and pos_embed are the parameters that need to be changed to adapt the model to your application of interest. The other parameters mentioned come from the Vision Transformer (ViT) default hyper-parameters (original architecture). In addition, the new revision of the UNETR paper with more descriptions is now publicly available. Please check it for more details:
https://arxiv.org/pdf/2103.10504.pdf

Now let's look at each of these hyper-parameters in order of importance (a configuration sketch follows the list):

- feature_size: In UNETR, we multiply the size of the CNN-based features in the decoder by a factor of 2 at every resolution (just like the original U-Net paper). By default, we set this value to 16 (to make the entire network lighter). However, using larger values such as 32 can improve the segmentation performance if GPU memory is not an issue. Figure 2 of the paper also shows this in detail.

- pos_embed: This determines how the image is divided into non-overlapping patches. Essentially, there are 2 ways to achieve this (by setting it to conv or perceptron). Let's dive further into it for more information:
First is by directly applying a convolutional layer with the same stride and kernel size as the patch size and with the feature size of the hidden size in the ViT model. Second is by first breaking the image into patches by properly resizing the tensor (for which we use einops) and then feeding it into a perceptron (linear) layer with the hidden size of the ViT model. Our experiments show that for certain applications such as brain segmentation with multiple modalities (e.g. 4 modes such as T1, T2 etc.), using the convolutional layer works better as it takes into account all modes concurrently. For CT images (e.g. BTCV multi-organ segmentation), we did not see any difference in performance between these two approaches.

- hidden_size: This is the size of the hidden layers in the ViT encoder. We follow the original ViT model and set this value to 768. In addition, the hidden size should be divisible by the number of attention heads in the ViT model.

- num_heads: In the multi-headed self-attention block, this is the number of attention heads. Following the ViT architecture, we set it to 12.

- mlp_dim: This is the dimension of the multi-layer perceptrons (MLP) in the transformer encoder. Again, we follow the ViT model and set this to 3072 as the default value to be consistent with their architecture.
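
Putting those defaults together, a MONAI UNETR could be instantiated roughly as below; the input channels, number of classes and patch size are assumptions for a single-modality volume, not values taken from SALMON.

```python
from monai.networks.nets import UNETR

# Sketch (assumption): single-channel input, two output classes, 96^3 patches.
model = UNETR(
    in_channels=1,
    out_channels=2,
    img_size=(96, 96, 96),
    feature_size=16,        # try 32 if GPU memory allows
    hidden_size=768,        # must be divisible by num_heads
    mlp_dim=3072,
    num_heads=12,
    pos_embed="conv",       # or "perceptron"
    norm_name="instance",
    res_block=True,
    dropout_rate=0.0,
)
```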