# Survival-Prediction
This repository contains our code for the 2020 challenge "Predicting lung cancer survival time" (proposed by Owkin and hosted by ENS - Collège de France), done jointly with Léon Zheng. We took part in this challenge as part of the MVA course "Multi-scale models and convolutional neural networks" (taught by Stéphane Mallat). Our work is summarized in Report.pdf.
|
We finished first on the private leaderboard among the 98 participating teams (https://challengedata.ens.fr/participants/challenges/33/), with a C-index of 77.47. Our best solution relies on LASSO-based feature selection over a combination of features (pyradiomics features, both the provided ones and additional ones, plus Laplacian of Gaussian and wavelet features), together with a 3D U-Net trained for binary semantic segmentation to recover corrupted masks before feature extraction.
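
For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the challenge's official scoring code: the function name `concordance_index`, the toy data, and the convention that a higher score means a shorter expected survival are all assumptions made for this example.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (value between 0 and 1).

    times       -- observed times (event or censoring)
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; ASSUMPTION: higher means shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): three patients, the last one censored.
times = [5, 10, 15]
events = [1, 1, 0]
perfect = concordance_index(times, events, [3.0, 2.0, 1.0])
inverted = concordance_index(times, events, [1.0, 2.0, 3.0])
```

A pair of patients is comparable only when the one with the shorter observed time actually had the event, which is how the metric copes with censoring.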
|
## Feature extraction
We extract more pyradiomics features (~130 instead of the ~50 provided), along with Laplacian of Gaussian and wavelet features: see the notebook Feature_extraction/Feature_Extraction_CT.ipynb.
|
## Feature Selection
We tried three feature selection procedures:
- Recursive Feature Elimination
- LASSO regression: see notebook lasso/FeatureSelectionLasso.ipynb
- Random survival forest: see notebook rsf/rsf_feature_selection.ipynb
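
To illustrate why LASSO can act as a feature selector, the sketch below fits an L1-penalised least-squares model by cyclic coordinate descent and keeps the features whose weight is non-zero. It is not the code of lasso/FeatureSelectionLasso.ipynb: the helper names, the toy data, and the use of a generic numeric target (rather than whatever survival target the notebook uses) are assumptions made for this example.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / 2n) * ||y - Xw||^2 + alpha * ||w||_1.

    X is a list of rows; features are assumed to be standardised beforehand.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, with w = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero feature, nothing to fit
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


# Toy data (made up): the target depends on feature 0 only.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
```

The L1 penalty sets the weight of the uninformative feature to exactly zero, so `selected` keeps only the informative one. Larger values of `alpha` discard more features.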
|
## Computer Vision approach
The notebook ScanCNN/ScanCNN.ipynb explores predicting survival time from the images alone with a deep learning approach, without relying on pyradiomics features. It provides:
- Visualization of the tumor in various forms: video, 2D slice, 3D shape
- Simplification to a 2D problem: extraction of the "most discriminative" 2D slice, then feature extraction with a pretrained ResNet-18 model
- Simplification to a binary classification between high-risk and low-risk patients
- End-to-end pipeline trained with the negative log partial likelihood, fine-tuning a ResNet-18 while learning a 3-layer perceptron or an attention-based model on top of it to predict patient risk
- Analysis of these approaches with Grad-CAM
- Binary semantic segmentation with a 3D U-Net, used to recover missing masks and for 3D feature extraction
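
As a reference for the loss mentioned in the list above, here is a minimal sketch of the Cox negative log partial likelihood, written in plain Python for readability. It is not taken from ScanCNN/ScanCNN.ipynb: the function name, the averaging over observed events, and the Breslow-style handling of tied times are assumptions, and a training loop would express the same computation with differentiable tensor operations.

```python
import math


def neg_log_partial_likelihood(times, events, risks):
    """Cox negative log partial likelihood, averaged over observed events.

    times  -- observed times (event or censoring)
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted log-risk per patient (higher means more at risk)
    """
    n = len(times)
    loss = 0.0
    n_events = 0
    for i in range(n):
        if not events[i]:
            continue  # censored patients only appear inside risk sets
        # Risk set: everyone still under observation at time t_i.
        at_risk = [risks[j] for j in range(n) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(r - m) for r in at_risk))
        loss -= risks[i] - log_sum
        n_events += 1
    if n_events == 0:
        raise ValueError("no observed events")
    return loss / n_events


# Toy data (made up): the first patient dies first.
times = [1.0, 2.0]
events = [1, 1]
good = neg_log_partial_likelihood(times, events, [2.0, 0.0])  # risk ordering matches
bad = neg_log_partial_likelihood(times, events, [0.0, 2.0])   # risk ordering inverted
```

The loss is lower when patients who die earlier receive higher predicted risks, so minimising it pushes the network towards a correct ranking of patients rather than towards exact survival times.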
|
One interesting extension of this work would be to use 3D models pretrained on additional data, such as https://github.com/Tencent/MedicalNet.