# Lung Cancer Prediction ML Project

A Machine Learning project leveraging synthetic patient data to predict lung cancer risk.
<br> <br>
*Authors: [Krzysztof Adamczyk](https://github.com/KAdamczykk), [Paweł Florek](https://github.com/FlorekPawel), [Julia Kruk](https://github.com/krukj) & [Nadia Serafin](https://github.com/nadias03)*

## Data source
[Synthea dataset | Harvard Dataverse](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GD5XWE)

## Used technologies
- programming language: **Python**
- used technologies and libraries:
  - numpy
  - pandas
  - seaborn, matplotlib
  - sklearn

## Project structure
```
ML-Synthea-lung-cancer-prediction/
├── data/                   # Raw data
├── notebooks/              # Jupyter Notebooks
├── validation_data/        # Data for validation
└── README.md               # This file
```
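
The layout above can be checked from a notebook before any data is loaded. The snippet below is only a minimal sketch and is not part of the project: the helper `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical names, and it assumes the three directories from the tree sit directly under the repository root.

```python
from pathlib import Path

# Directory names taken from the project tree; the constant name is hypothetical.
EXPECTED_DIRS = ("data", "notebooks", "validation_data")


def missing_dirs(repo_root):
    """Return the expected directories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Called from a notebook inside `notebooks/`, `missing_dirs("..")` should return an empty list when the layout is intact.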