This README.md file was generated on 2021-08-23 by Péter Gargya

GENERAL INFORMATION

1. Title of Dataset: Dataset used in the article:
"Histological Grade of Endometrioid Endometrial Cancer and Relapse Risk Can Be Predicted With Machine Learning From Gene Expression Data"

2. Author Information
A. Principal Investigator Contact Information
Name: Dr. Bálint László Bálint
Institution: Genomic Medicine and Bioinformatics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen
Address: Egyetem tér 1, 4032 Debrecen, Hungary
Email: lbalint@med.unideb.hu
B. Associate or Co-investigator Contact Information
Name: Péter Gargya
Institution: Genomic Medicine and Bioinformatics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen
Address: Egyetem tér 1, 4032 Debrecen, Hungary
Email: gargya.peter@gmail.com


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: The data and code provided are free to use; however, we kindly ask users to cite our article as given below.

2. Links to publications that cite or use the data:
https://www.mdpi.com/2072-6694/13/17/4348/htm

3. Recommended citation for this dataset:
Gargya, P.; Bálint, B.L. Histological Grade of Endometrioid Endometrial Cancer and Relapse Risk Can Be Predicted with Machine Learning from Gene Expression Data. Cancers 2021, 13, 4348. https://doi.org/10.3390/cancers13174348

DATA & FILE OVERVIEW

1. File List:
- part1_R_prepare_data.Rmd: Code used to prepare the data before applying machine learning.
- part2_Python_ML.py: Code used to create our machine learning model.
- part3_R_survival_analysis.Rmd: Survival analysis comparing the low-risk and high-risk G2 subgroups defined by our model.
- part4_extra_requests_by_reviewers.Rmd: Code used to analyse the distribution of TCGA subgroups within our risk-specific subgroups.
- ucec_tcga_clinical_data.zip: Raw clinical data, downloaded from cBioPortal.
- uterus_rnaseq_VST1.z01, uterus_rnaseq_VST1.z02, uterus_rnaseq_VST1.zip, uterus_rnaseq_VST_G2.zip: These zip files contain the output of part1_R_prepare_data.Rmd and the input of part2_Python_ML.py.
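
Before running the scripts, it may help to confirm that every file in the list above was downloaded. The following is a minimal sketch, not part of the published code: the helper names `missing_files` and `list_archive_members` and the download directory are hypothetical. It assumes that ucec_tcga_clinical_data.zip and uterus_rnaseq_VST_G2.zip are ordinary single-part archives, and that the .z01/.z02 naming of uterus_rnaseq_VST1 indicates a split archive. Python's `zipfile` module does not handle multi-disk ZIP files, so the split set would likely need an external tool such as 7-Zip.

```python
import zipfile
from pathlib import Path

# File names copied from the File List above.
EXPECTED_FILES = [
    "part1_R_prepare_data.Rmd",
    "part2_Python_ML.py",
    "part3_R_survival_analysis.Rmd",
    "part4_extra_requests_by_reviewers.Rmd",
    "ucec_tcga_clinical_data.zip",
    "uterus_rnaseq_VST1.z01",
    "uterus_rnaseq_VST1.z02",
    "uterus_rnaseq_VST1.zip",
    "uterus_rnaseq_VST_G2.zip",
]

# Assumption: these two archives are single-part zips that zipfile can open.
SINGLE_PART_ARCHIVES = ["ucec_tcga_clinical_data.zip", "uterus_rnaseq_VST_G2.zip"]


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


def list_archive_members(archive_path):
    """Return the member names stored in a single-part zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    data_dir = Path(".")  # hypothetical download location
    for name in missing_files(data_dir):
        print(f"missing: {name}")
    for name in SINGLE_PART_ARCHIVES:
        path = data_dir / name
        if path.is_file():
            print(name, "->", list_archive_members(path))
```

Listing the archive members first, rather than extracting blindly, shows which file names part2_Python_ML.py can expect without assuming anything about the archive contents.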