The data cover a short time frame, but they are still useful for hypothesis testing and statistical analysis. With more than 400 rows, this is a good dataset for beginners.
This dataset describes a group of breast cancer patients who had surgery to remove their tumour. It contains the following variables:
Patient_ID: unique identifier of a patient
Age: age at diagnosis (Years)
Gender: Male/Female
Protein1, Protein2, Protein3, Protein4: expression levels (undefined units)
Tumour_Stage: I, II, III
Histology: Infiltrating Ductal Carcinoma, Infiltrating Lobular Carcinoma, Mucinous Carcinoma
ER status: Positive/Negative
PR status: Positive/Negative
HER2 status: Positive/Negative
Surgery_type: Lumpectomy, Simple Mastectomy, Modified Radical Mastectomy, Other
Date_of_Surgery: Date on which surgery was performed (in DD-MON-YY)
Date_of_Last_Visit: Date of last visit (in DD-MON-YY) [can be null if the patient did not visit again after the surgery]
Patient_Status: Alive/Dead [can be null if the patient did not visit again after the surgery and there is no information on whether the patient is alive or dead]
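
Because the two date columns use the DD-MON-YY format and the last two columns can be null, they need a little care when loading. The snippet below is a minimal sketch using only the Python standard library. It assumes the file is a comma-separated CSV whose header uses the variable names listed above, and that month abbreviations are in English (for example "Jan"). The two sample rows and the helper names parse_date and load_rows are made up for illustration and are not part of the dataset.

```python
import csv
import io
from datetime import datetime

# Two invented rows for illustration only; they are NOT taken from the dataset.
# The second row shows a patient with no follow-up visit (null fields).
SAMPLE_CSV = """Patient_ID,Date_of_Surgery,Date_of_Last_Visit,Patient_Status
P001,15-Jan-17,19-Jun-17,Alive
P002,26-Apr-17,,
"""


def parse_date(text):
    """Parse a DD-MON-YY string; return None for an empty (null) field."""
    text = (text or "").strip()
    if not text:
        return None
    # %d = day, %b = abbreviated month name (locale-dependent), %y = 2-digit year
    return datetime.strptime(text, "%d-%b-%y").date()


def load_rows(handle):
    """Read patient rows and derive follow-up time in days where possible."""
    rows = []
    for record in csv.DictReader(handle):
        surgery = parse_date(record["Date_of_Surgery"])
        last_visit = parse_date(record["Date_of_Last_Visit"])
        status = (record["Patient_Status"] or "").strip() or None
        # Follow-up is only defined when both dates are present.
        follow_up = (last_visit - surgery).days if surgery and last_visit else None
        rows.append({
            "Patient_ID": record["Patient_ID"],
            "Date_of_Surgery": surgery,
            "Date_of_Last_Visit": last_visit,
            "Patient_Status": status,
            "follow_up_days": follow_up,
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["Patient_ID"], row["Patient_Status"], row["follow_up_days"])
```

To read the real file instead of the sample string, pass an open file handle to load_rows. Rows with a null Patient_Status are kept here rather than dropped, so that the decision on how to treat patients lost to follow-up stays explicit in any later analysis.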