Data: De-identified, Tabular
Specialty: Internal Medicine, Neurology, Public Health
EHR: Diagnoses, Vital Signs
Demographics: 18-64
Clinical Purpose: Screening
Task: Classification, Disease Progression Modeling, Risk Stratification
License: Other

Stroke Prediction Dataset

11 clinical features for predicting stroke events

Context

According to the World Health Organization (WHO), stroke is the second leading cause of death globally, accounting for approximately 11% of total deaths.

This dataset is designed to help predict whether a patient is likely to have a stroke, based on input parameters such as gender, age, comorbid conditions, and smoking status. Each row describes one patient and can serve as a single example for a classification model, with the stroke column as the label.

Attribute Information

  1. id: Unique identifier
  2. gender: "Male", "Female", or "Other"
  3. age: Age of the patient
  4. hypertension: 0 = No hypertension, 1 = Has hypertension
  5. heart_disease: 0 = No heart disease, 1 = Has heart disease
  6. ever_married: "No" or "Yes"
  7. work_type: "children", "Govt_job", "Never_worked", "Private", or "Self-employed"
  8. Residence_type: "Rural" or "Urban"
  9. avg_glucose_level: Average blood glucose level
  10. bmi: Body Mass Index
  11. smoking_status: "formerly smoked", "never smoked", "smokes", or "Unknown"
  12. stroke: 1 = Had a stroke, 0 = Did not have a stroke

Note: "Unknown" in smoking_status indicates missing or unavailable information for the patient.
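
The attribute list above can be turned into a small validation step before modeling. The following is a minimal sketch, not part of the dataset itself: it assumes the data arrives as CSV text with the column names listed above, and the names parse_row, CATEGORIES, and SAMPLE_CSV are hypothetical. The two sample rows are made up for illustration and are not taken from the dataset. Treating "Unknown" as None is one possible way to represent the missing smoking information described in the note.

```python
import csv
import io

# Allowed values per categorical column, copied from the attribute list.
CATEGORIES = {
    "gender": {"Male", "Female", "Other"},
    "ever_married": {"No", "Yes"},
    "work_type": {"children", "Govt_job", "Never_worked", "Private", "Self-employed"},
    "Residence_type": {"Rural", "Urban"},
    "smoking_status": {"formerly smoked", "never smoked", "smokes", "Unknown"},
}
BINARY_COLUMNS = ("hypertension", "heart_disease", "stroke")
NUMERIC_COLUMNS = ("age", "avg_glucose_level", "bmi")


def parse_row(raw):
    """Validate one CSV row (a dict of strings) and convert it to typed values."""
    row = {"id": raw["id"]}
    for col, allowed in CATEGORIES.items():
        if raw[col] not in allowed:
            raise ValueError(f"unexpected {col} value: {raw[col]!r}")
        row[col] = raw[col]
    for col in BINARY_COLUMNS:
        if raw[col] not in ("0", "1"):
            raise ValueError(f"{col} must be 0 or 1, got {raw[col]!r}")
        row[col] = int(raw[col])
    for col in NUMERIC_COLUMNS:
        row[col] = float(raw[col])
    # "Unknown" marks missing smoking information, so store it as None
    # (an assumption of this sketch; other encodings are equally valid).
    if row["smoking_status"] == "Unknown":
        row["smoking_status"] = None
    return row


# Made-up rows for illustration only; NOT real records from the dataset.
SAMPLE_CSV = """id,gender,age,hypertension,heart_disease,ever_married,work_type,Residence_type,avg_glucose_level,bmi,smoking_status,stroke
1,Female,52,0,0,Yes,Private,Urban,98.5,27.1,never smoked,0
2,Male,61,1,0,Yes,Self-employed,Rural,143.2,31.4,Unknown,1
"""

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
labels = [r["stroke"] for r in rows]  # classification target
```

Keeping "Unknown" separate from the three known smoking categories lets a later modeling step decide whether to impute it, drop those rows, or encode it as its own category.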

Acknowledgements

(Confidential Source) — This dataset is intended for educational purposes only.
If you use this dataset in your research or projects, please credit the author.