# Patient Risk Profiling using Machine Learning

## Overview
This repository contains a Jupyter Notebook that implements three machine learning models for building patient risk profiles from healthcare and clinical datasets. It is a sample implementation only. The models are:

1. **Logistic Regression** - A simple baseline model for binary classification.
2. **Random Forest** - An ensemble-based model for improved performance.
3. **XGBoost** - A gradient boosting model optimized for structured data.

## Dataset
The notebook expects a healthcare dataset in CSV format. The dataset should include a `Risk` column as the target variable (0: Low Risk, 1: High Risk) and a `PatientID` column, which is dropped during processing. All other numerical features are used to train the models.
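
The preparation described above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the helper name `load_features_and_target`, the choice to raise an error when a required column is missing, and the sample columns `Age`, `BMI`, and `Notes` (with made-up values) are all assumptions for illustration.

```python
import io

import pandas as pd


def load_features_and_target(csv_source, target_col="Risk", id_col="PatientID"):
    """Hypothetical helper: split a patient CSV into numeric features and target."""
    df = pd.read_csv(csv_source)

    # Assumption: fail early if a column the README requires is absent.
    missing = {target_col, id_col} - set(df.columns)
    if missing:
        raise ValueError(f"Missing required column(s): {sorted(missing)}")

    y = df[target_col].astype(int)  # 0: Low Risk, 1: High Risk
    # Drop the identifier and the target, then keep only numeric columns.
    X = df.drop(columns=[target_col, id_col]).select_dtypes(include="number")
    return X, y


# Made-up sample rows standing in for a real healthcare_data.csv.
sample_csv = io.StringIO(
    "PatientID,Age,BMI,Notes,Risk\n"
    "P001,54,31.2,smoker,1\n"
    "P002,37,22.8,none,0\n"
    "P003,68,27.5,diabetic,1\n"
)
X, y = load_features_and_target(sample_csv)
print(list(X.columns))  # ['Age', 'BMI']
print(y.tolist())       # [1, 0, 1]
```

In this sketch the non-numeric `Notes` column is silently excluded; encoding such columns instead would be a separate design decision.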

## Prerequisites
Ensure you have the following dependencies installed before running the notebook:

```bash
pip install pandas numpy scikit-learn xgboost
```

## Usage
1. Clone the repository:

```bash
git clone https://github.com/rkumar1010/patient-risk-profiling.git
cd patient-risk-profiling
```

2. Place your dataset in the project directory. The notebook refers to `healthcare_data.csv`; update that filename in the notebook if yours differs.

3. Run the Jupyter Notebook:

```bash
jupyter notebook patient_risk_models.ipynb
```

4. The script will:
- Load and preprocess the dataset.
- Train and evaluate three different machine learning models.
- Print performance metrics including accuracy and classification reports.
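
The train-and-evaluate steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the notebook's code: the function names, the 80/20 stratified split, and the hyperparameters are hypothetical, and synthetic data from `make_classification` stands in for a real patient dataset. XGBoost is treated as optional here so the sketch still runs if the package is not installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

try:  # Assumption: skip XGBoost gracefully when it is not installed.
    from xgboost import XGBClassifier
except ImportError:
    XGBClassifier = None


def build_models(random_state=42):
    """Hypothetical helper returning the models named in the Overview."""
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }
    if XGBClassifier is not None:
        models["XGBoost"] = XGBClassifier(
            n_estimators=100, random_state=random_state
        )
    return models


def train_and_evaluate(X, y, test_size=0.2, random_state=42):
    """Fit each model on a train split and score it on a held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
    results = {}
    for name, model in build_models(random_state).items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "report": classification_report(y_test, y_pred, zero_division=0),
        }
    return results


# Synthetic stand-in for the real features and Risk labels.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
results = train_and_evaluate(X, y)
for name, res in results.items():
    print(f"{name}: accuracy={res['accuracy']:.3f}")
    print(res["report"])
```

Stratifying the split keeps the Low Risk / High Risk ratio similar in the train and test sets, which matters when one class is much rarer than the other.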

## Model Performance
The notebook compares the models on accuracy and on the per-class metrics in each classification report. The best-performing model can then be selected for deployment.
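
Selecting the best model by accuracy could be done with a small helper like the one below. The function name `select_best_model` is hypothetical, and the scores are placeholder values chosen for illustration, not results produced by the notebook.

```python
def select_best_model(accuracies):
    """Return the (name, accuracy) pair with the highest accuracy."""
    if not accuracies:
        raise ValueError("No model scores provided")
    return max(accuracies.items(), key=lambda item: item[1])


# Placeholder scores for illustration only; not measured results.
scores = {
    "Logistic Regression": 0.80,
    "Random Forest": 0.85,
    "XGBoost": 0.83,
}
best_name, best_accuracy = select_best_model(scores)
print(f"Best model: {best_name} (accuracy={best_accuracy:.2f})")
```

Accuracy alone can be misleading when High Risk patients are rare, so it may be worth checking recall for the High Risk class in the classification report before settling on a model.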

## Contributing
Feel free to fork this repository and submit pull requests for improvements, additional models, or dataset enhancements.

## License
This project is licensed under the MIT License.

---

For any questions or suggestions, please open an issue in the repository or contact the maintainers.