# $$Multi-Class \space Prediction \space of \space Obesity \space Risk$$

<div id="header" align="center">
<img src="https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExZ3RlY3NqZ2tqbjh1ZnQ4dWx3d3BldmluYXMzamlvYzUxbXNxcWltZSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/wdFph4zhLiBPi/giphy.gif" width="600"/>
</div>
<!--  -->

## Project Description

This repository contains the code and materials for our Machine Learning project on "Multi-Class Prediction of Obesity Risk".
In this project, we focused on predicting the risk of obesity using a multi-class classification approach. Our work covered exploratory data analysis (EDA), feature engineering, and training a diverse set of machine learning models, from which we chose the best-performing one for our pipeline.

## Key Components

- **Exploratory Data Analysis (EDA):** We examined the dataset to understand its underlying patterns and distributions.
- **Feature Engineering:** We engineered relevant features to enhance the predictive power of our models.
- **Modeling:** We implemented multiple machine learning models, including Logistic Regression, Decision Tree, Random Forest, SVC, KNN Classifier, XGBoost, LightGBM, CatBoost, and AdaBoost, and chose the best-performing one.
- **Pipeline:** We used a pipeline to streamline our machine learning workflow and ensure reproducibility.
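
The modeling and pipeline steps above can be sketched roughly as follows. This is a minimal illustration, not the code from our notebooks: the data is a synthetic stand-in generated with `make_classification`, only four of the listed models are included, and the `StandardScaler` step, the 5-fold stratified cross-validation, the accuracy metric, and the `evaluate` helper are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class data (assumption: stands in for the real dataset
# in Data/; the number of samples, features, and classes is arbitrary).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=42,
)

# A subset of the candidate models named in this README.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Same folds for every model so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)


def evaluate(models, X, y, cv):
    """Return the mean cross-validated accuracy of each model (hypothetical helper)."""
    scores = {}
    for name, model in models.items():
        # Scaling lives inside the pipeline, so it is re-fit on each training fold.
        pipeline = Pipeline([("scale", StandardScaler()), ("model", model)])
        fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = evaluate(candidates, X, y, cv)
best_name = max(scores, key=scores.get)

# Refit the selected model on all available data.
best_pipeline = Pipeline(
    [("scale", StandardScaler()), ("model", candidates[best_name])]
)
best_pipeline.fit(X, y)

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("selected:", best_name)
```

Wrapping preprocessing and the estimator in one `Pipeline` keeps the scaler from seeing validation folds during cross-validation, which is one reason a pipeline helps reproducibility.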

## Repository Structure

- `Data/`: Contains the **dataset** used in the project, the **submission file**, and the **presentation slides** summarizing our project findings.
- `Preprocessing/`: Unfinished Jupyter notebooks containing the code for EDA, feature engineering, and modeling, plus a picture of our submission to the Kaggle competition.
- `Multi_Class Prediction of Obesity Risk`: Jupyter notebooks containing the code for EDA, feature engineering, and modeling.
- `README.md`: You are here! It provides an overview of the project and instructions for replicating our work.

## Getting Started

To replicate our project, follow these steps:

1. Clone this repository to your local machine.
2. Navigate to the `Multi_Class Prediction of Obesity Risk` Jupyter notebook.
3. Open the notebook in Jupyter and execute the code cells sequentially.
4. Refer to the presentation slides in the `Data/` directory for a summary of our findings.

## Additional Resources

- Kaggle: [Kaggle Competition](https://www.kaggle.com/competitions/playground-series-s4e2)

## Authors

This project was created by:
1. [Amina Mohamed](https://github.com/am231am)
2. [Ashraf Mahmoud](https://github.com/AshrafMah)
3. [Nagham Ehab](https://github.com/Naghamehab5)
4. [Shorouq Hossam](https://github.com/ShorouqHossamMohammed)