The "Bangladesh Smokers Survey" repository contains a data analysis and predictive modeling project that examines how smoking habits relate to health issues across different districts of Bangladesh. It analyzes data collected from both smokers and non-smokers, covering health symptoms, smoking status, and demographic details.
The project applies several machine learning techniques: the classifiers RandomForestClassifier, GaussianNB, LogisticRegression, and DecisionTreeClassifier, together with KMeans clustering. Principal Component Analysis (PCA) is used to reduce the data to fewer dimensions for visualization.
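As a rough illustration of how these techniques fit together, the sketch below trains the four classifiers on a synthetic stand-in dataset and then projects the features to two dimensions with PCA before clustering them with KMeans. It assumes scikit-learn is installed; the synthetic data, the hyperparameters, and the variable names are assumptions made for this example and are not taken from the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded survey features (NOT the real dataset).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The four classifiers named above; hyperparameters are illustrative guesses.
models = {
    "RandomForestClassifier": RandomForestClassifier(n_estimators=100, random_state=0),
    "GaussianNB": GaussianNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Project to two principal components for plotting, then cluster the projection.
X_2d = PCA(n_components=2).fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_2d)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

The two columns of `X_2d` can be passed to any plotting library as x and y coordinates, with `clusters` or `y` used for coloring.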
The survey data cover age, gender, profession, and health symptoms such as cough, chest pain, and wheezing, with the aim of giving insight into smoking trends and associated health issues in Bangladesh.
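Survey fields like these usually need to be encoded numerically before they can be passed to a model. The sketch below shows one possible approach using pandas. The column names, the yes/no answer format, and the four rows are all made up for illustration; the real dataset's schema may differ.

```python
import pandas as pd

# Hypothetical rows in an assumed schema (NOT real survey responses).
df = pd.DataFrame({
    "age": [24, 41, 35, 58],
    "gender": ["male", "female", "male", "male"],
    "profession": ["student", "teacher", "driver", "farmer"],
    "cough": ["yes", "no", "yes", "yes"],
    "chest_pain": ["no", "no", "yes", "no"],
    "wheezing": ["no", "no", "no", "yes"],
    "smoker": ["yes", "no", "yes", "yes"],
})

encoded = df.copy()

# Map yes/no answers to 1/0.
for col in ["cough", "chest_pain", "wheezing", "smoker"]:
    encoded[col] = (encoded[col] == "yes").astype(int)

# One-hot encode the multi-valued categorical columns.
encoded = pd.get_dummies(encoded, columns=["gender", "profession"], dtype=int)

# Separate features from the target (here, smoking status).
X = encoded.drop(columns="smoker")
y = encoded["smoker"]

print(X.head())
```

One-hot encoding avoids implying an order among professions, at the cost of one extra column per category.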
The notebook reports findings from the survey data: the prevalence of smoking habits, their correlation with health issues, and how well the different predictive models capture these trends.
This repository welcomes contributions. Feel free to suggest enhancements, report issues, or submit pull requests.
This project is licensed under the MIT License - see the LICENSE.md file for details.