Diff of /README.md [000000] .. [204e02]

Switch to unified view

a b/README.md
1
# ML-Project-Cancer-Prediction
2
My Kaggle Project(Based on Medical Treatment)
3
Kaggle Problem link: 
4
Some libraries/Subpackages used are: https://www.kaggle.com/c/msk-redefining-cancer-treatment
5
1.nltk
6
2.sklearn.calibration
7
3.sklearn.naive_bayes
8
4.mlxtend.classifier
9
5.sklearn.linear_model
10
6.seaborn
11
7.sklearn.metrics 
12
13
Algorithm applied:
14
1.Naive Bayes
15
2.Random Forest(using oneHotEncoding)
16
3.Random Forest(using ResponseEncoding)
17
4.Logistic Regression
18
4.Linear Support Vector Machine(Linear SVM)
19
5.K Nearest Neighbours
20
6.Stacking Model
21
7.Maximum Voting classifier
22
23
Evaluation is done on basis of multi log-loss
24
1.Log-loss for Naive Bayes Model is: 1.2174351082980228
25
2.Log-loss for Logistic Regression is: 1.0139465030649317
26
3.Log-loss for KNN is : 0.9686092822627863
27
4.Log-loss for LinearSVM is: 1.0518829636631724
28
5.Log-loss for Random Forest Classifier is : 1.1440820641479814(using one hot encoding) and 1.220569827205813(using Response Encoding)
29
6.Log-loss for Stacking Model is:[training_set:0.497983218669304,test_set:1.1751619600947567,cross_validation_set:1.09245098514981767
30
7.Log-loss for Maximum Voting Classifier is:[training_set: 0.8677287779975493,test_set:1.2148355813599823,cross_validation_set:1.142669504]
31
32
As per the Evaluation Logistic Regression is the best Model to be fitted in this problem .