This dataset contains symptoms of various skin diseases. For each disease, some rows contain a direct description of the symptoms, and other rows contain text written the way people describe their symptoms to a doctor. It is a dataset for a binary text classification problem.
**Just change the file path.**
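import pandas as pd

# Try each encoding in turn until the file decodes without error.
# Note: 'Latin-1' and 'ISO-8859-1' are aliases for the same encoding.
encodings_to_try = ['utf-8', 'Latin-1', 'ISO-8859-1']
for encoding in encodings_to_try:
    try:
        df = pd.read_csv('F:/Skin text classifier.csv', encoding=encoding)
        print("File read successfully with encoding:", encoding)
        break
    except UnicodeDecodeError:
        pass  # this encoding failed; try the next one
df.head()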
The diseases are:
1. 'Vitiligo', 'Scabies'
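Since there are only two diseases, a classifier needs them as numeric labels. The sketch below shows one possible way to do that mapping. The column names `Text` and `Disease name`, the two sample rows, and the 0/1 assignment are all assumptions made up for illustration; they are not taken from the actual CSV, so adjust them to match the real columns in `df`.

```python
import pandas as pd

# Hypothetical stand-in for the real CSV: the column names "Text" and
# "Disease name" and both sample rows are invented for illustration.
sample = pd.DataFrame({
    "Text": [
        "White patches of skin that have lost their color",
        "I have intense itching that gets worse at night",
    ],
    "Disease name": ["Vitiligo", "Scabies"],
})

# Map the two disease names to integer class labels (arbitrary 0/1 choice).
label_map = {"Vitiligo": 0, "Scabies": 1}
sample["label"] = sample["Disease name"].map(label_map)

# Any disease name missing from label_map would become NaN, so count those.
unmapped = int(sample["label"].isna().sum())
print(sample[["Disease name", "label"]])
print("Unmapped rows:", unmapped)
```

Checking for unmapped rows is a cheap way to catch spelling variants or stray whitespace in the label column before training.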