About Dataset

This dataset contains symptom descriptions for various skin diseases. For each disease, some rows contain a direct description of the symptoms, and other rows contain the kind of text people use when describing their symptoms to a doctor. This is a binary text classification problem dataset.

To read the file, use the following code:

**Just change the file path.**

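import pandas as pd

# Try UTF-8 first, then fall back to Latin-1.
# ('ISO-8859-1' is another name for Latin-1, so listing both is redundant.)
encodings_to_try = ['utf-8', 'latin-1']
for encoding in encodings_to_try:
    try:
        df = pd.read_csv('F:/Skin text classifier.csv', encoding=encoding)
        print("File read successfully with encoding:", encoding)
        break
    except UnicodeDecodeError:
        # This encoding could not decode the file; try the next one.
        pass

# Preview the first rows.
df.head()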
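
The same fallback idea can be wrapped in a small reusable helper. This is only a sketch: the function name `detect_encoding` and its default candidate list are assumptions for illustration, not part of the dataset or of pandas. It checks the raw bytes with the standard library, so it fails loudly if no candidate fits instead of leaving the result undefined.

```python
from pathlib import Path


def detect_encoding(path, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper: the name and the default candidates are assumptions.
    """
    raw = Path(path).read_bytes()
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            # This candidate cannot decode the bytes; try the next one.
            continue
        return encoding
    raise ValueError(f"none of {candidates} could decode {path}")
```

The returned name can then be passed as the `encoding=` argument of `pd.read_csv`. Note that Latin-1 accepts any byte sequence, so the `ValueError` is only reachable when the candidate list contains stricter encodings only.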

The diseases are:

  1. 'Vitiligo'
  2. 'Scabies'
  3. 'Hives (Urticaria)'
  4. 'Folliculitis'
  5. 'Eczema'
  6. 'Ringworm (Tinea Corporis)'
  7. "Athlete's Foot (Tinea Pedis)"
  8. 'Rosacea'
  9. 'Psoriasis'
  10. 'Shingles (Herpes Zoster)'
  11. 'Impetigo'
  12. 'Contact Dermatitis'
  13. 'Acne'
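
Most classifiers expect integer class ids rather than disease names. The sketch below builds such a mapping from the names listed above. It assumes the label strings in the CSV match these names exactly apart from surrounding whitespace, which has not been checked against the file. The names `DISEASES`, `LABEL_TO_ID`, `ID_TO_LABEL`, and `encode_label` are hypothetical.

```python
# Disease names copied from the list above, in the order they appear.
DISEASES = [
    "Vitiligo",
    "Scabies",
    "Hives (Urticaria)",
    "Folliculitis",
    "Eczema",
    "Ringworm (Tinea Corporis)",
    "Athlete's Foot (Tinea Pedis)",
    "Rosacea",
    "Psoriasis",
    "Shingles (Herpes Zoster)",
    "Impetigo",
    "Contact Dermatitis",
    "Acne",
]

# Map each name to an integer id and back again.
LABEL_TO_ID = {name: index for index, name in enumerate(DISEASES)}
ID_TO_LABEL = {index: name for name, index in LABEL_TO_ID.items()}


def encode_label(name):
    """Return the integer id for a disease name, ignoring surrounding spaces.

    Raises KeyError for a name that is not in DISEASES (assumed exact match).
    """
    return LABEL_TO_ID[name.strip()]
```

A `KeyError` from `encode_label` is a useful signal that the labels in the file are spelled differently from the list on this card.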