During the initial peak of the Mpox outbreak, no reliable, publicly available dataset existed for Mpox detection. The rapid escalation of cases, with spread to Europe and America as highlighted by the World Health Organization and the emerging possibility of cases in Asian countries, underscored the urgency of computer-assisted detection as a critical tool. As the possibility of an Mpox outbreak loomed over densely populated countries like Bangladesh, our limited resources put rapid diagnosis out of reach, and the need for computer-assisted detection methods became apparent.
Developing computer-assisted methods demands a large and diverse body of data, including Mpox skin lesion images from individuals of different sexes, ethnicities, and skin tones. However, the scarcity of available data posed a considerable obstacle. In response, our research group took the initiative to develop one of the earliest datasets specifically tailored for Mpox, the Mpox Skin Lesion Dataset (MSLD), which covers several classes including non-Mpox samples.
From June 2022 to May 2023, the Mpox Skin Lesion Dataset (MSLD) has undergone two iterations, resulting in the current version, MSLD v2.0. The previous version included two classes: "Mpox" and "Others" (non-Mpox), with the "Others" class comprising skin lesion images of chickenpox and measles, chosen for their similarity to Mpox. Building upon the limitations identified in the initial release, we have developed an enhanced and more comprehensive version, MSLD v2.0. This updated dataset encompasses a wider range of classes and provides a more diverse set of images suitable for multi-class classification.
MSLD v2.0 comprises images from six distinct classes, namely Mpox (284 images), Chickenpox (75 images), Measles (55 images), Cowpox (66 images), Hand-foot-mouth disease or HFMD (161 images), and Healthy (114 images). The dataset includes 755 original skin lesion images sourced from 541 distinct patients, ensuring a representative sample. Importantly, the latest version has received endorsement from professional dermatologists and obtained approval from appropriate regulatory authorities.
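The class counts above are imbalanced (Mpox has roughly five times as many images as Measles), which matters when training a classifier. The following minimal Python sketch checks that the stated counts add up to 755 and derives inverse-frequency class weights. The weighting scheme is a common heuristic chosen here for illustration, not something prescribed by the dataset, and the names `CLASS_COUNTS` and `class_weights` are hypothetical.

```python
# Class counts as stated in the MSLD v2.0 description.
CLASS_COUNTS = {
    "Mpox": 284,
    "Chickenpox": 75,
    "Measles": 55,
    "Cowpox": 66,
    "HFMD": 161,
    "Healthy": 114,
}


def class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count).

    This is an illustrative heuristic, not the authors' training recipe.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


total_images = sum(CLASS_COUNTS.values())
weights = class_weights(CLASS_COUNTS)

print(total_images)  # 755
for name, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{name:<12}{weight:.3f}")
```

Under this scheme the majority class (Mpox) receives a weight below 1 and the smallest class (Measles) receives the largest weight.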
The dataset is organized into two folders:
Original Images: This folder includes a subfolder named "FOLDS" containing five folds (fold1-fold5) for 5-fold cross-validation with the original images. Each fold has separate folders for the test, train, and validation sets.
Augmented Images: To enhance the classification task, various data augmentation techniques, such as rotation, translation, reflection, shear, hue, saturation, contrast, brightness jitter, noise, and scaling, were applied using MATLAB R2020a. To ensure result reproducibility, the augmented images are provided in this folder. It contains a subfolder called "FOLDS_AUG" with augmented images of the train sets from each fold in the "FOLDS" subfolder of the "Original Images". The augmentation process resulted in an approximate 14-fold increase in the number of images.
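To get a quick overview of either folder after download, the images in each fold and split can be counted. The Python sketch below walks a "FOLDS" or "FOLDS_AUG" directory and tallies image files by their first two path components (fold folder, then split folder). The exact names of the split folders, any class subfolders beneath them, and the image file extensions are assumptions, so the function does not hard-code them; `count_images` and `IMAGE_SUFFIXES` are hypothetical names.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; adjust to match the downloaded files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(folds_root):
    """Count image files under folds_root, keyed by (fold, split) folder names."""
    root = Path(folds_root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            parts = path.relative_to(root).parts
            # Expect at least: fold folder / split folder / file name.
            if len(parts) >= 3:
                counts[(parts[0], parts[1])] += 1
    return counts
```

For example, `count_images("Original Images/FOLDS")` would return one entry per fold and split. Comparing the train counts from "FOLDS" with those from "FOLDS_AUG" gives a rough check on the stated 14-fold increase.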
Each image is named using the format DiseaseCode_PatientNumber_ImageNumber. The codes assigned to the six classes are: Mpox -> MKP; Chickenpox -> CHP; Cowpox -> CWP; Measles -> MSL; Hand, foot and mouth disease -> HFMD; Healthy -> HEALTHY. The assignment of these codes is illustrated in the provided image "Keywords.jpg".
For instance, an image named "MKP_17_01" indicates that it belongs to the Mpox class and is the first image captured from a patient with the ID 17.
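The naming convention can be decoded programmatically. The Python sketch below splits a name such as "MKP_17_01" into its class, patient number, and image number, using the code table given above. It assumes original (non-augmented) image names with exactly three underscore-separated fields and an optional file extension; augmented files may carry extra suffixes that this function does not handle. The names `CODE_TO_CLASS`, `ImageInfo`, and `parse_image_name` are hypothetical.

```python
from pathlib import PurePath
from typing import NamedTuple

# Disease codes as listed in the dataset description.
CODE_TO_CLASS = {
    "MKP": "Mpox",
    "CHP": "Chickenpox",
    "CWP": "Cowpox",
    "MSL": "Measles",
    "HFMD": "HFMD",
    "HEALTHY": "Healthy",
}


class ImageInfo(NamedTuple):
    disease: str
    patient_id: int
    image_number: int


def parse_image_name(name: str) -> ImageInfo:
    """Decode DiseaseCode_PatientNumber_ImageNumber (extension optional)."""
    stem = PurePath(name).stem  # drops a trailing ".jpg" or similar, if any
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected image name format: {name!r}")
    code, patient, image = parts
    if code not in CODE_TO_CLASS:
        raise ValueError(f"unknown disease code: {code!r}")
    return ImageInfo(CODE_TO_CLASS[code], int(patient), int(image))


print(parse_image_name("MKP_17_01"))
# ImageInfo(disease='Mpox', patient_id=17, image_number=1)
```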
The dataset includes an Excel file named "datalog.xlsx" consisting of 5 sheets (Sheet1-5), with each sheet corresponding to a specific fold (fold1-5). Each sheet contains three columns: train, validation, and test. These columns contain the names of the images belonging to the respective train, validation, and test sets for a particular fold.
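The split lists in "datalog.xlsx" can be loaded and sanity-checked in a few lines. The Python sketch below reads every sheet with pandas and reports any patient whose images appear in more than one column of the same fold. It assumes that pandas and openpyxl are installed, that a patient is identified by the pair (disease code, patient number), and that image names follow the three-field convention described above. Whether the folds were split patient-wise is not stated here, so a non-empty result is something to investigate rather than necessarily an error. The function names are hypothetical.

```python
def load_fold_splits(path="datalog.xlsx"):
    """Read every sheet into {sheet_name: {column_name: [image names]}}."""
    import pandas as pd  # reading .xlsx files also requires openpyxl

    sheets = pd.read_excel(path, sheet_name=None)  # dict of DataFrames
    return {
        sheet: {
            # Columns differ in length, so drop the padding NaN cells.
            col: df[col].dropna().astype(str).tolist()
            for col in df.columns
        }
        for sheet, df in sheets.items()
    }


def patient_key(image_name):
    """Return (disease code, patient number) from an image name."""
    stem = image_name.rsplit(".", 1)[0]  # tolerate a file extension
    code, patient, _image = stem.split("_")[:3]
    return code, patient


def shared_patients(split):
    """Find patients that appear in more than one column of a single fold."""
    seen = {}
    for column, names in split.items():
        for name in names:
            seen.setdefault(patient_key(name), set()).add(column)
    return {key: cols for key, cols in seen.items() if len(cols) > 1}
```

A typical use would be to call `load_fold_splits()` once and then run `shared_patients` on each sheet's dictionary.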
Since we intend to build an end-to-end solution, starting with dataset creation and ending with a live web app, a prototype of the web app has already been developed using the open-source Python Streamlit framework with a Flask core. It is hosted on the Streamlit-provided server for a better user experience. In the app, Skin Lesion Detector, users receive not only a suggestion but also the accuracy of that suggestion.
The code required to build and train the model, all the JavaScript, CSS and HTML files, and the trained model will be made open-source soon. Future updates to the app will add the ability to store user data and use it to train the model in real time.
Learn more at our GitHub repo!
If this dataset helped your research, please cite the following articles:
Ali, S. N., Ahmed, M. T., Jahan, T., Paul, J., Sani, S. M. Sakeef, Noor, N., Asma, A. N., & Hasan, T. (2024). A Web-based Mpox Skin Lesion Detection System Using State-of-the-art Deep Learning Models Considering Racial Diversity. Biomedical Signal Processing and Control, 98, 106742.
@article{Nafisa2024,
title={A Web-based Mpox Skin Lesion Detection System Using State-of-the-art Deep Learning Models Considering Racial Diversity},
author={Ali, Shams Nafisa and Ahmed, Md. Tazuddin and Jahan, Tasnim and Paul, Joydip and Sani, S. M. Sakeef and Noor, Nawshaba and Asma, Anzirun Nahar and Hasan, Taufiq},
journal={Biomedical Signal Processing and Control},
volume={98},
pages={106742},
year={2024},
publisher={Elsevier}
}
Ali, S. N., Ahmed, M. T., Paul, J., Jahan, T., Sani, S. M. Sakeef, Noor, N., & Hasan, T. (2022). Monkeypox Skin Lesion Detection Using Deep Learning Models: A Preliminary Feasibility Study. arXiv preprint arXiv:2207.03342.
@article{Nafisa2022,
title={Monkeypox Skin Lesion Detection Using Deep Learning Models: A Preliminary Feasibility Study},
author={Ali, Shams Nafisa and Ahmed, Md. Tazuddin and Paul, Joydip and Jahan, Tasnim and Sani, S. M. Sakeef and Noor, Nawshaba and Hasan, Taufiq},
journal={arXiv preprint arXiv:2207.03342},
year={2022}
}