#### Data-set

[KVASIR version 2](https://datasets.simula.no/kvasir/data/kvasir-dataset-v2.zip) is used, which contains 8000 images (1000 images for each class).

###### 8 Classes of the data-set
Anatomical Landmarks:-
- Z-line
- Pylorus
- Cecum

Pathological Findings:-
- Esophagitis
- Polyps
- Ulcerative Colitis

Polyp Removals:-
- Dyed and Lifted Polyps
- Dyed Resection Margins

The data-set is shuffled using the Linux `shuf` command and split into two sets as follows, preserving the class distribution.

- Train set: 6400 images (800 images from each class)
- Test set: 1600 images (200 images from each class)
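As a rough illustration of the shuffle-and-split step, the sketch below does the equivalent in Python rather than with `shuf`. It assumes the extracted archive contains one sub-folder per class. The function name `split_dataset`, the folder names and the fixed seed are hypothetical and are not taken from this repository.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_root, dst_root, train_per_class=800, seed=0):
    """Shuffle each class folder and copy its files into train/ and test/.

    Assumes src_root/<class_name>/<image files>; this layout is an assumption.
    Returns {class_name: (n_train, n_test)}.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)  # stands in for the `shuf` command
        splits = {
            "train": images[:train_per_class],
            "test": images[train_per_class:],
        }
        for split_name, files in splits.items():
            out_dir = dst_root / split_name / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (len(splits["train"]), len(splits["test"]))
    return counts


# Hypothetical usage: with 1000 images per class, the default of 800
# leaves 200 images per class for the test set.
# split_dataset("kvasir-dataset-v2", "kvasir-split")
```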

The data-set is stored back in Google Drive so that it can be loaded into Google Colab when needed.
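One possible way to store the split data-set in Google Drive is to zip it into a Drive folder, sketched below. The helper `archive_to_drive`, the archive name and the example paths are hypothetical. The commented-out mount call only works inside a Colab runtime, and the Drive path shown is an assumption about where Drive gets mounted.

```python
import shutil
from pathlib import Path


def archive_to_drive(split_root, drive_dir, name="kvasir-v2-split"):
    """Zip the contents of split_root and write the archive into drive_dir."""
    drive_dir = Path(drive_dir)
    drive_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" itself and returns the archive path
    archive = shutil.make_archive(
        str(drive_dir / name), "zip", root_dir=str(split_root)
    )
    return Path(archive)


# In Colab, mount Drive first, then archive into it (paths are assumptions):
# from google.colab import drive
# drive.mount("/content/drive")
# archive_to_drive("kvasir-split", "/content/drive/MyDrive/datasets")
```

Keeping a single zip file in Drive, rather than thousands of individual images, usually makes copying the data into a Colab session faster.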