<div class="sc-kdrUpr eZtUed"><div class="sc-UEtKG dGqiYy sc-hDzlxo bEIZRR"><div class="sc-fqwslf gsqkEc"><div class="sc-cBQMlg kAHhUk"><h2 class="sc-dcKlJK sc-cVttbi gqEuPW ksnHgj">About Dataset</h2></div></div></div><div class="sc-fHzVOS cUYeeo"><div class="sc-davvxH flNyFK"><div style="min-height: 80px;"><div class="sc-etVRix jqYJaa sc-jCNfQM igJSrG"><p>This dataset is a preprocessed version of the UniToChest dataset [1], introduced by Chaudhry et al. [2]. The following preprocessing steps were applied to the original CT scans:</p>
<ol>
<li><p><strong>Hounsfield Units</strong>: Raw CT scan values have been converted to Hounsfield Units (HU).</p></li>
<li><p><strong>Windowing</strong>: Windowing has been applied to the images. This technique restricts the displayed intensities to a specific range of Hounsfield Units, which improves the visualization of the tissues in that range.</p></li>
<li><p><strong>Lung Segmentation</strong>: The lung regions are isolated within the CT scans using the U-Net R231 model [3] and thresholding, so that analysis can focus on lung tissue.</p></li>
<li><p><strong>CLAHE (Contrast Limited Adaptive Histogram Equalization)</strong>: CLAHE has been applied to improve the contrast of the images, particularly in areas with low contrast.</p></li>
</ol>
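<p>The first two steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used to build the dataset: the function names are hypothetical, the slope and intercept defaults stand in for the rescale values normally read from the scan metadata, and the window (center -600 HU, width 1500 HU) is a commonly used lung window chosen here as an assumption, since the page does not state which window was applied.</p>

```python
import numpy as np


def to_hounsfield(raw, slope=1.0, intercept=-1024.0):
    """Linearly rescale stored CT values to HU: HU = raw * slope + intercept.

    The default slope and intercept are assumptions for illustration;
    real values come from the scan metadata.
    """
    return np.asarray(raw, dtype=np.float32) * slope + intercept


def apply_window(hu, center=-600.0, width=1500.0):
    """Clip HU values to the window and rescale them to 8-bit grayscale.

    The default center and width are an assumed lung window, not the
    dataset's documented setting.
    """
    low = center - width / 2.0
    high = center + width / 2.0
    clipped = np.clip(hu, low, high)
    # Map [low, high] onto [0, 255] so the result can be saved as a PNG.
    scaled = (clipped - low) / (high - low) * 255.0
    return np.round(scaled).astype(np.uint8)


# Hypothetical 2x2 slice of stored values.
raw_slice = np.array([[0, 1024], [424, 2000]])
hu_slice = to_hounsfield(raw_slice)
png_ready = apply_window(hu_slice)
```

<p>Values below the window map to 0 and values above it map to 255, so the available gray levels are spent on the HU range of interest.</p>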
<p><strong>Nodule Subsets</strong></p>
<p>The dataset includes two specific subsets based on the size of the lung nodules:</p>
<ul>
<li><strong>Large Nodules</strong>: Nodules greater than 10 mm in diameter.</li>
<li><strong>Small Nodules</strong>: Nodules smaller than 10 mm in diameter.</li>
</ul>
<p>These subsets allow for targeted analysis of different nodule sizes, which can be critical for research focused on early detection and characterization of lung conditions.</p>
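<p>The size rule can be expressed as a small helper. This is only a sketch: the function name and the subset labels are hypothetical, and because the description does not say which subset a nodule of exactly 10 mm belongs to, the helper returns <code>None</code> for that case rather than guessing.</p>

```python
from typing import Optional

# Diameter threshold stated in the dataset description.
THRESHOLD_MM = 10.0


def nodule_subset(diameter_mm: float) -> Optional[str]:
    """Return the subset label for a nodule diameter given in millimetres."""
    if diameter_mm > THRESHOLD_MM:
        return "large"
    if diameter_mm < THRESHOLD_MM:
        return "small"
    # Exactly 10 mm is not covered by the description, so it is left unassigned.
    return None
```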
<p><strong>File Naming Convention</strong></p>
<p>The files in this dataset follow a naming convention designed for compatibility with the nnU-Net framework. Each file name is structured as follows:</p>
<ul>
<li>Images: datasetname_patient_study_slice_modality.png</li>
<li>Masks: datasetname_patient_study_slice.png</li>
</ul>
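<p>The two patterns above differ only in the trailing modality field, so an image name can be mapped to its mask name by dropping that field. The sketch below assumes that none of the individual fields contains an underscore; the helper names and the example file name are hypothetical and are not taken from the dataset.</p>

```python
from pathlib import Path
from typing import NamedTuple, Optional


class SliceName(NamedTuple):
    dataset: str
    patient: str
    study: str
    slice_id: str
    modality: Optional[str]  # None for mask files, which have no modality field


def parse_name(filename: str) -> SliceName:
    """Split a file name into its underscore-separated fields."""
    parts = Path(filename).stem.split("_")
    if len(parts) == 5:
        return SliceName(*parts)
    if len(parts) == 4:
        return SliceName(*parts, None)
    raise ValueError(f"unexpected file name layout: {filename}")


def mask_name_for(image_filename: str) -> str:
    """Build the mask file name that corresponds to an image file name."""
    info = parse_name(image_filename)
    if info.modality is None:
        raise ValueError("expected an image name with a modality field")
    # Keep dataset, patient, study and slice; drop the modality field.
    return "_".join(info[:4]) + ".png"


# Hypothetical example name, assuming an nnU-Net-style numeric modality field.
mask_name = mask_name_for("unitochest_P001_S01_0042_0000.png")
```

<p>Pairing images and masks through a function like this avoids relying on directory listing order when building training pairs.</p>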
<p><strong>References</strong></p>
<p>[1] D. Perlo, "UniToChest," Zenodo, Dec. 22, 2021. doi: 10.5281/zenodo.5797912.<br>
[2] H. A. H. Chaudhry et&nbsp;al., "UniToChest: A Lung Image Dataset for Segmentation of Cancerous Nodules on CT Scans," in Image Analysis and Processing – ICIAP 2022, S. Sclaroff, C. Distante, M. Leo, G. M. Farinella, and F. Tombari, Eds., Cham: Springer International Publishing, 2022, pp. 185-196. doi: 10.1007/978-3-031-06427-2_16.<br>
[3] J. Hofmanninger, F. Prayer, J. Pan, et al., "Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem," Eur Radiol Exp, vol. 4, p. 50, 2020. doi: 10.1186/s41747-020-00173-2.</p></div></div></div></div></div>