|
a |
|
b/README.md |
|
|
1 |
<div class="sc-kdrUpr eZtUed"><div class="sc-UEtKG dGqiYy sc-hDzlxo bEIZRR"><div class="sc-fqwslf gsqkEc"><div class="sc-cBQMlg kAHhUk"><h2 class="sc-dcKlJK sc-cVttbi gqEuPW ksnHgj">About Dataset</h2></div></div></div><div class="sc-fHzVOS cUYeeo"><div class="sc-davvxH nUNNB"><div style="min-height: 80px;"><div class="sc-etVRix bfyesi sc-jCNfQM igJSrG"><h1>Content and context</h1> |
|
|
2 |
<p>Tuberculosis is a disease that affects many people in developing countries. While treatment is possible, it requires an accurate diagnosis first. In these countries, X-ray machines are in many cases available (through low-cost projects and donations), but the radiological expertise needed to assess the images accurately is often missing. An algorithm that could perform this task quickly and cheaply could drastically improve the ability to diagnose and ultimately treat the disease.</p> |
|
|
3 |
<p>In more developed countries, X-ray radiography is often used for screening new arrivals and determining eligibility for a work permit. Manually examining the images is time-consuming, and an algorithm could increase efficiency, improve performance and ultimately reduce the cost of this screening.</p> |
|
|
4 |
<p>This dataset contains over 500 X-ray scans with clinical labels collected by radiologists.</p> |
|
|
5 |
<h1>Acknowledgements</h1> |
|
|
6 |
<p>The two datasets were published together in an analysis here: <a rel="noreferrer nofollow" aria-label="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4256233/ (opens in a new tab)" target="_blank" href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4256233/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4256233/</a>.<br> |
|
|
7 |
The datasets come from Shenzhen, China and Montgomery County, MD, USA, respectively.</p> |
|
|
8 |
<h3>China Set - The Shenzhen set - Chest X-ray Database</h3> |
|
|
9 |
<p>The standard digital image database for tuberculosis was created by the National Library of Medicine, Maryland, USA, in collaboration with Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China. The chest X-rays are from out-patient clinics and were captured as part of the daily routine using Philips DR Digital Diagnose systems. <br> |
|
|
10 |
Number of X-rays: </p> |
|
|
11 |
<ul> |
|
|
12 |
<li>336 cases with manifestation of tuberculosis, and </li> |
|
|
13 |
<li>326 normal cases.</li> |
|
|
14 |
</ul> |
|
|
15 |
<p>It is requested that publications resulting from the use of this data attribute the source (National Library of Medicine, National Institutes of Health, Bethesda, MD, USA and Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China) and cite the following publications: </p> |
|
|
16 |
<ul> |
|
|
17 |
<li>Jaeger S, Karargyris A, Candemir S, Folio L, Siegelman J, Callaghan F, Xue Z, Palaniappan K, Singh RK, Antani S, Thoma G, Wang YX, Lu PX, McDonald CJ. Automatic tuberculosis screening using chest radiographs. IEEE Trans Med Imaging. 2014 Feb;33(2):233-45. doi: 10.1109/TMI.2013.2284099. PMID: 24108713</li> |
|
|
18 |
<li>Candemir S, Jaeger S, Palaniappan K, Musco JP, Singh RK, Xue Z, Karargyris A, Antani S, Thoma G, McDonald CJ. Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration. IEEE Trans Med Imaging. 2014 Feb;33(2):577-90. doi: 10.1109/TMI.2013.2290491. PMID: 24239990</li> |
|
|
19 |
</ul> |
|
|
20 |
<h3>Montgomery County X-ray Set</h3> |
|
|
21 |
<p>X-ray images in this data set have been acquired from the tuberculosis control program of the Department of Health and Human Services of Montgomery County, MD, USA. This set contains 138 posterior-anterior x-rays, of which 80 x-rays are normal and 58 x-rays are abnormal with manifestations of tuberculosis. All images are de-identified and available in DICOM format. The set covers a wide range of abnormalities, including effusions and miliary patterns. The data set includes radiology readings available as a text file.</p> |
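<p>As a rough illustration of the class balance implied by the case counts quoted above, the following minimal sketch computes the total number of images, the tuberculosis fraction, and the accuracy a trivial majority-class predictor would reach on each set. The dictionary keys and the <code>summarize</code> helper are hypothetical names chosen for this example; only the counts themselves come from the descriptions above.</p>

```python
# Case counts as quoted in the dataset descriptions above.
# The key names ("shenzhen", "montgomery", ...) are illustrative only.
COUNTS = {
    "shenzhen": {"tuberculosis": 336, "normal": 326},
    "montgomery": {"tuberculosis": 58, "normal": 80},
}


def summarize(counts):
    """Return the total, the TB fraction, and the majority-class baseline accuracy."""
    total = sum(counts.values())
    return {
        "total": total,
        "tb_fraction": counts["tuberculosis"] / total,
        # Accuracy of a classifier that always predicts the larger class.
        "majority_baseline": max(counts.values()) / total,
    }


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        s = summarize(counts)
        print(
            f"{name}: {s['total']} images, "
            f"{s['tb_fraction']:.1%} TB, "
            f"majority baseline {s['majority_baseline']:.1%}"
        )
```

<p>A classifier is only informative if it clearly beats the majority baseline, which is why it can be useful to compute this figure before training anything.</p>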
|
|
22 |
<h1>Ideas</h1> |
|
|
23 |
<ul> |
|
|
24 |
<li>Experiment with lung segmentation</li> |
|
|
25 |
<li>Build disease classifiers for various conditions</li> |
|
|
26 |
<li>Test models on data from X-ray machines made by different manufacturers</li> |
|
|
27 |
<li>Build GANs that are able to make the datasets indistinguishable (Adversarial Discriminative Domain Adaptation: <a rel="noreferrer nofollow" aria-label="https://arxiv.org/abs/1702.05464 (opens in a new tab)" target="_blank" href="https://arxiv.org/abs/1702.05464">https://arxiv.org/abs/1702.05464</a>)</li> |
|
|
28 |
</ul></div></div></div> |