Tuberculosis is a disease that affects many people in developing countries. While treatment is possible, it requires an accurate diagnosis first. In these countries, X-ray machines are in many cases available (through low-cost projects and donations), but the radiological expertise needed to assess the images accurately is often missing. An algorithm that could perform this task quickly and cheaply could drastically improve the ability to diagnose and ultimately treat the disease.
In more developed countries, X-ray radiography is often used for screening new arrivals and determining eligibility for a work permit. Manually examining the images is time-consuming, and an algorithm could increase efficiency, improve performance and ultimately reduce the cost of this screening.
This dataset contains over 500 X-ray scans with clinical labels collected by radiologists.
The two datasets were published together in an analysis here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4256233/.
The datasets come from Shenzhen and Montgomery respectively.
The standard digital image database for tuberculosis was created by the National Library of Medicine, Maryland, USA in collaboration with Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China. The chest X-rays are from out-patient clinics, and were captured as part of the daily routine using Philips DR Digital Diagnose systems.
Number of X-rays:
It is requested that publications resulting from the use of this data attribute the source (National Library of Medicine, National Institutes of Health, Bethesda, MD, USA and Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China) and cite the following publications:
X-ray images in this data set have been acquired from the tuberculosis control program of the Department of Health and Human Services of Montgomery County, MD, USA. This set contains 138 posterior-anterior X-rays, of which 80 are normal and 58 are abnormal with manifestations of tuberculosis. All images are de-identified and available in DICOM format. The set covers a wide range of abnormalities, including effusions and miliary patterns. The data set includes radiology readings available as a text file.
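As a rough illustration of how the class balance quoted above (80 normal, 58 abnormal) could be checked against a local copy, here is a minimal Python sketch. It rests on an assumption that the description above does not state: that each file name ends in `_0` for a normal image and `_1` for an abnormal one. The file names and the helpers `label_from_name` and `class_balance` are hypothetical, and the convention should be verified against the actual files before relying on it.

```python
from collections import Counter
from pathlib import Path


def label_from_name(filename: str) -> str:
    """Map a file name to a label.

    ASSUMPTION (not stated in the dataset description): the file stem
    ends in '_0' for a normal image and '_1' for an abnormal one.
    """
    suffix = Path(filename).stem.rsplit("_", 1)[-1]
    if suffix == "0":
        return "normal"
    if suffix == "1":
        return "abnormal"
    raise ValueError(f"no label suffix found in {filename!r}")


def class_balance(filenames):
    """Return the fraction of files carrying each label."""
    counts = Counter(label_from_name(name) for name in filenames)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical file names mirroring the stated Montgomery counts:
# 80 normal and 58 abnormal images, 138 in total.
names = [f"scan_{i:04d}_0.dcm" for i in range(80)]
names += [f"scan_{i:04d}_1.dcm" for i in range(80, 138)]

balance = class_balance(names)
print(balance)
```

With the counts given above, roughly 58% of the images are normal and 42% abnormal, so the set is only mildly imbalanced.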