NIH Chest X-ray
Dataset Information
The NIH Chest X-ray Dataset is a large-scale medical imaging dataset of 112,120 X-ray images from 30,805 unique patients, collected by the National Institutes of Health (NIH). Each image is annotated with disease labels extracted from the associated radiology reports using Natural Language Processing (NLP) techniques. The labels cover fourteen thoracic disease categories, which makes the dataset a valuable resource for developing and evaluating computer-aided detection and diagnosis (CAD) systems. Because the labels are text-mined from reports rather than assigned by hand, they are expected to be over 90% accurate, which makes them suitable for weakly-supervised learning. The dataset addresses the shortage of annotated medical imaging data and aims to facilitate advances in medical image analysis. It is randomly split into training, validation, and test sets to support machine learning model development and evaluation.
Task: Binary classification
Labels:
0: normal, 1: disease
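The card does not say how the fourteen disease categories are collapsed into the two labels above. A minimal sketch of one plausible mapping follows, assuming the per-image findings arrive as a pipe-separated string and that images without any listed disease carry the marker "No Finding". Both the string format and the helper name `to_binary_label` are assumptions for illustration, not something this card confirms.

```python
def to_binary_label(finding_labels: str) -> int:
    """Map a pipe-separated findings string to 0 (normal) or 1 (disease).

    Assumption: an image with no disease is marked exactly "No Finding";
    any other finding counts as disease.
    """
    findings = [f.strip() for f in finding_labels.split("|") if f.strip()]
    if not findings:
        # An empty string is ambiguous, so refuse to guess a label.
        raise ValueError("empty findings string")
    return 0 if findings == ["No Finding"] else 1


print(to_binary_label("No Finding"))             # normal
print(to_binary_label("Cardiomegaly|Effusion"))  # disease
```

Under this mapping, an image with several findings still receives the single label 1, so the per-disease information is discarded.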
Samples:
- Train: 90,000
- Validation: 11,000
- Test: 11,120
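The three counts sum to 112,120, the full dataset size. A random split with these sizes could be sketched as below. The seed, the helper name `random_split`, and the choice to split by image index are assumptions; the card does not state whether the original split was made per image or per patient.

```python
import random


def random_split(n_items, sizes, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into consecutive chunks."""
    if sum(sizes) != n_items:
        raise ValueError("split sizes must sum to the number of items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    splits, start = [], 0
    for size in sizes:
        splits.append(indices[start:start + size])
        start += size
    return splits


# Sizes taken from the sample counts listed above.
train_idx, val_idx, test_idx = random_split(112_120, (90_000, 11_000, 11_120))
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the dataset contains multiple images for some patients (112,120 images from 30,805 patients), an image-level split like this one can place the same patient in more than one subset. Grouping indices by patient before shuffling would avoid that.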
Experiment Parameters
Learning Rate: 2e-3
Training Epoch: 50
Convergence Epoch: 11
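The card does not define how the convergence epoch was determined. The sketch below records the two stated hyperparameters and shows one possible reading, in which the convergence epoch is the 1-indexed epoch with the lowest validation loss. The names `ExperimentConfig` and `convergence_epoch`, and the use of validation loss as the criterion, are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    learning_rate: float = 2e-3  # from the card
    max_epochs: int = 50         # from the card


def convergence_epoch(val_losses, min_delta=0.0):
    """Return the 1-indexed epoch at which validation loss was lowest.

    Assumption: "convergence" means the last epoch that improved the best
    validation loss by more than min_delta.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_epoch, best_loss = 1, val_losses[0]
    for epoch, loss in enumerate(val_losses[1:], start=2):
        if loss < best_loss - min_delta:
            best_epoch, best_loss = epoch, loss
    return best_epoch


config = ExperimentConfig()
# Hypothetical loss curve, not measured data.
example_losses = [0.9, 0.7, 0.6, 0.65, 0.62]
print(convergence_epoch(example_losses))  # → 3
```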