Bibliographic Details
Main Authors: Hou, Benjamin; Zhu, Qingqing; Mathai, Tejas Sudarshan; Jin, Qiao; Lu, Zhiyong; Summers, Ronald M.
Format: Preprint
Published: 2024
Subjects: Image and Video Processing; Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2406.03688
_version_ 1866910474667294720
author Hou, Benjamin
Zhu, Qingqing
Mathai, Tejas Sudarshan
Jin, Qiao
Lu, Zhiyong
Summers, Ronald M.
contents In this paper, we introduce DRR-RATE, a large-scale synthetic chest X-ray dataset derived from the recently released CT-RATE dataset. DRR-RATE comprises 50,188 frontal Digitally Reconstructed Radiographs (DRRs) from 21,304 unique patients. Each image is paired with a corresponding radiology text report and binary labels for 18 pathology classes. Because DRR generation is controllable, the dataset can be extended with lateral views and images from any desired viewing position. This opens up avenues for research into novel multimodal applications involving paired CT, X-ray images from various views, text, and binary labels. We demonstrate the applicability of DRR-RATE alongside existing large-scale chest X-ray resources, notably the CheXpert dataset and the CheXnet model. Experiments demonstrate that CheXnet, when trained and tested on the DRR-RATE dataset, achieves sufficient to high AUC scores for the six pathologies commonly cited in the literature: Atelectasis, Cardiomegaly, Consolidation, Lung Lesion, Lung Opacity, and Pleural Effusion. Additionally, CheXnet trained on the CheXpert dataset can accurately identify several pathologies in DRR-RATE, even though it is operating out of distribution. This confirms that the generated DRR images effectively capture the essential pathology features from CT images. The dataset and labels are publicly accessible at https://huggingface.co/datasets/farrell236/DRR-RATE.
format Preprint
id arxiv_https___arxiv_org_abs_2406_03688
institution arXiv
publishDate 2024
record_format arxiv
title Shadow and Light: Digitally Reconstructed Radiographs for Disease Classification
topic Image and Video Processing
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2406.03688