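Staff View: A Comparison of Object Detection and Phrase Grounding Models in Chest X-ray Abnormality Localization using Eye-tracking Data :: Library Catalog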

Bibliographic Details
Main Authors:	Ghelichkhan, Elham; Tasdizen, Tolga
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2503.01037

_version_	1866910854296895488
author	Ghelichkhan, Elham; Tasdizen, Tolga
author_facet	Ghelichkhan, Elham; Tasdizen, Tolga
contents	Chest diseases rank among the most prevalent and dangerous global health issues. Deep learning models for object detection and phrase grounding interpret complex radiology data to assist healthcare professionals in diagnosis. Object detection locates abnormalities for classes, while phrase grounding locates abnormalities for textual descriptions. This paper investigates how text enhances abnormality localization in chest X-rays by comparing the performance and explainability of these two tasks. To establish an explainability baseline, we propose an automatic pipeline to generate image regions for report sentences using radiologists' eye-tracking data. The phrase grounding model's better performance (mIoU = 0.36 vs. 0.20) and explainability (containment ratio 0.48 vs. 0.26) indicate that text is effective in enhancing chest X-ray abnormality localization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_01037
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Comparison of Object Detection and Phrase Grounding Models in Chest X-ray Abnormality Localization using Eye-tracking Data
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2503.01037

Similar Items