Bibliographic Details
Main Authors: Zhu, Qingqing; Jin, Qiao; Mathai, Tejas S.; Fang, Yin; Wang, Zhizheng; Yang, Yifan; Sarfo-Gyamfi, Maame; Hou, Benjamin; Gu, Ran; Balamuralikrishna, Praveen T. S.; Wang, Kenneth C.; Summers, Ronald M.; Lu, Zhiyong
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2602.14879
Table of Contents:
  • Artificial intelligence (AI) can automatically delineate lesions on computed tomography (CT) and generate radiology report content, yet progress is limited by the scarcity of publicly available CT datasets with lesion-level annotations. To bridge this gap, we introduce CT-Bench, a first-of-its-kind benchmark dataset comprising two components: a Lesion Image and Metadata Set containing 20,335 lesions from 7,795 CT studies with bounding boxes, descriptions, and size information, and a multitask visual question answering benchmark with 2,850 QA pairs covering lesion localization, description, size estimation, and attribute categorization. Hard negative examples are included to reflect real-world diagnostic challenges. We evaluate multiple state-of-the-art multimodal models, including vision-language and medical CLIP variants, by comparing their performance to radiologist assessments, demonstrating the value of CT-Bench as a comprehensive benchmark for lesion analysis. Moreover, fine-tuning models on the Lesion Image and Metadata Set yields significant performance gains across both components, underscoring the clinical utility of CT-Bench.