| Main Authors: | Mao, Yuren, Xu, Wenyi, Qin, Yuyang, Gao, Yunjun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.16229 |
Similar Items
MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation
by: Lin, Yi, et al.
Published: (2026)
RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering
by: Butsanets, Léo, et al.
Published: (2025)
MedErr-CT: A Visual Question Answering Benchmark for Identifying and Correcting Errors in CT Reports
by: Kyung, Sunggu, et al.
Published: (2025)
Warehouse Spatial Question Answering with LLM Agent
by: Huang, Hsiang-Wei, et al.
Published: (2025)
MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation
by: Kyung, Sunggu, et al.
Published: (2025)
Radiology Report Conditional 3D CT Generation with Multi Encoder Latent diffusion Model
by: Amirrajab, Sina, et al.
Published: (2025)
CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios
by: Lin, Jingyang, et al.
Published: (2024)
Free Form Medical Visual Question Answering in Radiology
by: Narayanan, Abhishek, et al.
Published: (2024)
Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering
by: Xue, Junxiao, et al.
Published: (2024)
CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging
by: Hamamci, Ibrahim Ethem, et al.
Published: (2024)
VDMA: Video Question Answering with Dynamically Generated Multi-Agents
by: Kugo, Noriyuki, et al.
Published: (2024)
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
by: Hamamci, Ibrahim Ethem, et al.
Published: (2023)
VideoMultiAgents: A Multi-Agent Framework for Video Question Answering
by: Kugo, Noriyuki, et al.
Published: (2025)
3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models
by: Chen, Hao, et al.
Published: (2024)
Learning to Search: A Decision-Based Agent for Knowledge-Based Visual Question Answering
by: Chen, Zhuohong, et al.
Published: (2026)
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
by: Qin, Ziyuan, et al.
Published: (2024)
Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification
by: Di Piazza, Theo, et al.
Published: (2025)
Grounding Chest X-Ray Visual Question Answering with Generated Radiology Reports
by: Serra, Francesco Dalla, et al.
Published: (2025)
ORCA: Orchestrated Reasoning with Collaborative Agents for Document Visual Question Answering
by: Lassoued, Aymen, et al.
Published: (2026)
Reflective Dialogue between Teacher and Solver Agents for Video Question Answering
by: Murakawa, Takuya, et al.
Published: (2026)
ChartAgent: A Multimodal Agent for Visually Grounded Reasoning in Complex Chart Question Answering
by: Kaur, Rachneet, et al.
Published: (2025)
Adapting Lightweight Vision Language Models for Radiological Visual Question Answering
by: Shourya, Aditya, et al.
Published: (2025)
3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering
by: Xu, Rongtao, et al.
Published: (2025)
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
by: Qiu, Jielin, et al.
Published: (2024)
PETWB-REP: A Multi-Cancer Whole-Body FDG PET/CT and Radiology Report Dataset for Medical Imaging Research
by: Xue, Le, et al.
Published: (2025)
Multimodal Rationales for Explainable Visual Question Answering
by: Li, Kun, et al.
Published: (2024)
MTA-Agent: An Open Recipe for Multimodal Deep Search Agents
by: Peng, Xiangyu, et al.
Published: (2026)
D-PerceptCT: Deep Perceptual Enhancement for Low-Dose CT Images
by: Nabila, Taifour Yousra, et al.
Published: (2025)
Sketch2CT: Multimodal Diffusion for Structure-Aware 3D Medical Volume Generation
by: An, Delin, et al.
Published: (2026)
Opportunistic Promptable Segmentation: Leveraging Routine Radiological Annotations to Guide 3D CT Lesion Segmentation
by: Church, Samuel, et al.
Published: (2026)
Barriers in Integrating Medical Visual Question Answering into Radiology Workflows: A Scoping Review and Clinicians' Insights
by: Mishra, Deepali, et al.
Published: (2025)
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
by: Wang, Zeqing, et al.
Published: (2023)
3D Question Answering for City Scene Understanding
by: Sun, Penglei, et al.
Published: (2024)
MobileFlow: A Multimodal LLM For Mobile GUI Agent
by: Nong, Songqin, et al.
Published: (2024)
MedLSAM: Localize and Segment Anything Model for 3D CT Images
by: Lei, Wenhui, et al.
Published: (2023)
Enhancing Visual Question Answering with Multimodal LLMs via Chain-of-Question Guided Retrieval-Augmented Generation
by: Xu, Quanxing, et al.
Published: (2026)
A Short Review and Evaluation of SAM2's Performance in 3D CT Image Segmentation
by: He, Yufan, et al.
Published: (2024)
GLeVE: Graph-Guided Lesion Grounding with Proposal Verification in 3D CT
by: Jiang, Shuo, et al.
Published: (2026)
Space3D-Bench: Spatial 3D Question Answering Benchmark
by: Szymanska, Emilia, et al.
Published: (2024)
Foundation VAEs for 3D CT Reconstruction, Augmentation, and Generation
by: Chen, Qi, et al.
Published: (2026)