:: Library Catalog

Bibliographic Details
Main Authors:	Li, Jun; Liu, Che; Bai, Wenjia; Liu, Mingxuan; Arcucci, Rossella; Bercea, Cosmin I.; Schnabel, Julia A.
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2508.04572

Similar Items

Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions
by: Li, Jun, et al.
Published: (2025)

FMBench: Benchmarking Fairness in Multimodal Large Language Models on Medical Tasks
by: Wu, Peiran, et al.
Published: (2024)

Dynamic Decision Learning: Test-Time Evolution for Abnormality Grounding in Rare Diseases
by: Li, Jun, et al.
Published: (2026)

Utilizing Synthetic Data for Medical Vision-Language Pre-training: Bypassing the Need for Real Images
by: Liu, Che, et al.
Published: (2023)

Interpretable Representation Learning of Cardiac MRI via Attribute Regularization
by: Di Folco, Maxime, et al.
Published: (2024)

G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
by: Liu, Che, et al.
Published: (2023)

How Far Have Medical Vision-Language Models Come? A Comprehensive Benchmarking Study
by: Liu, Che, et al.
Published: (2025)

IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training
by: Liu, Che, et al.
Published: (2023)

Denoising Diffusion Models for Anomaly Localization in Medical Images
by: Bercea, Cosmin I., et al.
Published: (2024)

Towards Universal Unsupervised Anomaly Detection in Medical Imaging
by: Bercea, Cosmin I., et al.
Published: (2024)

Diffusion Models with Implicit Guidance for Medical Anomaly Detection
by: Bercea, Cosmin I., et al.
Published: (2024)

How Does Diverse Interpretability of Textual Prompts Impact Medical Vision-Language Zero-Shot Tasks?
by: Wang, Sicheng, et al.
Published: (2024)

Semantic Alignment of Unimodal Medical Text and Vision Representations
by: Di Folco, Maxime, et al.
Published: (2025)

LocBAM: Advancing 3D Patch-Based Image Segmentation by Integrating Location Context
by: Hooft, Donnate, et al.
Published: (2026)

BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval
by: Chen, Yinda, et al.
Published: (2024)

NOVA: A Benchmark for Anomaly Localization and Clinical Reasoning in Brain MRI
by: Bercea, Cosmin I., et al.
Published: (2025)

MedEdit: Counterfactual Diffusion-based Image Editing on Brain MRI
by: Alaya, Malek Ben, et al.
Published: (2024)

BOTM: Echocardiography Segmentation via Bi-directional Optimal Token Matching
by: Liu, Zhihua, et al.
Published: (2025)

Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs
by: Liu, Che, et al.
Published: (2025)

Influence of Classification Task and Distribution Shift Type on OOD Detection in Fetal Ultrasound
by: Wong, Chun Kit, et al.
Published: (2025)

Freeze the backbones: A Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-training
by: Qin, Jiuming, et al.
Published: (2024)

Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
by: Liu, Che, et al.
Published: (2024)

T3D: Advancing 3D Medical Vision-Language Pre-training by Learning Multi-View Visual Consistency
by: Liu, Che, et al.
Published: (2023)

Language Models Meet Anomaly Detection for Better Interpretability and Generalizability
by: Li, Jun, et al.
Published: (2024)

Does DINOv3 Set a New Medical Vision Standard? Benchmarking 2D and 3D Classification, Segmentation, and Registration
by: Liu, Che, et al.
Published: (2025)

Unsupervised Analysis of Alzheimer's Disease Signatures using 3D Deformable Autoencoders
by: Avci, Mehmet Yigit, et al.
Published: (2024)

Noise2Noise Denoising of CRISM Hyperspectral Data
by: Platt, Robert, et al.
Published: (2024)

Wasserstein-Aligned Localisation for VLM-Based Distributional OOD Detection in Medical Imaging
by: Kainz, Bernhard, et al.
Published: (2026)

EchoSight: Advancing Visual-Language Models with Wiki Knowledge
by: Yan, Yibin, et al.
Published: (2024)

Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation
by: Liu, Che, et al.
Published: (2024)

Selective Test-Time Adaptation for Unsupervised Anomaly Detection using Neural Implicit Representations
by: Ambekar, Sameer, et al.
Published: (2024)

KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
by: Ma, Xinyu, et al.
Published: (2025)

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
by: Zhao, Yaqi, et al.
Published: (2024)

Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?
by: Liu, Che, et al.
Published: (2024)

Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
by: Liu, Xiaoyuan, et al.
Published: (2024)

Enhancing Medical Visual Grounding via Knowledge-guided Spatial Prompts
by: Gao, Yifan, et al.
Published: (2026)

Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
by: Wan, Zhongwei, et al.
Published: (2023)

Self-Rewarding Vision-Language Model via Reasoning Decomposition
by: Li, Zongxia, et al.
Published: (2025)

LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification
by: Zhang, Pingping, et al.
Published: (2025)

Knowledge Condensation and Reasoning for Knowledge-based VQA
by: Hao, Dongze, et al.
Published: (2024)