| Main Authors: | Liu, Zhihua, Tong, Lei, He, Xilin, Liu, Che, Arcucci, Rossella, Jin, Chen, Zhou, Huiyu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.18052 |
Similar Items
Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation
by: Liu, Zhihua, et al.
Published: (2025)
How Does Diverse Interpretability of Textual Prompts Impact Medical Vision-Language Zero-Shot Tasks?
by: Wang, Sicheng, et al.
Published: (2024)
Utilizing Synthetic Data for Medical Vision-Language Pre-training: Bypassing the Need for Real Images
by: Liu, Che, et al.
Published: (2023)
BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval
by: Chen, Yinda, et al.
Published: (2024)
G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
by: Liu, Che, et al.
Published: (2023)
FMBench: Benchmarking Fairness in Multimodal Large Language Models on Medical Tasks
by: Wu, Peiran, et al.
Published: (2024)
How Far Have Medical Vision-Language Models Come? A Comprehensive Benchmarking Study
by: Liu, Che, et al.
Published: (2025)
Freeze the backbones: A Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-training
by: Qin, Jiuming, et al.
Published: (2024)
Knowledge to Sight: Reasoning over Visual Attributes via Knowledge Decomposition for Abnormality Grounding
by: Li, Jun, et al.
Published: (2025)
IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training
by: Liu, Che, et al.
Published: (2023)
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?
by: Liu, Che, et al.
Published: (2024)
Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions
by: Li, Jun, et al.
Published: (2025)
Noise2Noise Denoising of CRISM Hyperspectral Data
by: Platt, Robert, et al.
Published: (2024)
DomainForensics: Exposing Face Forgery across Domains via Bi-directional Adaptation
by: Lv, Qingxuan, et al.
Published: (2023)
Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation
by: Liu, Che, et al.
Published: (2024)
Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
by: Wan, Zhongwei, et al.
Published: (2023)
OT-Drive: Out-of-Distribution Off-Road Traversable Area Segmentation via Optimal Transport
by: Zhao, Zhihua, et al.
Published: (2026)
TokenSeg: Efficient 3D Medical Image Segmentation via Hierarchical Visual Token Compression
by: Zeng, Sen, et al.
Published: (2026)
OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation
by: Ozaydin, Baran, et al.
Published: (2024)
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
by: Bai, Zechen, et al.
Published: (2024)
Neural B-frame Video Compression with Bi-directional Reference Harmonization
by: Liu, Yuxi, et al.
Published: (2025)
T3D: Advancing 3D Medical Vision-Language Pre-training by Learning Multi-View Visual Consistency
by: Liu, Che, et al.
Published: (2023)
DTBS: Dual-Teacher Bi-directional Self-training for Domain Adaptation in Nighttime Semantic Segmentation
by: Huang, Fanding, et al.
Published: (2024)
Hierarchical Spatio-temporal Segmentation Network for Ejection Fraction Estimation in Echocardiography Videos
by: Wang, Dongfang, et al.
Published: (2025)
DOMR: Establishing Cross-View Segmentation via Dense Object Matching
by: Liao, Jitong, et al.
Published: (2025)
A Semi-Supervised Approach with Error Reflection for Echocardiography Segmentation
by: Han, Xiaoxiang, et al.
Published: (2024)
Cross Fusion RGB-T Tracking with Bi-directional Adapter
by: Zeng, Zhirong, et al.
Published: (2024)
Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
by: Yuan, Zheng, et al.
Published: (2024)
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation
by: Zhou, Feixiang, et al.
Published: (2023)
SPARNet: Continual Test-Time Adaptation via Sample Partitioning Strategy and Anti-Forgetting Regularization
by: Meng, Xinru, et al.
Published: (2025)
Mask-adaptive Gated Convolution and Bi-directional Progressive Fusion Network for Depth Completion
by: Huang, Tingxuan, et al.
Published: (2024)
Efficient Point Clouds Upsampling via Flow Matching
by: Liu, Zhi-Song, et al.
Published: (2025)
Scaling Mesh Generation via Compressive Tokenization
by: Weng, Haohan, et al.
Published: (2024)
Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation
by: Yan, Yichen, et al.
Published: (2024)
GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule
by: Wang, Rui, et al.
Published: (2025)
Does DINOv3 Set a New Medical Vision Standard? Benchmarking 2D and 3D Classification, Segmentation, and Registration
by: Liu, Che, et al.
Published: (2025)
SimToken: A Simple Baseline for Referring Audio-Visual Segmentation
by: Jin, Dian, et al.
Published: (2025)
Medical Referring Image Segmentation via Next-Token Mask Prediction
by: Chen, Xinyu, et al.
Published: (2025)
DCFS: Continual Test-Time Adaptation via Dual Consistency of Feature and Sample
by: Yin, Wenting, et al.
Published: (2025)
OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement
by: Wang, Rui, et al.
Published: (2026)