| Main Authors: | Zhao, Weiren, Dong, Yi, Chen, Cheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.08724 |
Similar Items
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
by: Li, Hao, et al.
Published: (2024)
GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks
by: Zhang, Hantao, et al.
Published: (2026)
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos
by: Wang, Rongsheng, et al.
Published: (2025)
MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks
by: Hirsch, Elad, et al.
Published: (2024)
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
by: Su, Yuhao, et al.
Published: (2025)
MedMO: Grounding and Understanding Multimodal Large Language Model for Medical Images
by: Deria, Ankan, et al.
Published: (2026)
CrossMed: A Multimodal Cross-Task Benchmark for Compositional Generalization in Medical Imaging
by: Singh, Pooja, et al.
Published: (2025)
Semi-MedRef: Semi-Supervised Medical Referring Image Segmentation with Cross-Modal Alignment
by: Li, Yuchen, et al.
Published: (2026)
MultiMed: Massively Multimodal and Multitask Medical Understanding
by: Mo, Shentong, et al.
Published: (2024)
MedVLThinker: Simple Baselines for Multimodal Medical Reasoning
by: Huang, Xiaoke, et al.
Published: (2025)
Generalized Task-Driven Medical Image Quality Enhancement with Gradient Promotion
by: Zhang, Dong, et al.
Published: (2025)
Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking
by: Wu, Shengqiong, et al.
Published: (2026)
Med3DInsight: Enhancing 3D Medical Image Understanding with 2D Multi-Modal Large Language Models
by: Chen, Qiuhui, et al.
Published: (2024)
MedHorizon: Towards Long-context Medical Video Understanding in the Wild
by: Du, Bodong, et al.
Published: (2026)
DuoGen: Towards General Purpose Interleaved Multimodal Generation
by: Shi, Min, et al.
Published: (2026)
ProbMed: A Probabilistic Framework for Medical Multimodal Binding
by: Gao, Yuan, et al.
Published: (2025)
MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation
by: Yang, Ling, et al.
Published: (2023)
MetaSSL: A General Heterogeneous Loss for Semi-Supervised Medical Image Segmentation
by: Zhao, Weiren, et al.
Published: (2025)
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
by: Li, Teng, et al.
Published: (2025)
MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models
by: Huang, Yu, et al.
Published: (2025)
Synergizing Discriminative Exemplars and Self-Refined Experience for MLLM-based In-Context Learning in Medical Diagnosis
by: Zhao, Wenkai, et al.
Published: (2026)
MedCycle: Unpaired Medical Report Generation via Cycle-Consistency
by: Hirsch, Elad, et al.
Published: (2024)
Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models
by: Ding, Meidan, et al.
Published: (2025)
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale
by: Gai, Xiaotang, et al.
Published: (2024)
GenAgent: Scaling Text-to-Image Generation via Agentic Multimodal Reasoning
by: Jiang, Kaixun, et al.
Published: (2026)
MedKCO: Medical Vision-Language Pretraining via Knowledge-Driven Cognitive Orchestration
by: Zhang, Chenran, et al.
Published: (2026)
MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks
by: Almakky, Ibrahim, et al.
Published: (2024)
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
by: Wang, Jiayu, et al.
Published: (2025)
UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation
by: Tian, Rui, et al.
Published: (2025)
MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction
by: He, Zhicheng, et al.
Published: (2026)
AlignGen: Boosting Personalized Image Generation with Cross-Modality Prior Alignment
by: Lin, Yiheng, et al.
Published: (2025)
All-In-One Medical Image Restoration via Task-Adaptive Routing
by: Yang, Zhiwen, et al.
Published: (2024)
SynerMix: Synergistic Mixup Solution for Enhanced Intra-Class Cohesion and Inter-Class Separability in Image Classification
by: Xu, Ye, et al.
Published: (2024)
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
by: Yan, Ziang, et al.
Published: (2024)
MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding
by: Chopra, Shivang, et al.
Published: (2025)
Aligning Medical Images with General Knowledge from Large Language Models
by: Fang, Xiao, et al.
Published: (2024)
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding
by: Jiang, Songtao, et al.
Published: (2025)
Med-Evo: Test-time Self-evolution for Medical Multimodal Large Language Models
by: Xu, Dunyuan, et al.
Published: (2026)
ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion
by: Liu, Hanpeng, et al.
Published: (2026)
SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images
by: Wang, Haoyu, et al.
Published: (2023)