:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Kangcheng; Jiang, Jun; Zhang, Qing; Zheng, Shuang; Li, Qingli; Xu, Shugong
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.14757

Similar Items

Unreal is all you need: Multimodal ISAC Data Simulation with Only One Engine
by: Huang, Kongwu, et al.
Published: (2025)

Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation
by: Zhang, Ling, et al.
Published: (2025)

LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models
by: Hu, Qingqiao, et al.
Published: (2025)

PathMR: Multimodal Visual Reasoning for Interpretable Pathology Diagnosis
by: Zhang, Ye, et al.
Published: (2025)

ReinDriveGen: Reinforcement Post-Training for Out-of-Distribution Driving Scene Generation
by: Zhang, Hao, et al.
Published: (2026)

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
by: Sun, Yuxuan, et al.
Published: (2024)

Stage-wise Adaptive Label Distribution for Facial Age Estimation
by: Wu, Bo, et al.
Published: (2025)

ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model
by: Han, Gaoge, et al.
Published: (2024)

PathVG: A New Benchmark and Dataset for Pathology Visual Grounding
by: Zhong, Chunlin, et al.
Published: (2025)

Benchmarking PathCLIP for Pathology Image Analysis
by: Zheng, Sunyi, et al.
Published: (2024)

ReinPool: Reinforcement Learning Pooling Multi-Vector Embeddings for Retrieval System
by: Cha, Sungguk, et al.
Published: (2026)

PathAR: Structure-First Autoregressive Synthesis of Multimodal Pathology Images
by: Zhang, Yuan, et al.
Published: (2026)

Rein3D: Reinforced 3D Indoor Scene Generation with Panoramic Video Diffusion Models
by: Wang, Dehui, et al.
Published: (2026)

TLD: A Vehicle Tail Light signal Dataset and Benchmark
by: Chai, Jinhao, et al.
Published: (2024)

Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models
by: Wei, Zhixiang, et al.
Published: (2025)

Progressive Vision-Language Prompt for Multi-Organ Multi-Class Cell Semantic Segmentation with Single Branch
by: Zhang, Qing, et al.
Published: (2024)

ATR-UMMIM: A Benchmark Dataset for UAV-Based Multimodal Image Registration under Complex Imaging Conditions
by: Bin, Kangcheng, et al.
Published: (2025)

Knowledge Transfer from Interaction Learning
by: Gao, Yilin, et al.
Published: (2025)

Patho-AgenticRAG: Towards Multimodal Agentic Retrieval-Augmented Generation for Pathology VLMs via Reinforcement Learning
by: Zhang, Wenchuan, et al.
Published: (2025)

PathFound: An Agentic Multimodal Model Activating Evidence-seeking Pathological Diagnosis
by: Hua, Shengyi, et al.
Published: (2025)

PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks
by: Wu, Jianyu, et al.
Published: (2025)

Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner
by: Zhang, Wenchuan, et al.
Published: (2025)

Multimodal Model for Computational Pathology: Representation Learning and Image Compression
by: Wu, Peihang, et al.
Published: (2026)

PathFL: Multi-Alignment Federated Learning for Pathology Image Segmentation
by: Zhang, Yuan, et al.
Published: (2025)

Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
by: Jiang, Jingjing, et al.
Published: (2025)

PathFLIP: Fine-grained Language-Image Pretraining for Versatile Computational Pathology
by: Liu, Fengchun, et al.
Published: (2025)

Learning Spatial-Preserving Hierarchical Representations for Digital Pathology
by: Wu, Weiyi, et al.
Published: (2024)

Visual Bridge: Universal Visual Perception Representations Generating
by: Gao, Yilin, et al.
Published: (2025)

Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
by: Wen, Tao, et al.
Published: (2025)

Dictionary-based Pathology Mining with Hard-instance-assisted Classifier Debiasing for Genetic Biomarker Prediction from WSIs
by: Zhang, Ling, et al.
Published: (2026)

OD-DETR: Online Distillation for Stabilizing Training of Detection Transformer
by: Wu, Shengjian, et al.
Published: (2024)

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
by: Huang, Jiaqi, et al.
Published: (2025)

A Learnable Color Correction Matrix for RAW Reconstruction
by: Liu, Anqi, et al.
Published: (2024)

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring
by: Mao, Xintian, et al.
Published: (2024)

On Path to Multimodal Generalist: General-Level and General-Bench
by: Fei, Hao, et al.
Published: (2025)

SpineBench: Benchmarking Multimodal LLMs for Spinal Pathology Analysis
by: Zhang, Chenghanyu, et al.
Published: (2025)

Boosting Open-Domain Continual Learning via Leveraging Intra-domain Category-aware Prototype
by: Lu, Yadong, et al.
Published: (2024)

PathAsst: A Generative Foundation AI Assistant Towards Artificial General Intelligence of Pathology
by: Sun, Yuxuan, et al.
Published: (2023)

Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach
by: Yan, Jiebin, et al.
Published: (2026)

Nearshore Underwater Target Detection Meets UAV-borne Hyperspectral Remote Sensing: A Novel Hybrid-level Contrastive Learning Framework and Benchmark Dataset
by: Qi, Jiahao, et al.
Published: (2025)