:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Tao; Zhou, Qing; Li, Yanliang; Wang, Qi
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.04002

Similar Items

CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning
by: Xie, Yuxin, et al.
Published: (2026)

Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations
by: Li, Yizhen, et al.
Published: (2025)

Privacy-Concealing Cooperative Perception for BEV Scene Segmentation
by: Wang, Song, et al.
Published: (2026)

Defending LVLMs Against Vision Attacks through Partial-Perception Supervision
by: Zhou, Qi, et al.
Published: (2024)

SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception
by: Foucard, Louis, et al.
Published: (2024)

Improving Medical Visual Reinforcement Fine-Tuning via Perception and Reasoning Augmentation
by: Yang, Guangjing, et al.
Published: (2026)

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios
by: Wang, Kai, et al.
Published: (2024)

AD-MIR: Bridging the Gap from Perception to Persuasion in Advertising Video Understanding via Structured Reasoning
by: Xu, Binxiao, et al.
Published: (2026)

AFFormer: Adaptive Feature Fusion Transformer for V2X Cooperative Perception under Channel Impairments
by: Zhou, Xi, et al.
Published: (2026)

Towards Long-window Anchoring in Vision-Language Model Distillation
by: Zhou, Haoyi, et al.
Published: (2025)

Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward
by: Xiao, Tong, et al.
Published: (2025)

Novel Extraction of Discriminative Fine-Grained Feature to Improve Retinal Vessel Segmentation
by: Zeng, Shuang, et al.
Published: (2025)

Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
by: Kwon, Jihoon, et al.
Published: (2025)

ARIADNE: A Perception-Reasoning Synergy Framework for Trustworthy Coronary Angiography Analysis
by: Jin, Zhan, et al.
Published: (2026)

Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models
by: Chen, Yan, et al.
Published: (2025)

SATORI-R1: Incentivizing Multimodal Reasoning through Explicit Visual Anchoring
by: Shen, Chuming, et al.
Published: (2025)

UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
by: Liu, Ye, et al.
Published: (2025)

SegLLM: Multi-round Reasoning Segmentation
by: Wang, XuDong, et al.
Published: (2024)

Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception
by: Christoforos, Alexandros, et al.
Published: (2026)

Visually Descriptive Language Model for Vector Graphics Reasoning
by: Wang, Zhenhailong, et al.
Published: (2024)

HRVVS: A High-resolution Video Vasculature Segmentation Network via Hierarchical Autoregressive Residual Priors
by: Yao, Xincheng, et al.
Published: (2025)

Guiding Perception-Reasoning Closer to Human in Blind Image Quality Assessment
by: Li, Yuan, et al.
Published: (2025)

CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving
by: Chen, Shuhang, et al.
Published: (2026)

Volumetric Directional Diffusion: Anchoring Uncertainty Quantification in Anatomical Consensus for Ambiguous Medical Image Segmentation
by: Wu, Chao, et al.
Published: (2026)

Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection
by: Mao, Guoxuan, et al.
Published: (2025)

Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection
by: Li, Wenqiao, et al.
Published: (2025)

RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought
by: Lu, Yi, et al.
Published: (2025)

All in One: Visual-Description-Guided Unified Point Cloud Segmentation
by: Han, Zongyan, et al.
Published: (2025)

Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification
by: Wen, Jiawen, et al.
Published: (2026)

Deep Expert Injection for Anchoring Retinal VLMs with Domain-Specific Knowledge
by: Lu, Shuai, et al.
Published: (2026)

CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning
by: Li, Ming, et al.
Published: (2025)

VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting
by: Lee, Daeun, et al.
Published: (2026)

Perception, Understanding and Reasoning, A Multimodal Benchmark for Video Fake News Detection
by: Yakun, Cui, et al.
Published: (2025)

Gram-Anchored Prompt Learning for Vision-Language Models via Second-Order Statistics
by: Chen, Minglei, et al.
Published: (2026)

Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection
by: Zhang, Yaoteng, et al.
Published: (2026)

XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
by: Wang, Ziyi, et al.
Published: (2024)

SS-ADA: A Semi-Supervised Active Domain Adaptation Framework for Semantic Segmentation
by: Yan, Weihao, et al.
Published: (2024)

DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model
by: Tao, Zhou, et al.
Published: (2025)

Evolving, Not Training: Zero-Shot Reasoning Segmentation via Evolutionary Prompting
by: Ye, Kai, et al.
Published: (2025)

From Noisy Labels to Intrinsic Structure: A Geometric-Structural Dual-Guided Framework for Noise-Robust Medical Image Segmentation
by: Wang, Tao, et al.
Published: (2025)