| Main Authors: | Yang, Tao, Zhou, Qing, Li, Yanliang, Wang, Qi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.04002 |
Similar Items
CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning
by: Xie, Yuxin, et al.
Published: (2026)
Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations
by: Li, Yizhen, et al.
Published: (2025)
Privacy-Concealing Cooperative Perception for BEV Scene Segmentation
by: Wang, Song, et al.
Published: (2026)
Defending LVLMs Against Vision Attacks through Partial-Perception Supervision
by: Zhou, Qi, et al.
Published: (2024)
SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception
by: Foucard, Louis, et al.
Published: (2024)
Improving Medical Visual Reinforcement Fine-Tuning via Perception and Reasoning Augmentation
by: Yang, Guangjing, et al.
Published: (2026)
Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios
by: Wang, Kai, et al.
Published: (2024)
AD-MIR: Bridging the Gap from Perception to Persuasion in Advertising Video Understanding via Structured Reasoning
by: Xu, Binxiao, et al.
Published: (2026)
AFFormer: Adaptive Feature Fusion Transformer for V2X Cooperative Perception under Channel Impairments
by: Zhou, Xi, et al.
Published: (2026)
Towards Long-window Anchoring in Vision-Language Model Distillation
by: Zhou, Haoyi, et al.
Published: (2025)
Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward
by: Xiao, Tong, et al.
Published: (2025)
Novel Extraction of Discriminative Fine-Grained Feature to Improve Retinal Vessel Segmentation
by: Zeng, Shuang, et al.
Published: (2025)
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
by: Kwon, Jihoon, et al.
Published: (2025)
ARIADNE: A Perception-Reasoning Synergy Framework for Trustworthy Coronary Angiography Analysis
by: Jin, Zhan, et al.
Published: (2026)
Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models
by: Chen, Yan, et al.
Published: (2025)
SATORI-R1: Incentivizing Multimodal Reasoning through Explicit Visual Anchoring
by: Shen, Chuming, et al.
Published: (2025)
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
by: Liu, Ye, et al.
Published: (2025)
SegLLM: Multi-round Reasoning Segmentation
by: Wang, XuDong, et al.
Published: (2024)
Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception
by: Christoforos, Alexandros, et al.
Published: (2026)
Visually Descriptive Language Model for Vector Graphics Reasoning
by: Wang, Zhenhailong, et al.
Published: (2024)
HRVVS: A High-resolution Video Vasculature Segmentation Network via Hierarchical Autoregressive Residual Priors
by: Yao, Xincheng, et al.
Published: (2025)
Guiding Perception-Reasoning Closer to Human in Blind Image Quality Assessment
by: Li, Yuan, et al.
Published: (2025)
CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving
by: Chen, Shuhang, et al.
Published: (2026)
Volumetric Directional Diffusion: Anchoring Uncertainty Quantification in Anatomical Consensus for Ambiguous Medical Image Segmentation
by: Wu, Chao, et al.
Published: (2026)
Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection
by: Mao, Guoxuan, et al.
Published: (2025)
Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection
by: Li, Wenqiao, et al.
Published: (2025)
RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought
by: Lu, Yi, et al.
Published: (2025)
All in One: Visual-Description-Guided Unified Point Cloud Segmentation
by: Han, Zongyan, et al.
Published: (2025)
Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification
by: Wen, Jiawen, et al.
Published: (2026)
Deep Expert Injection for Anchoring Retinal VLMs with Domain-Specific Knowledge
by: Lu, Shuai, et al.
Published: (2026)
CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning
by: Li, Ming, et al.
Published: (2025)
VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting
by: Lee, Daeun, et al.
Published: (2026)
Perception, Understanding and Reasoning, A Multimodal Benchmark for Video Fake News Detection
by: Yakun, Cui, et al.
Published: (2025)
Gram-Anchored Prompt Learning for Vision-Language Models via Second-Order Statistics
by: Chen, Minglei, et al.
Published: (2026)
Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection
by: Zhang, Yaoteng, et al.
Published: (2026)
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
by: Wang, Ziyi, et al.
Published: (2024)
SS-ADA: A Semi-Supervised Active Domain Adaptation Framework for Semantic Segmentation
by: Yan, Weihao, et al.
Published: (2024)
DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model
by: Tao, Zhou, et al.
Published: (2025)
Evolving, Not Training: Zero-Shot Reasoning Segmentation via Evolutionary Prompting
by: Ye, Kai, et al.
Published: (2025)
From Noisy Labels to Intrinsic Structure: A Geometric-Structural Dual-Guided Framework for Noise-Robust Medical Image Segmentation
by: Wang, Tao, et al.
Published: (2025)