| Main Authors: | Li, Yang; Chen, Xing; Liu, Yutao; Qi, Gege; BI, Yanxian; Wang, Zizhe; Zhang, Yunjian; Zhu, Yao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.09337 |
Similar Items
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
by: Wang, Sudong, et al.
Published: (2025)
SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
by: Xu, Peiran, et al.
Published: (2025)
RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing
by: Wang, Chenhao, et al.
Published: (2025)
SOPSeg: Prompt-based Small Object Instance Segmentation in Remote Sensing Imagery
by: Wang, Chenhao, et al.
Published: (2025)
An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques
by: Li, Chunxiao, et al.
Published: (2024)
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making
by: Jiang, Songtao, et al.
Published: (2025)
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection
by: Zhu, Jiaqi, et al.
Published: (2024)
Think 360°: Evaluating the Width-centric Reasoning Capability of MLLMs Beyond Depth
by: Chen, Mingrui, et al.
Published: (2026)
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
by: Miao, Boming, et al.
Published: (2024)
Exploring Decision-Making Capabilities of LLM Agents: An Experimental Study on Jump-Jump Game
by: Li, Juwu
Published: (2025)
Sliding-Window Merging for Compacting Patch-Redundant Layers in LLMs
by: Ding, Xuan, et al.
Published: (2025)
Angle of Arrival Estimation with Transformer: A Sparse and Gridless Method with Zero-Shot Capability
by: Zhu, Zhaoxuan, et al.
Published: (2024)
AdvLogo: Adversarial Patch Attack against Object Detectors based on Diffusion Models
by: Miao, Boming, et al.
Published: (2024)
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments
by: Zhou, Qinhong, et al.
Published: (2024)
Strat-Reasoner: Reinforcing Strategic Reasoning of LLMs in Multi-Agent Games
by: He, Yidong, et al.
Published: (2026)
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
by: Shangguan, Ziyao, et al.
Published: (2024)
Unlocking Multilingual Reasoning Capability of LLMs and LVLMs through Representation Engineering
by: Li, Qiming, et al.
Published: (2025)
D2-Mamba: Dual-Scale Fusion and Dual-Path Scanning with SSMs for Shadow Removal
by: Li, Linhao, et al.
Published: (2025)
Making Large Language Models Better Planners with Reasoning-Decision Alignment
by: Huang, Zhijian, et al.
Published: (2024)
Unsupervised Domain Adaptive Lane Detection via Contextual Contrast and Aggregation
by: Zhou, Kunyang, et al.
Published: (2024)
Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation
by: Kuprashevich, Maksim, et al.
Published: (2024)
UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making
by: Feng, Qianhan, et al.
Published: (2025)
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
by: Qiao, Yanyuan, et al.
Published: (2024)
RemoteZero: Geospatial Reasoning with Zero Human Annotations
by: Yao, Liang, et al.
Published: (2026)
Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs
by: Kancheti, Sai Srinivas, et al.
Published: (2026)
V-Zero: Self-Improving Multimodal Reasoning with Zero Annotation
by: Wang, Han, et al.
Published: (2026)
DoubleTake: Contrastive Reasoning for Faithful Decision-Making in Medical Imaging
by: Patel, Daivik, et al.
Published: (2026)
Decoding Decision Reasoning: A Counterfactual-Powered Model for Knowledge Discovery
by: Fang, Yingying, et al.
Published: (2024)
AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs
by: Wang, Diwei, et al.
Published: (2025)
ShredBench: Evaluating the Semantic Reasoning Capabilities of Multimodal LLMs in Document Reconstruction
by: Guo, Zichun, et al.
Published: (2026)
Quantum Conflict Measurement in Decision Making for Out-of-Distribution Detection
by: Dong, Yilin, et al.
Published: (2025)
Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios
by: Li, Chunxiao, et al.
Published: (2025)
CEOs, Information, and Decision Making: Scanning the Environment for Strategic Advantage.
by: Auster, Ethel, et al.
Published: (1994)
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
by: Yang, Zhihe, et al.
Published: (2025)
SGDM: Static-Guided Dynamic Module Make Stronger Visual Models
by: Xing, Wenjie, et al.
Published: (2024)
RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather
by: Wang, Yuran, et al.
Published: (2025)
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
by: Qiao, Yuxuan, et al.
Published: (2024)
Object Navigation with Structure-Semantic Reasoning-Based Multi-level Map and Multimodal Decision-Making LLM
by: Yan, Chongshang, et al.
Published: (2025)
Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments
by: Taourirte, Aya, et al.
Published: (2025)
Video-MSR: Benchmarking Multi-hop Spatial Reasoning Capabilities of MLLMs
by: Zhu, Rui, et al.
Published: (2026)