| Main Author: | Zhou, Yunpeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.25075 |
Similar Items
Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents
by: Zhou, Yunpeng
Published: (2026)
ClinCoT: Clinical-Aware Visual Chain-of-Thought for Medical Vision Language Models
by: Liu, Xiwei, et al.
Published: (2026)
A Resource-Rational Principle for Modeling Visual Attention Control
by: Bai, Yunpeng
Published: (2026)
Counting Circuits: Mechanistic Interpretability of Visual Reasoning in Large Vision-Language Models
by: Che, Liwei, et al.
Published: (2026)
Language Model Circuits Are Sparse in the Neuron Basis
by: Arora, Aryaman, et al.
Published: (2026)
Balanced Thinking: Improving Chain of Thought Training in Vision Language Models
by: Perek, Shaked, et al.
Published: (2026)
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
by: Zhao, Qingqing, et al.
Published: (2025)
Moving Pictures of Thought: Extracting Visual Knowledge in Charles S. Peirce's Manuscripts with Vision-Language Models
by: Pedretti, Carlo Teo, et al.
Published: (2025)
Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models
by: Zhou, Qiji, et al.
Published: (2024)
VCoT-Grasp: Grasp Foundation Models with Visual Chain-of-Thought Reasoning for Language-driven Grasp Generation
by: Zhang, Haoran, et al.
Published: (2025)
AgroCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture
by: Wen, Yibin, et al.
Published: (2025)
Structural Instability of Feature Composition
by: Zhou, Yunpeng
Published: (2026)
ImgCoT: Compressing Long Chain of Thought into Compact Visual Tokens for Efficient Reasoning of Large Language Model
by: Chen, Xiaoshu, et al.
Published: (2026)
Revis: Sparse Latent Steering to Mitigate Object Hallucination in Large Vision-Language Models
by: Wu, Jialin, et al.
Published: (2026)
Sparse Probabilistic Graph Circuits
by: Rektoris, Martin, et al.
Published: (2025)
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
by: Marks, Samuel, et al.
Published: (2024)
CoT-VLM4Tar: Chain-of-Thought Guided Vision-Language Models for Traffic Anomaly Resolution
by: Ren, Tianchi, et al.
Published: (2025)
GeoThought: A Dataset for Enhancing Mathematical Geometry Reasoning in Vision-Language Models
by: Shi, Nannan, et al.
Published: (2025)
VisualScratchpad: Inference-time Visual Concepts Analysis in Vision Language Models
by: Lim, Hyesu, et al.
Published: (2026)
Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models
by: Pang, Rock Yuren, et al.
Published: (2025)
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models
by: Tian, Xinyu, et al.
Published: (2025)
Visual Distraction Undermines Moral Reasoning in Vision-Language Models
by: Yang, Xinyi, et al.
Published: (2026)
Pixelis: Reasoning in Pixels, from Seeing to Acting
by: Zhou, Yunpeng
Published: (2026)
ForgeVLA: Federated Vision-Language-Action Learning without Language Annotations
by: Zhou, Yuhao, et al.
Published: (2026)
Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
by: Gao, Yunpeng, et al.
Published: (2024)
Joint Reward Modeling: Internalizing Chain-of-Thought for Efficient Visual Reward Models
by: Yang, Yankai, et al.
Published: (2026)
Detecting Unfaithful Chain-of-Thought via Circuit-Guided Internal-External Discrepancy
by: Shen, Xu, et al.
Published: (2026)
How does Chain of Thought Think? Mechanistic Interpretability of Chain-of-Thought Reasoning with Sparse Autoencoding
by: Chen, Xi, et al.
Published: (2025)
Modeling Language as a Sequence of Thoughts
by: Borazjanizadeh, Nasim, et al.
Published: (2025)
Cross-Modal Attention Analysis and Optimization in Vision-Language Models: A Study on Visual Reliability
by: Zhou, Lijie
Published: (2026)
Event-Grounded Sparse Autoencoders for Vision-Language-Action Policies
by: Jin, Xinchen, et al.
Published: (2026)
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
by: Ye, Jiacheng, et al.
Published: (2024)
CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt
by: Islam, Md Touhidul, et al.
Published: (2026)
Synthesizing Visual Concepts as Vision-Language Programs
by: Wüst, Antonia, et al.
Published: (2025)
SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders
by: Li, Qing, et al.
Published: (2025)
UrbanVLA: A Vision-Language-Action Model for Urban Micromobility
by: Li, Anqi, et al.
Published: (2025)
Latent Chain-of-Thought for Visual Reasoning
by: Sun, Guohao, et al.
Published: (2025)
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
by: Pach, Mateusz, et al.
Published: (2025)
Thought of Search: Planning with Language Models Through The Lens of Efficiency
by: Katz, Michael, et al.
Published: (2024)
Chain-of-Thought in Large Language Models: Decoding, Projection, and Activation
by: Yang, Hao, et al.
Published: (2024)