| Main Authors: | Sriram, Ananth; Mokaria, Neel; Singh, Rajveer |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.19869 |
Similar Items
Safety Assessment of Scaffolding on Construction Site using AI
by: Prabhu, Sameer, et al.
Published: (2025)
VLM-R$^3$: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
by: Jiang, Chaoya, et al.
Published: (2025)
MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations
by: Wu, Jiang, et al.
Published: (2025)
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario
by: Mandalika, Sriram, et al.
Published: (2025)
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
by: Wei, Tong, et al.
Published: (2025)
Imitation Game for Adversarial Disillusion with Chain-of-Thought Reasoning in Generative AI
by: Chang, Ching-Chun, et al.
Published: (2025)
Construction Site Scaffolding Completeness Detection Based on Mask R-CNN and Hough Transform
by: Lin, Pei-Hsin, et al.
Published: (2025)
An Organic Weed Control Prototype using Directed Energy and Deep Learning
by: Cao, Deng, et al.
Published: (2024)
Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting
by: Anisha, Alabi Mehzabin, et al.
Published: (2026)
Visual Self-Fulfilling Alignment: Shaping Safety-Oriented Personas via Threat-Related Images
by: Yang, Qishun, et al.
Published: (2026)
DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
by: Singh, Aditya Kumar, et al.
Published: (2026)
Better Eyes, Better Thoughts: Why Vision Chain-of-Thought Fails in Medicine
by: Wu, Yuan, et al.
Published: (2026)
RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought
by: Lu, Yi, et al.
Published: (2025)
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
by: Lim, Byeonggeuk, et al.
Published: (2026)
Reinforcing Structured Chain-of-Thought for Video Understanding
by: Wang, Peiyao, et al.
Published: (2026)
Controllable Navigation Instruction Generation with Chain of Thought Prompting
by: Kong, Xianghao, et al.
Published: (2024)
Interleaved-Modal Chain-of-Thought
by: Gao, Jun, et al.
Published: (2024)
Enhancing Construction Site Safety: A Lightweight Convolutional Network for Effective Helmet Detection
by: Alif, Mujadded Al Rabbani
Published: (2024)
Multimodal Chain-of-Thought Reasoning in Language Models
by: Zhang, Zhuosheng, et al.
Published: (2023)
VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought
by: Sarch, Gabriel, et al.
Published: (2024)
Let's Think with Images Efficiently! An Interleaved-Modal Chain-of-Thought Reasoning Framework with Dynamic and Precise Visual Thoughts
by: Liu, Xu, et al.
Published: (2026)
GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
by: Yerramilli, Sahiti, et al.
Published: (2025)
Can Segmentation Models Understand the World? Towards Proactive Affordance Reasoning via Visual Chain-of-Thought
by: Guo, Yuchen, et al.
Published: (2026)
VISTA: Mitigating Semantic Inertia in Video-LLMs via Training-Free Dynamic Chain-of-Thought Routing
by: Jin, Hongbo, et al.
Published: (2025)
MultiMorph: On-demand Atlas Construction
by: Abulnaga, S. Mazdak, et al.
Published: (2025)
VLM-Guard: Safeguarding Vision-Language Models via Fulfilling Safety Alignment Gap
by: Liu, Qin, et al.
Published: (2025)
Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning
by: Deng, Linger, et al.
Published: (2024)
Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization
by: Du, Yifan, et al.
Published: (2025)
Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs
by: Kancheti, Sai Srinivas, et al.
Published: (2026)
Thought-For-Food: Reasoning Chain Induced Food Visual Question Answering
by: Jain, Riddhi, et al.
Published: (2025)
Few-Shot VLM-Based G-Code and HMI Verification in CNC Machining
by: Pour, Yasaman Hashem, et al.
Published: (2025)
Improving Chain-of-Thought Efficiency for Autoregressive Image Generation
by: Gu, Zeqi, et al.
Published: (2025)
VLM-AutoDrive: Post-Training Vision-Language Models for Safety-Critical Autonomous Driving Events
by: Bhat, Mohammad Qazim, et al.
Published: (2026)
PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment
by: Liu, Zhendong, et al.
Published: (2024)
Beyond Static Visual Tokens: Structured Sequential Visual Chain-of-Thought Reasoning
by: Guo, Guangfu, et al.
Published: (2026)
CoRGI: Verified Chain-of-Thought Reasoning with Post-hoc Visual Grounding
by: Yi, Shixin, et al.
Published: (2025)
SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes
by: Linghu, Xiongkun, et al.
Published: (2025)
Diversity Over Frequency: Rethinking Tool Use in Visual Chain-of-Thought Agents
by: Kim, Dong-Hee, et al.
Published: (2026)
Retrieval-Based Interleaved Visual Chain-of-Thought in Real-World Driving Scenarios
by: Corbière, Charles, et al.
Published: (2025)
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
by: Chen, Lei, et al.
Published: (2025)