:: Library Catalog

Bibliographic Details
Main Authors:	Sriram, Ananth; Mokaria, Neel; Singh, Rajveer
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.19869

Similar Items

Safety Assessment of Scaffolding on Construction Site using AI
by: Prabhu, Sameer, et al.
Published: (2025)

VLM-R$^3$: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
by: Jiang, Chaoya, et al.
Published: (2025)

MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations
by: Wu, Jiang, et al.
Published: (2025)

PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario
by: Mandalika, Sriram, et al.
Published: (2025)

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
by: Wei, Tong, et al.
Published: (2025)

Imitation Game for Adversarial Disillusion with Chain-of-Thought Reasoning in Generative AI
by: Chang, Ching-Chun, et al.
Published: (2025)

Construction Site Scaffolding Completeness Detection Based on Mask R-CNN and Hough Transform
by: Lin, Pei-Hsin, et al.
Published: (2025)

An Organic Weed Control Prototype using Directed Energy and Deep Learning
by: Cao, Deng, et al.
Published: (2024)

Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting
by: Anisha, Alabi Mehzabin, et al.
Published: (2026)

Visual Self-Fulfilling Alignment: Shaping Safety-Oriented Personas via Threat-Related Images
by: Yang, Qishun, et al.
Published: (2026)

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
by: Singh, Aditya Kumar, et al.
Published: (2026)

Better Eyes, Better Thoughts: Why Vision Chain-of-Thought Fails in Medicine
by: Wu, Yuan, et al.
Published: (2026)

RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought
by: Lu, Yi, et al.
Published: (2025)

VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
by: Lim, Byeonggeuk, et al.
Published: (2026)

Reinforcing Structured Chain-of-Thought for Video Understanding
by: Wang, Peiyao, et al.
Published: (2026)

Controllable Navigation Instruction Generation with Chain of Thought Prompting
by: Kong, Xianghao, et al.
Published: (2024)

Interleaved-Modal Chain-of-Thought
by: Gao, Jun, et al.
Published: (2024)

Enhancing Construction Site Safety: A Lightweight Convolutional Network for Effective Helmet Detection
by: Alif, Mujadded Al Rabbani
Published: (2024)

Multimodal Chain-of-Thought Reasoning in Language Models
by: Zhang, Zhuosheng, et al.
Published: (2023)

VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought
by: Sarch, Gabriel, et al.
Published: (2024)

Let's Think with Images Efficiently! An Interleaved-Modal Chain-of-Thought Reasoning Framework with Dynamic and Precise Visual Thoughts
by: Liu, Xu, et al.
Published: (2026)

GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
by: Yerramilli, Sahiti, et al.
Published: (2025)

Can Segmentation Models Understand the World? Towards Proactive Affordance Reasoning via Visual Chain-of-Thought
by: Guo, Yuchen, et al.
Published: (2026)

VISTA: Mitigating Semantic Inertia in Video-LLMs via Training-Free Dynamic Chain-of-Thought Routing
by: Jin, Hongbo, et al.
Published: (2025)

MultiMorph: On-demand Atlas Construction
by: Abulnaga, S. Mazdak, et al.
Published: (2025)

VLM-Guard: Safeguarding Vision-Language Models via Fulfilling Safety Alignment Gap
by: Liu, Qin, et al.
Published: (2025)

Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning
by: Deng, Linger, et al.
Published: (2024)

Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization
by: Du, Yifan, et al.
Published: (2025)

Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs
by: Kancheti, Sai Srinivas, et al.
Published: (2026)

Thought-For-Food: Reasoning Chain Induced Food Visual Question Answering
by: Jain, Riddhi, et al.
Published: (2025)

Few-Shot VLM-Based G-Code and HMI Verification in CNC Machining
by: Pour, Yasaman Hashem, et al.
Published: (2025)

Improving Chain-of-Thought Efficiency for Autoregressive Image Generation
by: Gu, Zeqi, et al.
Published: (2025)

VLM-AutoDrive: Post-Training Vision-Language Models for Safety-Critical Autonomous Driving Events
by: Bhat, Mohammad Qazim, et al.
Published: (2026)

PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment
by: Liu, Zhendong, et al.
Published: (2024)

Beyond Static Visual Tokens: Structured Sequential Visual Chain-of-Thought Reasoning
by: Guo, Guangfu, et al.
Published: (2026)

CoRGI: Verified Chain-of-Thought Reasoning with Post-hoc Visual Grounding
by: Yi, Shixin, et al.
Published: (2025)

SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes
by: Linghu, Xiongkun, et al.
Published: (2025)

Diversity Over Frequency: Rethinking Tool Use in Visual Chain-of-Thought Agents
by: Kim, Dong-Hee, et al.
Published: (2026)

Retrieval-Based Interleaved Visual Chain-of-Thought in Real-World Driving Scenarios
by: Corbière, Charles, et al.
Published: (2025)

Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
by: Chen, Lei, et al.
Published: (2025)