:: Library Catalog

Bibliographic Details
Main Authors:	Bhattacharyya, Apratim; Xu, Bicheng; Haresh, Sanjay; Pourreza, Reza; Liu, Litian; Panchal, Sunny; Madan, Pulkit; Sigal, Leonid; Memisevic, Roland
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.21998

Similar Items

Look, Remember and Reason: Grounded reasoning in videos with language models
by: Bhattacharyya, Apratim, et al.
Published: (2023)

Enhancing Hallucination Detection through Noise Injection
by: Liu, Litian, et al.
Published: (2025)

Can Vision-Language Models Answer Face to Face Questions in the Real-World?
by: Pourreza, Reza, et al.
Published: (2025)

Notes-to-Self: Scratchpad Augmented VLAs for Memory Dependent Manipulation Tasks
by: Haresh, Sanjay, et al.
Published: (2026)

ClevrSkills: Compositional Language and Visual Reasoning in Robotics
by: Haresh, Sanjay, et al.
Published: (2024)

What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction
by: Panchal, Sunny, et al.
Published: (2024)

Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers
by: Ebrahimi, MohammadReza, et al.
Published: (2024)

From Out-of-Distribution Detection to Hallucination Detection: A Geometric View
by: Liu, Litian, et al.
Published: (2026)

Delayed Attention Training Improves Length Generalization in Transformer–RNN Hybrids
by: Phan, Buu, et al.
Published: (2025)

On the "Induction Bias" in Sequence Models
by: Ebrahimi, M. Reza, et al.
Published: (2026)

Revisiting Bi-Linear State Transitions in Recurrent Neural Networks
by: Ebrahimi, M. Reza, et al.
Published: (2025)

Joint Generative Modeling of Grounded Scene Graphs and Images via Diffusion Models
by: Xu, Bicheng, et al.
Published: (2024)

Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
by: Khisti, Ashish, et al.
Published: (2024)

Distilling Multi-modal Large Language Models for Autonomous Driving
by: Hegde, Deepti, et al.
Published: (2025)

Do LLMs Benefit from User and Item Embeddings in Recommendation Tasks?
by: Hossain, Mir Rayat Imtiaz, et al.
Published: (2026)

Replacing thinking with tool usage enables reasoning in small language models
by: Rainone, Corrado, et al.
Published: (2025)

Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance
by: Hayakawa, Akio, et al.
Published: (2025)

Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
by: Menon, Sachit, et al.
Published: (2024)

StepTool: Enhancing Multi-Step Tool Usage in LLMs via Step-Grained Reinforcement Learning
by: Yu, Yuanqing, et al.
Published: (2024)

MMFactory: A Universal Solution Search Engine for Vision-Language Tasks
by: Fan, Wan-Cyuan, et al.
Published: (2024)

Self-Evaluating LLMs for Multi-Step Tasks: Stepwise Confidence Estimation for Failure Detection
by: Mavi, Vaibhav, et al.
Published: (2025)

One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models
by: Cameron, Chris, et al.
Published: (2026)

Can We Verify Step by Step for Incorrect Answer Detection?
by: Xu, Xin, et al.
Published: (2024)

Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning
by: Mishra, Venkatesh, et al.
Published: (2025)

Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
by: Cao, Lang, et al.
Published: (2024)

One Step at a Time: Combining LLMs and Static Analysis to Generate Next-Step Hints for Programming Tasks
by: Birillo, Anastasiia, et al.
Published: (2024)

Transformers Can Navigate Mazes With Multi-Step Prediction
by: Nolte, Niklas, et al.
Published: (2024)

Live API-Bench: 2500+ Live APIs for Testing Multi-Step Tool Calling
by: Elder, Benjamin, et al.
Published: (2025)

Can Eccentric Binary Black Hole Signals Mimic Gravitational-Wave Microlensing?
by: Mishra, Anuj, et al.
Published: (2025)

Service Oriented Computing in Practice - An Agenda for Research into the Factors Influencing the Organizational Adoption of Service Oriented Architectures
by: Haresh Luthria
Published: (2009)

Culture Roots Vs. Modernity in Mahesh Elkunchwar’s Old Stone Mansion
by: Haresh Kakde
Published: (2017)

Aligning Robot Navigation Behaviors with Human Intentions and Preferences
by: Karnan, Haresh
Published: (2024)

Deconfinement to confinement by generalizing BRST symmetry on the sphere
by: Raval, Haresh
Published: (2024)

Step-by-Step Guidance to Differential Anemia Diagnosis with Real-World Data and Deep Reinforcement Learning
by: Muyama, Lillian, et al.
Published: (2024)

RoCA: Robust Cross-Domain End-to-End Autonomous Driving
by: Yasarla, Rajeev, et al.
Published: (2025)

Do LLMs Align with My Task? Evaluating Text-to-SQL via Dataset Alignment
by: Rafiei, Davood, et al.
Published: (2025)

Long-term evolution of spin and other properties of neutron star low-mass X-ray binaries: implications for millisecond X-ray pulsars
by: Kar, Abhijnan, et al.
Published: (2024)

Long-term evolution of Sco X-1: implications for the current spin frequency and ellipticity of the neutron star
by: Kar, Abhijnan, et al.
Published: (2025)

RAISE: Enhancing Scientific Reasoning in LLMs via Step-by-Step Retrieval
by: Oh, Minhae, et al.
Published: (2025)

Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers
by: Cao, Jin, et al.
Published: (2024)