| Main Authors: | Xu, Chen, Nguyen, Tony Khuong, Dixon, Emma, Rodriguez, Christopher, Miller, Patrick, Lee, Robert, Shah, Paarth, Ambrus, Rares, Nishimura, Haruki, Itkina, Masha |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.08558 |
Similar Items
Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping
by: Snyder, David, et al.
Published: (2025)
SAFE: Multitask Failure Detection for Vision-Language-Action Models
by: Gu, Qiao, et al.
Published: (2025)
How Generalizable Is My Behavior Cloning Policy? A Statistical Approach to Trustworthy Performance Evaluation
by: Vincent, Joseph A., et al.
Published: (2024)
Impact of Different Failures on a Robot's Perceived Reliability
by: Violette, Andrew, et al.
Published: (2026)
Video Generators are Robot Policies
by: Liang, Junbang, et al.
Published: (2025)
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
by: Goli, Hossein, et al.
Published: (2025)
Beyond Binary Success: Sample-Efficient and Statistically Rigorous Robot Policy Comparison
by: Snyder, David, et al.
Published: (2026)
Reliably Detecting Model Failures in Deployment Without Labels
by: Nguyen, Viet, et al.
Published: (2025)
Failure Prediction at Runtime for Generative Robot Policies
by: Römer, Ralf, et al.
Published: (2025)
Typicalness-Aware Learning for Failure Detection
by: Liu, Yijun, et al.
Published: (2024)
Runtime Failure Hunting for Physics Engine Based Software Systems: How Far Can We Go?
by: Li, Shuqing, et al.
Published: (2025)
Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning
by: Huang, Kevin, et al.
Published: (2025)
Rewind-IL: Online Failure Detection and State Respawning for Imitation Learning
by: Zheng, Gehan, et al.
Published: (2026)
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
by: Agia, Christopher, et al.
Published: (2024)
CUPID: Curating Data your Robot Loves with Influence Functions
by: Agia, Christopher, et al.
Published: (2025)
Interpretable Failure Detection with Human-Level Concepts
by: Nguyen, Kien X., et al.
Published: (2025)
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
by: Dai, Yinpei, et al.
Published: (2024)
Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection
by: Sbeyti, Moussa Kassem, et al.
Published: (2024)
Chapter 12 Economising Failure and Assembling a Failure Regime
by: Kurunmäki, Liisa, et al.
Published: (2023)
Near-Miss: Latent Policy Failure Detection in Agentic Workflows
by: Rabinovich, Ella, et al.
Published: (2026)
Bank Runs With and Without Bank Failure
by: Correia, Sergio, et al.
Published: (2026)
RuntimeSlicer: Towards Generalizable Unified Runtime State Representation for Failure Management
by: Zhang, Lingzhe, et al.
Published: (2026)
Uncertainties of Failure in Bending Experiments
by: Zhang, Silu, et al.
Published: (2026)
Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners
by: Zheng, Zhi, et al.
Published: (2024)
Failure Detection for Pinching-Antenna Systems
by: Ouyang, Chongjun, et al.
Published: (2026)
The Failure of Plagiarism Detection in Competitive Programming
by: Dickey, Ethan
Published: (2025)
LOPR: Latent Occupancy PRediction using Generative Models
by: Lange, Bernard, et al.
Published: (2022)
Silent Failures in Stateless Systems: Rethinking Anomaly Detection for Serverless Computing
by: Nguyen, Chanh, et al.
Published: (2025)
Low-Resource Safety Failures Are Action Failures, Not Representation Failures
by: Aziz, Rashad, et al.
Published: (2026)
The Impact of Class Uncertainty Propagation in Perception-Based Motion Planning
by: Shah, Jibran Iqbal, et al.
Published: (2026)
FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair
by: Ma, Ruize, et al.
Published: (2026)
DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving
by: Dagdanov, Resul, et al.
Published: (2022)
Log-Anomaly-Detection-and-Failure-Prediction-System
by: Chauhan, Vaibhav
Published: (2026)
Adaptive Confidence Regularization for Multimodal Failure Detection
by: Liu, Moru, et al.
Published: (2026)
Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring
by: Park, Seongheon, et al.
Published: (2026)
Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents
by: Meng, M.
Published: (2026)
Failure Identification in Imitation Learning Via Statistical and Semantic Filtering
by: Rolland, Quentin, et al.
Published: (2026)
Coherent Without Grounding, Grounded Without Success: Observability and Epistemic Failure
by: Sartori, Camilo Chacón
Published: (2026)
Distributed Empirical Likelihood Inference With or Without Byzantine Failures
by: Wang, Qihua, et al.
Published: (2024)
Early Failure Detection in Autonomous Surgical Soft-Tissue Manipulation via Uncertainty Quantification
by: Thompson, Jordan, et al.
Published: (2025)