Library Catalog

Bibliographic Details
Main Authors:	Schwarzer, Will; Niekum, Scott
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.15134

Similar Items

Evaluation-Aware Reinforcement Learning
by: Deshmukh, Shripad Vilasrao, et al.
Published: (2025)

Supervised Reward Inference
by: Schwarzer, Will, et al.
Published: (2025)

Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints
by: Chittepu, Yaswanth, et al.
Published: (2025)

Bayesian Robust Optimization for Imitation Learning
by: Brown, Daniel S., et al.
Published: (2020)

Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models
by: Jajoo, Pranaya, et al.
Published: (2026)

On the Benefits of Inducing Local Lipschitzness for Robust Generative Adversarial Imitation Learning
by: Memarian, Farzan, et al.
Published: (2021)

Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation
by: Tripathi, Tuhina, et al.
Published: (2025)

Safe RLHF Beyond Expectation: Stochastic Dominance for Universal Spectral Risk Control
by: Chittepu, Yaswanth, et al.
Published: (2026)

Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
by: Sikchi, Harshit, et al.
Published: (2023)

A Dual Approach to Imitation Learning from Observations with Offline Datasets
by: Sikchi, Harshit, et al.
Published: (2024)

Adaptive Margin RLHF via Preference over Preferences
by: Chittepu, Yaswanth, et al.
Published: (2025)

Pareto-Optimal Learning from Preferences with Hidden Context
by: Bahlous-Boldi, Ryan, et al.
Published: (2024)

An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
by: Xu, Haoran, et al.
Published: (2025)

SMORE: Score Models for Offline Goal-Conditioned Reinforcement Learning
by: Sikchi, Harshit, et al.
Published: (2023)

A Descriptive and Normative Theory of Human Beliefs in RLHF
by: Dandekar, Sylee, et al.
Published: (2025)

Quantile Activation: Correcting a Failure Mode of ML Models
by: Challa, Aditya, et al.
Published: (2024)

Learning Action-based Representations Using Invariance
by: Rudolph, Max, et al.
Published: (2024)

Automated Discovery of Functional Actual Causes in Complex Environments
by: Chuck, Caleb, et al.
Published: (2024)

Impact of ML Optimization Tactics on Greener Pre-Trained ML Models
by: Álvarez, Alexandra González, et al.
Published: (2024)

Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
by: Chuck, Caleb, et al.
Published: (2025)

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
by: Rafailov, Rafael, et al.
Published: (2024)

Fast Adaptation with Behavioral Foundation Models
by: Sikchi, Harshit, et al.
Published: (2025)

Contrastive Preference Learning: Learning from Human Feedback without RL
by: Hejna, Joey, et al.
Published: (2023)

SpinML: Customized Synthetic Data Generation for Private Training of Specialized ML Models
by: Zhang, Jiang, et al.
Published: (2025)

SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions
by: Wang, Zizhao, et al.
Published: (2024)

Are Deep Speech Denoising Models Robust to Adversarial Noise?
by: Schwarzer, Will, et al.
Published: (2025)

Chronic Diseases Prediction Using ML
by: Mulakala, Sri Varsha, et al.
Published: (2025)

ML Inference Scheduling with Predictable Latency
by: Zhao, Haidong, et al.
Published: (2025)

Optimizing ML Training with Metagradient Descent
by: Engstrom, Logan, et al.
Published: (2025)

Bayesian Joint Model of Multi-Sensor and Failure Event Data for Multi-Mode Failure Prediction
by: Fard, Sina Aghaee Dabaghan, et al.
Published: (2025)

Fault Tolerant ML: Efficient Meta-Aggregation and Synchronous Training
by: Dahan, Tehila, et al.
Published: (2024)

Defining error accumulation in ML atmospheric simulators
by: Parthipan, Raghul, et al.
Published: (2024)

Chunky Post-Training: Data Driven Failures of Generalization
by: Murray, Seoirse, et al.
Published: (2026)

SnatchML: Hijacking ML models without Training Access
by: Ghorbel, Mahmoud, et al.
Published: (2024)

RLZero: Direct Policy Inference from Language Without In-Domain Supervision
by: Sikchi, Harshit, et al.
Published: (2024)

Development and Deployment of Hybrid ML Models for Critical Heat Flux Prediction in Annulus Geometries
by: Furlong, Aidan, et al.
Published: (2025)

CubicML: Automated ML for Large ML Systems Co-design with ML Prediction of Performance
by: Wen, Wei, et al.
Published: (2024)

Predicting Cascading Failures with a Hyperparametric Diffusion Model
by: Xiang, Bin, et al.
Published: (2024)

Reducing Hyperparameter Tuning Costs in ML, Vision and Language Model Training Pipelines via Memoization-Awareness
by: Essofi, Abdelmajid, et al.
Published: (2024)

A Frugal Model for Accurate Early Student Failure Prediction
by: Gagaoua, Ikram, et al.
Published: (2025)