:: Library Catalog

Bibliographic Details
Main Authors:	Sahu, Sharan; Wells, Martin T.
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2509.19104

Similar Items

On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization
by: Sahu, Sharan, et al.
Published: (2026)

Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization
by: Sahu, Sharan, et al.
Published: (2026)

Towards Optimal Differentially Private Regret Bounds in Linear MDPs
by: Sahu, Sharan
Published: (2025)

Provably Reliable Classifier Guidance via Cross-Entropy Control
by: Sahu, Sharan, et al.
Published: (2026)

Robust LLM Alignment via Distributionally Robust Direct Preference Optimization
by: Xu, Zaiyan, et al.
Published: (2025)

Robust Reward Alignment via Hypothesis Space Batch Cutting
by: Xie, Zhixian, et al.
Published: (2025)

Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression
by: Park, Jungsoo, et al.
Published: (2026)

Bayesian Reward Models for LLM Alignment
by: Yang, Adam X., et al.
Published: (2024)

Democratic Preference Alignment via Sortition-Weighted RLHF
by: Sana, Suvadip, et al.
Published: (2026)

REBEL: Reinforcement Learning via Regressing Relative Rewards
by: Gao, Zhaolin, et al.
Published: (2024)

Online Distributional Regression
by: Hirsch, Simon, et al.
Published: (2024)

Reward-Based Online LLM Routing via NeuralUCB
by: Tsai, Ming-Hua, et al.
Published: (2026)

Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
by: Matrenok, Simon, et al.
Published: (2025)

Quantile Regression for Distributional Reward Models in RLHF
by: Dorka, Nicolai
Published: (2024)

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
by: Lu, Keming, et al.
Published: (2024)

Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking
by: Rashidinejad, Paria, et al.
Published: (2024)

On the Robustness of Reward Models for Language Model Alignment
by: Hong, Jiwoo, et al.
Published: (2025)

Online and Offline Robust Multivariate Linear Regression
by: Godichon-Baggioni, Antoine, et al.
Published: (2024)

MIRA: Towards Mitigating Reward Hacking in Inference-Time Alignment of T2I Diffusion Models
by: Zhai, Kevin, et al.
Published: (2025)

Energy-Based Reward Models for Robust Language Model Alignment
by: Lochab, Anamika, et al.
Published: (2025)

Revisiting Robustness for LLM Safety Alignment via Selective Geometry Control
by: Yang, Yonghui, et al.
Published: (2026)

Oracle-Robust Online Alignment for Large Language Models
by: Li, Zimeng, et al.
Published: (2026)

Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls
by: Selvi, Aras, et al.
Published: (2024)

Distributionally Robust Active Learning for Gaussian Process Regression
by: Takeno, Shion, et al.
Published: (2025)

Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
by: Rajaram, Sara, et al.
Published: (2025)

Robust Reward Modeling via Causal Rubrics
by: Srivastava, Pragya, et al.
Published: (2025)

A Spectral View of Adversarially Robust Features
by: Garg, Shivam, et al.
Published: (2018)

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)

Robust Multi-Objective Preference Alignment with Online DPO
by: Gupta, Raghav, et al.
Published: (2025)

Online Linear Regression in Dynamic Environments via Discounting
by: Jacobsen, Andrew, et al.
Published: (2024)

Safeguarding LLM Fine-tuning via Push-Pull Distributional Alignment
by: Wang, Haozhong, et al.
Published: (2026)

Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes
by: Kobalczyk, Katarzyna, et al.
Published: (2024)

PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
by: Liu, Runze, et al.
Published: (2023)

Variable Clustering via Distributionally Robust Nodewise Regression
by: Wang, Kaizheng, et al.
Published: (2022)

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
by: Zhang, Chen Bo Calvin, et al.
Published: (2024)

Wasserstein Distributionally Robust Online Learning
by: Chen, Guixian, et al.
Published: (2026)

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression
by: Fu, Deqing, et al.
Published: (2023)

Adversarial Preference Learning for Robust LLM Alignment
by: Wang, Yuanfu, et al.
Published: (2025)

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
by: Zhang, Yuheng, et al.
Published: (2025)

Mixed-feature Logistic Regression Robust to Distribution Shifts
by: Sun, Qingshi, et al.
Published: (2025)