| Main Authors: | Rajaram, Sara, Cotton, R. James, Sinz, Fabian H. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.12529 |
Similar Items
Residual Reward Models for Preference-based Reinforcement Learning
by: Cao, Chenyang, et al.
Published: (2025)
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
by: Choi, Heewoong, et al.
Published: (2024)
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
by: Metcalf, Katherine, et al.
Published: (2024)
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
by: Cheng, Jie, et al.
Published: (2024)
Adversarial Preference Learning for Robust LLM Alignment
by: Wang, Yuanfu, et al.
Published: (2025)
In-Context Reward Adaptation for Robust Preference Modeling
by: Sun, Zhenyu, et al.
Published: (2026)
Provable Reward-Agnostic Preference-Based Reinforcement Learning
by: Zhan, Wenhao, et al.
Published: (2023)
Rectifying Shortcut Behaviors in Preference-based Reward Learning
by: Ye, Wenqian, et al.
Published: (2025)
From Demonstrations to Rewards: Alignment Without Explicit Human Preferences
by: Zeng, Siliang, et al.
Published: (2025)
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
by: Zhang, Shenao, et al.
Published: (2024)
A Generalized Acquisition Function for Preference-based Reward Learning
by: Ellis, Evan, et al.
Published: (2024)
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2024)
On the Robustness of Reward Models for Language Model Alignment
by: Hong, Jiwoo, et al.
Published: (2025)
APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
by: Li, Zhuo, et al.
Published: (2025)
Causally Robust Reward Learning from Reason-Augmented Preference Feedback
by: Hwang, Minjune, et al.
Published: (2026)
Larger or Smaller Reward Margins to Select Preferences for Alignment?
by: Huang, Kexin, et al.
Published: (2025)
Reward Learning From Preference With Ties
by: Liu, Jinsong, et al.
Published: (2024)
PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment
by: Verma, Richa, et al.
Published: (2026)
Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)
Reinforced Compressive Neural Architecture Search for Versatile Adversarial Robustness
by: Wang, Dingrong, et al.
Published: (2024)
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
by: Zhao, Shuai, et al.
Published: (2025)
Robust LLM Alignment via Distributionally Robust Direct Preference Optimization
by: Xu, Zaiyan, et al.
Published: (2025)
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
by: Wang, Haoxiang, et al.
Published: (2024)
Efficient $Q$-Learning and Actor-Critic Methods for Robust Average Reward Reinforcement Learning
by: Xu, Yang, et al.
Published: (2025)
PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
by: Lin, Baijiong, et al.
Published: (2025)
Exploring and Addressing Reward Confusion in Offline Preference Learning
by: Chen, Xin, et al.
Published: (2024)
Decision-Focused Model-based Reinforcement Learning for Reward Transfer
by: Sharma, Abhishek, et al.
Published: (2023)
Subgoal-based Reward Shaping to Improve Efficiency in Reinforcement Learning
by: Okudo, Takato, et al.
Published: (2021)
PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning
by: Sun, Shengjie, et al.
Published: (2025)
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
by: Zhu, Jin, et al.
Published: (2023)
Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI
by: Harland, Hadassah, et al.
Published: (2024)
Preference as Reward, Maximum Preference Optimization with Importance Sampling
by: Jiang, Zaifan, et al.
Published: (2023)
TGPO: Tree-Guided Preference Optimization for Robust Web Agent Reinforcement Learning
by: Chen, Ziyuan, et al.
Published: (2025)
$i$REPO: $i$mplicit Reward Pairwise Difference based Empirical Preference Optimization
by: Le, Long Tan, et al.
Published: (2024)
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2025)
RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
by: Zhang, Zijing, et al.
Published: (2025)
Evaluating Feature Dependent Noise in Preference-based Reinforcement Learning
by: Li, Yuxuan, et al.
Published: (2026)
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
by: Yang, Rui, et al.
Published: (2024)
Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis
by: Giordano, Sara, et al.
Published: (2025)
Reinforcement Learning with Stochastic Reward Machines
by: Corazza, Jan, et al.
Published: (2025)