| Main Author: | Wu, Xiaobao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.02686 |
Similar Items
WildReward: Learning Reward Models from In-the-Wild Human Interactions
by: Peng, Hao, et al.
Published: (2026)
Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
by: Xie, Tianbao, et al.
Published: (2023)
OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning
by: Hu, Ziyou, et al.
Published: (2025)
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
by: Ji, Miaomiao, et al.
Published: (2025)
Preference Poisoning Attacks on Reward Model Learning
by: Wu, Junlin, et al.
Published: (2024)
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
by: Zhang, Yi-Fan, et al.
Published: (2025)
Agentic Reinforcement Learning with Implicit Step Rewards
by: Liu, Xiaoqian, et al.
Published: (2025)
Reward Reasoning Model
by: Guo, Jiaxin, et al.
Published: (2025)
Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
by: Elle
Published: (2025)
Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning
by: Xu, Huimin, et al.
Published: (2025)
Libra: Assessing and Improving Reward Model by Learning to Think
by: Zhou, Meng, et al.
Published: (2025)
RRM: Robust Reward Model Training Mitigates Reward Hacking
by: Liu, Tianqi, et al.
Published: (2024)
Learning Goal-Conditioned Representations for Language Reward Models
by: Nath, Vaskar, et al.
Published: (2024)
Learning Ordinal Probabilistic Reward from Preferences
by: Chen, Longze, et al.
Published: (2026)
Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling
by: Li, Zhaoyan, et al.
Published: (2026)
RewardBench 2: Advancing Reward Model Evaluation
by: Malik, Saumya, et al.
Published: (2025)
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment
by: Wu, Zhaofeng, et al.
Published: (2024)
Tool-Augmented Reward Modeling
by: Li, Lei, et al.
Published: (2023)
Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation
by: Min, Do June, et al.
Published: (2024)
DocReward: A Document Reward Model for Structuring and Stylizing
by: Liu, Junpeng, et al.
Published: (2025)
Reward Models Can Improve Themselves: Reward-Guided Adversarial Failure Mode Discovery for Robust Reward Modeling
by: Pathmanathan, Pankayaraj, et al.
Published: (2025)
Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
by: Liu, Qiyuan, et al.
Published: (2025)
GRAM: A Generative Foundation Reward Model for Reward Generalization
by: Wang, Chenglong, et al.
Published: (2025)
Generalizing Reward Modeling for Out-of-Distribution Preference Learning
by: Jia, Chen
Published: (2024)
Process Rewards with Learned Reliability
by: Li, Jinyuan, et al.
Published: (2026)
Learning to Reason without External Rewards
by: Zhao, Xuandong, et al.
Published: (2025)
Self-Evolved Reward Learning for LLMs
by: Huang, Chenghua, et al.
Published: (2024)
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization
by: Bai, Yang, et al.
Published: (2026)
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
by: Liu, Chris Yuhao, et al.
Published: (2024)
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
by: Guo, Xu, et al.
Published: (2025)
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
by: Wu, Keming, et al.
Published: (2025)
A Survey on Neural Topic Models: Methods, Applications, and Challenges
by: Wu, Xiaobao, et al.
Published: (2024)
Remedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
by: Tan, Shaomu, et al.
Published: (2025)
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
by: Zhong, Jialun, et al.
Published: (2025)
Towards Robust Process Reward Modeling via Noise-aware Learning
by: Xie, Bin, et al.
Published: (2026)
Reinforcement Learning Amplifies Emergent Misalignment from Harmless Rewards
by: Jørgenvåg, Magnus, et al.
Published: (2026)
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
by: Yin, Yueqin, et al.
Published: (2025)
Bayesian Preference Learning for Test-Time Steerable Reward Models
by: Hong, Jiwoo, et al.
Published: (2026)
Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards
by: Ackermann, Johannes, et al.
Published: (2026)
Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards
by: Tang, Xinyu, et al.
Published: (2025)