Library Catalog

Bibliographic Details
Main Author:	Wu, Xiaobao
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2505.02686

Similar Items

WildReward: Learning Reward Models from In-the-Wild Human Interactions
by: Peng, Hao, et al.
Published: (2026)

Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
by: Xie, Tianbao, et al.
Published: (2023)

OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning
by: Hu, Ziyou, et al.
Published: (2025)

A Survey on Progress in LLM Alignment from the Perspective of Reward Design
by: Ji, Miaomiao, et al.
Published: (2025)

Preference Poisoning Attacks on Reward Model Learning
by: Wu, Junlin, et al.
Published: (2024)

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
by: Zhang, Yi-Fan, et al.
Published: (2025)

Agentic Reinforcement Learning with Implicit Step Rewards
by: Liu, Xiaoqian, et al.
Published: (2025)

Reward Reasoning Model
by: Guo, Jiaxin, et al.
Published: (2025)

Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
by: Elle
Published: (2025)

Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning
by: Xu, Huimin, et al.
Published: (2025)

Libra: Assessing and Improving Reward Model by Learning to Think
by: Zhou, Meng, et al.
Published: (2025)

RRM: Robust Reward Model Training Mitigates Reward Hacking
by: Liu, Tianqi, et al.
Published: (2024)

Learning Goal-Conditioned Representations for Language Reward Models
by: Nath, Vaskar, et al.
Published: (2024)

Learning Ordinal Probabilistic Reward from Preferences
by: Chen, Longze, et al.
Published: (2026)

Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling
by: Li, Zhaoyan, et al.
Published: (2026)

RewardBench 2: Advancing Reward Model Evaluation
by: Malik, Saumya, et al.
Published: (2025)

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment
by: Wu, Zhaofeng, et al.
Published: (2024)

Tool-Augmented Reward Modeling
by: Li, Lei, et al.
Published: (2023)

Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation
by: Min, Do June, et al.
Published: (2024)

DocReward: A Document Reward Model for Structuring and Stylizing
by: Liu, Junpeng, et al.
Published: (2025)

Reward Models Can Improve Themselves: Reward-Guided Adversarial Failure Mode Discovery for Robust Reward Modeling
by: Pathmanathan, Pankayaraj, et al.
Published: (2025)

Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
by: Liu, Qiyuan, et al.
Published: (2025)

GRAM: A Generative Foundation Reward Model for Reward Generalization
by: Wang, Chenglong, et al.
Published: (2025)

Generalizing Reward Modeling for Out-of-Distribution Preference Learning
by: Jia, Chen
Published: (2024)

Process Rewards with Learned Reliability
by: Li, Jinyuan, et al.
Published: (2026)

Learning to Reason without External Rewards
by: Zhao, Xuandong, et al.
Published: (2025)

Self-Evolved Reward Learning for LLMs
by: Huang, Chenghua, et al.
Published: (2024)

Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization
by: Bai, Yang, et al.
Published: (2026)

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
by: Liu, Chris Yuhao, et al.
Published: (2024)

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
by: Guo, Xu, et al.
Published: (2025)

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
by: Wu, Keming, et al.
Published: (2025)

A Survey on Neural Topic Models: Methods, Applications, and Challenges
by: Wu, Xiaobao, et al.
Published: (2024)

Remedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
by: Tan, Shaomu, et al.
Published: (2025)

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
by: Zhong, Jialun, et al.
Published: (2025)

Towards Robust Process Reward Modeling via Noise-aware Learning
by: Xie, Bin, et al.
Published: (2026)

Reinforcement Learning Amplifies Emergent Misalignment from Harmless Rewards
by: Jørgenvåg, Magnus, et al.
Published: (2026)

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
by: Yin, Yueqin, et al.
Published: (2025)

Bayesian Preference Learning for Test-Time Steerable Reward Models
by: Hong, Jiwoo, et al.
Published: (2026)

Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards
by: Ackermann, Johannes, et al.
Published: (2026)

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards
by: Tang, Xinyu, et al.
Published: (2025)