:: Library Catalog

Bibliographic Details
Main Authors:	Feng, Yunzhen, Kwiatkowski, Ariel, Zheng, Kunhao, Kempe, Julia, Duan, Yaqi
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.04270

Similar Items

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting
by: Feng, Yunzhen, et al.
Published: (2025)

Model Collapse Demystified: The Case of Regression
by: Dohmatob, Elvis, et al.
Published: (2024)

Strong Model Collapse
by: Dohmatob, Elvis, et al.
Published: (2024)

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
by: Feng, Yunzhen, et al.
Published: (2024)

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
by: Feng, Yunzhen, et al.
Published: (2025)

A Tale of Tails: Model Collapse as a Change of Scaling Laws
by: Dohmatob, Elvis, et al.
Published: (2024)

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
by: Sundaram, Shobhita, et al.
Published: (2026)

Soft Tokens, Hard Truths
by: Butt, Natasha, et al.
Published: (2025)

Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks
by: Feng, Yunzhen, et al.
Published: (2024)

Leveraging Sparsity for Sample-Efficient Preference Learning: A Theoretical Perspective
by: Yao, Yunzhen, et al.
Published: (2025)

Likelihood-Based Reward Designs for General LLM Reasoning
by: Kwiatkowski, Ariel, et al.
Published: (2026)

APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
by: Li, Zhuo, et al.
Published: (2025)

Preference as Reward, Maximum Preference Optimization with Importance Sampling
by: Jiang, Zaifan, et al.
Published: (2023)

Emergent properties with repeated examples
by: Charton, François, et al.
Published: (2024)

Optimal Transport for LLM Reward Modeling from Noisy Preference
by: Pan, Licheng, et al.
Published: (2026)

In-Context Reward Adaptation for Robust Preference Modeling
by: Sun, Zhenyu, et al.
Published: (2026)

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
by: Tang, Yunhao, et al.
Published: (2025)

DreamReward: Text-to-3D Generation with Human Preference
by: Ye, Junliang, et al.
Published: (2024)

Capturing Individual Human Preferences with Reward Features
by: Barreto, André, et al.
Published: (2025)

IRPM: Intergroup Relative Preference Modeling for Pointwise Generative Reward Models
by: Song, Haonan, et al.
Published: (2026)

Outcome-based Exploration for LLM Reasoning
by: Song, Yuda, et al.
Published: (2025)

Deconstructing the Goldilocks Zone of Neural Network Initialization
by: Vysogorets, Artem, et al.
Published: (2024)

Optimal Design for Human Preference Elicitation
by: Mukherjee, Subhojyoti, et al.
Published: (2024)

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
by: Peng, Andi, et al.
Published: (2024)

How Reinforcement Learning After Next-Token Prediction Facilitates Learning
by: Tsilivis, Nikolaos, et al.
Published: (2025)

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
by: Tsilivis, Nikolaos, et al.
Published: (2024)

The Price of Implicit Bias in Adversarially Robust Generalization
by: Tsilivis, Nikolaos, et al.
Published: (2024)

From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
by: Su, Jingtong, et al.
Published: (2025)

Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
by: Su, Jingtong, et al.
Published: (2024)

DRoP: Distributionally Robust Data Pruning
by: Vysogorets, Artem, et al.
Published: (2024)

From Demonstrations to Rewards: Alignment Without Explicit Human Preferences
by: Zeng, Siliang, et al.
Published: (2025)

Non-Asymptotic Analysis of Efficiency in Conformalized Regression
by: Yao, Yunzhen, et al.
Published: (2025)

Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
by: Metcalf, Katherine, et al.
Published: (2024)

Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming
by: Bhambri, Siddhant, et al.
Published: (2023)

Rectifying Shortcut Behaviors in Preference-based Reward Learning
by: Ye, Wenqian, et al.
Published: (2025)

Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
by: Yang, Pu, et al.
Published: (2025)

Batch Active Learning of Reward Functions from Human Preferences
by: Bıyık, Erdem, et al.
Published: (2024)

Explicit Preference Optimization: No Need for an Implicit Reward Model
by: Hu, Xiangkun, et al.
Published: (2025)

Hindsight PRIORs for Reward Learning from Human Preferences
by: Verma, Mudit, et al.
Published: (2024)

On the Robustness of Neural Collapse and the Neural Collapse of Robustness
by: Su, Jingtong, et al.
Published: (2023)