| Main Author: | Williams, Marcus |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.07295 |
Similar Items
Multi-turn Reinforcement Learning from Preference Human Feedback
by: Shani, Lior, et al.
Published: (2024)
Reinforcement Learning from Multi-level and Episodic Human Feedback
by: Elahi, Muhammad Qasim, et al.
Published: (2025)
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
by: Lee, Harrison, et al.
Published: (2023)
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
by: Williams, Marcus, et al.
Published: (2024)
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
by: Chen, Ruitao, et al.
Published: (2024)
Reinforcement Learning from Human Feedback
by: Lambert, Nathan
Published: (2025)
On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning
by: Subramani, Rohan, et al.
Published: (2023)
Reinforcement Learning with Segment Feedback
by: Du, Yihan, et al.
Published: (2025)
Strategyproof Reinforcement Learning from Human Feedback
by: Buening, Thomas Kleine, et al.
Published: (2025)
Corruption-robust Offline Multi-agent Reinforcement Learning From Human Feedback
by: Nika, Andi, et al.
Published: (2026)
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
by: Choudhury, Sanjiban, et al.
Published: (2024)
Reinforcement Learning from Denoising Feedback
by: He, Qi, et al.
Published: (2026)
Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
by: Zhong, Huiying, et al.
Published: (2024)
Combinatorial Reinforcement Learning with Preference Feedback
by: Lee, Joongkyu, et al.
Published: (2025)
Sharing to learn and learning to share; Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning: A meta review
by: Upadhyay, Richa, et al.
Published: (2021)
Robust Reinforcement Learning from Corrupted Human Feedback
by: Bukharin, Alexander, et al.
Published: (2024)
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
by: Ji, Jiaming, et al.
Published: (2025)
Enhancing LLMs for Physics Problem-Solving using Reinforcement Learning with Human-AI Feedback
by: Anand, Avinash, et al.
Published: (2024)
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback
by: Li, Ang, et al.
Published: (2024)
A Contractive Feedback Semantics for Reinforcement Learning
by: Zhang, Zuyuan
Published: (2026)
RLAF: Reinforcement Learning from Automaton Feedback
by: Alinejad, Mahyar, et al.
Published: (2025)
Safe Multi-Agent Navigation guided by Goal-Conditioned Safe Reinforcement Learning
by: Feng, Meng, et al.
Published: (2025)
Reinforcement Learning from LLM Feedback to Counteract Goal Misgeneralization
by: Barj, Houda Nait El, et al.
Published: (2024)
Dual Active Learning for Reinforcement Learning from Human Feedback
by: Liu, Pangpang, et al.
Published: (2024)
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
by: Swamy, Gokul, et al.
Published: (2024)
Dense Reward for Free in Reinforcement Learning from Human Feedback
by: Chan, Alex J., et al.
Published: (2024)
Reinforcement Learning from Human Feedback: A Statistical Perspective
by: Liu, Pangpang, et al.
Published: (2026)
Expanding the Capabilities of Reinforcement Learning via Text Feedback
by: Song, Yuda, et al.
Published: (2026)
Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
by: Vamplew, Peter, et al.
Published: (2024)
Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback
by: Lee, Seong Jin, et al.
Published: (2024)
Data-dependent Exploration for Online Reinforcement Learning from Human Feedback
by: Zhang, Zhen-Yu, et al.
Published: (2026)
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
by: Lambert, Nathan, et al.
Published: (2023)
Provable Reinforcement Learning from Human Feedback with an Unknown Link Function
by: Zhang, Qining, et al.
Published: (2025)
CANDERE-COACH: Reinforcement Learning from Noisy Feedback
by: Li, Yuxuan, et al.
Published: (2024)
Scalable Multi-Objective Robot Reinforcement Learning through Gradient Conflict Resolution
by: Munn, Humphrey, et al.
Published: (2025)
Reinforcement Learning with Backtracking Feedback
by: Sel, Bilgehan, et al.
Published: (2026)
M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
by: Wang, Ziyan, et al.
Published: (2025)
Trust, Don't Trust, or Flip: Robust Preference-Based Reinforcement Learning with Multi-Expert Feedback
by: Hosseini, Seyed Amir, et al.
Published: (2026)
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
by: Ye, Chenlu, et al.
Published: (2024)
ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
by: Byun, Ju-Seung, et al.
Published: (2024)