| Main Authors: | Laleh, Alireza Rashidi, Ahmadabadi, Majid Nili |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.13410 |
Similar Items
Exploiting Expertise of Non-Expert and Diverse Agents in Social Bandit Learning: A Free Energy Approach
by: Mirzaei, Erfan, et al.
Published: (2026)
Risk Sensitivity in Markov Games and Multi-Agent Reinforcement Learning: A Systematic Review
by: Ghaemi, Hafez, et al.
Published: (2024)
Subgoal Discovery Using a Free Energy Paradigm and State Aggregations
by: Mesbah, Amirhossein, et al.
Published: (2024)
AI-powered Digital Framework for Personalized Economical Quality Learning at Scale
by: VatandoustMohammadieh, Mrzieh, et al.
Published: (2024)
Risk-Sensitive Multi-Agent Reinforcement Learning in Network Aggregative Markov Games
by: Ghaemi, Hafez, et al.
Published: (2024)
Feature Aggregation in Joint Sound Classification and Localization Neural Networks
by: Healy, Brendan, et al.
Published: (2023)
The use of the Extended Generalized Lambda Distribution for controlling the statistical process in individual measurements
by: Noorian, Sajad, et al.
Published: (2018)
CoCoP: Enhancing Text Classification with LLM through Code Completion Prompt
by: Mohajeri, Mohammad Mahdi, et al.
Published: (2024)
A Survey of Reinforcement Learning from Human Feedback
by: Kaufmann, Timo, et al.
Published: (2023)
Reinforcement Learning from Human Feedback
by: Lambert, Nathan
Published: (2025)
Strategyproof Reinforcement Learning from Human Feedback
by: Buening, Thomas Kleine, et al.
Published: (2025)
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
by: Swamy, Gokul, et al.
Published: (2024)
Reinforcement Learning from Human Feedback: A Statistical Perspective
by: Liu, Pangpang, et al.
Published: (2026)
A Comprehensive Survey of Reinforcement Learning: From Algorithms to Practical Challenges
by: Ghasemi, Majid, et al.
Published: (2024)
Robust Reinforcement Learning from Corrupted Human Feedback
by: Bukharin, Alexander, et al.
Published: (2024)
Dual Active Learning for Reinforcement Learning from Human Feedback
by: Liu, Pangpang, et al.
Published: (2024)
Reinforcement Learning from LLM Feedback to Counteract Goal Misgeneralization
by: Barj, Houda Nait El, et al.
Published: (2024)
Dense Reward for Free in Reinforcement Learning from Human Feedback
by: Chan, Alex J., et al.
Published: (2024)
Multi-turn Reinforcement Learning from Preference Human Feedback
by: Shani, Lior, et al.
Published: (2024)
Reinforcement Learning from Multi-level and Episodic Human Feedback
by: Elahi, Muhammad Qasim, et al.
Published: (2025)
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
by: Chen, Ruitao, et al.
Published: (2024)
Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback
by: Lee, Seong Jin, et al.
Published: (2024)
Data-dependent Exploration for Online Reinforcement Learning from Human Feedback
by: Zhang, Zhen-Yu, et al.
Published: (2026)
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
by: Lambert, Nathan, et al.
Published: (2023)
Provable Reinforcement Learning from Human Feedback with an Unknown Link Function
by: Zhang, Qining, et al.
Published: (2025)
Enhancing LLMs for Physics Problem-Solving using Reinforcement Learning with Human-AI Feedback
by: Anand, Avinash, et al.
Published: (2024)
Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
by: Peng, Xiyue, et al.
Published: (2024)
Parameter Efficient Reinforcement Learning from Human Feedback
by: Sidahmed, Hakim, et al.
Published: (2024)
Introduction to Reinforcement Learning
by: Ghasemi, Majid, et al.
Published: (2024)
Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
by: Hong, Ilgee, et al.
Published: (2024)
Corruption Robust Offline Reinforcement Learning with Human Feedback
by: Mandal, Debmalya, et al.
Published: (2024)
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
by: Chakraborty, Souradip, et al.
Published: (2023)
Distributionally Robust Reinforcement Learning with Human Feedback
by: Mandal, Debmalya, et al.
Published: (2025)
A Dual-Axis Taxonomy of Knowledge Editing for LLMs: From Mechanisms to Functions
by: Salehoof, Amir Mohammad, et al.
Published: (2025)
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
by: Ye, Chenlu, et al.
Published: (2024)
TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback
by: Pang, Lei, et al.
Published: (2025)
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
by: Shen, Wei, et al.
Published: (2025)
LLM-Augmented Symbolic Reinforcement Learning with Landmark-Based Task Decomposition
by: Kheirandish, Alireza, et al.
Published: (2024)
Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback
by: Kim, Gihoon, et al.
Published: (2026)
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
by: Lee, Harrison, et al.
Published: (2023)