| Main Authors: | Rahman, Tasnia, Kumar, Sathish A. P., Jha, Sumit, Ramanathan, Arvind |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.04657 |
Similar Items
MORAL: A Multimodal Reinforcement Learning Framework for Decision Making in Autonomous Laboratories
by: Tirabassi, Natalie, et al.
Published: (2025)
DML-RAM: Deep Multimodal Learning Framework for Robotic Arm Manipulation using Pre-trained Models
by: Kumar, Sathish, et al.
Published: (2025)
Enhanced Penalty-based Bidirectional Reinforcement Learning Algorithms
by: Pula, Sai Gana Sandeep, et al.
Published: (2025)
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
by: Lee, Harrison, et al.
Published: (2023)
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
by: Chai, Yekun, et al.
Published: (2024)
ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
by: Hou, Zhenyu, et al.
Published: (2024)
Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
by: Wong, Man Fai, et al.
Published: (2025)
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
by: Chaudhari, Shreyas, et al.
Published: (2024)
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
by: Ji, Jiaming, et al.
Published: (2025)
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
by: Yuan, Yifu, et al.
Published: (2024)
Evaluating Defences against Unsafe Feedback in RLHF
by: Rosati, Domenic, et al.
Published: (2024)
UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback
by: Wu, Jason, et al.
Published: (2024)
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
by: Zhou, Jiayi, et al.
Published: (2024)
CRScore++: Reinforcement Learning with Verifiable Tool and AI Feedback for Code Review
by: Kapadnis, Manav Nitin, et al.
Published: (2025)
Fine-Tuning Models for Automated Code Review Feedback
by: Kumar, Smitha S, et al.
Published: (2026)
SAFE: Stable Alignment Finetuning with Entropy-Aware Predictive Control for Reinforcement Learning from Human Feedback (RLHF)
by: Maity, Dipan
Published: (2026)
FeedbackEval: A Benchmark for Evaluating Large Language Models in Feedback-Driven Code Repair Tasks
by: Dai, Dekun, et al.
Published: (2025)
Improving Small Language Models for Code Generation with Reinforcement Learning from Verification Feedback
by: Skopin, Egor, et al.
Published: (2026)
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
by: Ye, Kai, et al.
Published: (2025)
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
by: Wang, Jiongxiao, et al.
Published: (2023)
CoTran: An LLM-based Code Translator using Reinforcement Learning with Feedback from Compiler and Symbolic Execution
by: Jana, Prithwish, et al.
Published: (2023)
Humanizing Automated Programming Feedback: Fine-Tuning Generative Models with Student-Written Feedback
by: Pădurean, Victor-Alexandru, et al.
Published: (2025)
Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations
by: Sinha, Pranav, et al.
Published: (2025)
Assessing Large Language Models for Automated Feedback Generation in Learning Programming Problem Solving
by: Silva, Priscylla, et al.
Published: (2025)
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments
by: Ye, Junjie, et al.
Published: (2025)
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models
by: Solway, Alec
Published: (2024)
Reinforcement Learning from Human Feedback
by: Lambert, Nathan
Published: (2025)
Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback
by: L, Adarsh N, et al.
Published: (2024)
Modeling Thin‐Layer Drying Kinetics of Justicia adhatoda Leaves Using Different Drying Techniques and Its Quality Evaluation
by: Sudarshan Ramanathan, et al.
Published: (2025)
Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks
by: Jensen, Emily, et al.
Published: (2024)
GLLM: Self-Corrective G-Code Generation using Large Language Models with User Feedback
by: Abdelaal, Mohamed, et al.
Published: (2025)
REvolve: Reward Evolution with Large Language Models using Human Feedback
by: Hazra, Rishi, et al.
Published: (2024)
Ambiguity Resolution with Human Feedback for Code Writing Tasks
by: Nandan, Aditey, et al.
Published: (2025)
LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning
by: Jian, Pingcheng, et al.
Published: (2025)
Evaluating Large Language Models with Human Feedback: Establishing a Swedish Benchmark
by: Moell, Birger
Published: (2024)
Generative AI for CAD Automation: Leveraging Large Language Models for 3D Modelling
by: Kumar, Sumit, et al.
Published: (2025)
Sakshm AI: Advancing AI-Assisted Coding Education for Engineering Students in India Through Socratic Tutoring and Comprehensive Feedback
by: Gupta, Raj, et al.
Published: (2025)
Reinforcement Learning for Optimizing Large Qubit Array based Quantum Sensor Circuits
by: Attisara, Laxmisha Ashok, et al.
Published: (2025)
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
by: Xiong, Wei, et al.
Published: (2023)
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
by: Ye, Chenlu, et al.
Published: (2024)