| Main authors: | Wong, Man Fai; Tan, Chee Wei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2503.15129 |
Similar items
Aligning Crowd Feedback via Distributional Preference Reward Modeling
by: Li, Dexun, et al.
Published: (2024)
HLS-Seek: QoR-Aware Code Generation for High-Level Synthesis via Proxy Comparative Reward Reinforcement Learning
by: Zou, Qingyun, et al.
Published: (2026)
Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback
by: Yuan, Wenzhen, et al.
Published: (2025)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
by: Kim, Suzie, et al.
Published: (2025)
Reinforcement Learning from Implicit Neural Feedback for Human-Aligned Robot Control
by: Kim, Suzie
Published: (2025)
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
by: Ye, Kai, et al.
Published: (2025)
From Code Generation to Software Testing: AI Copilot with Context-Based RAG
by: Wang, Yuchen, et al.
Published: (2025)
Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models
by: Tan, Chee Wei, et al.
Published: (2026)
TrumorGPT: Graph-Based Retrieval-Augmented Large Language Model for Fact-Checking
by: Hang, Ching Nam, et al.
Published: (2025)
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models
by: Chung, Sangyun, et al.
Published: (2026)
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models
by: Wang, Rui, et al.
Published: (2025)
Anchor-based Large Language Models
by: Pang, Jianhui, et al.
Published: (2024)
Learning to Align Human Code Preferences
by: Yin, Xin, et al.
Published: (2025)
Aligning Large Language Models from Self-Reference AI Feedback with one General Principle
by: Bao, Rong, et al.
Published: (2024)
Large Language Model for Verilog Generation with Code-Structure-Guided Reinforcement Learning
by: Wang, Ning, et al.
Published: (2024)
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
by: Wang, Jiongxiao, et al.
Published: (2023)
Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small
by: Wang, Zhehui, et al.
Published: (2024)
CHAI for LLMs: Improving Code-Mixed Translation in Large Language Models through Reinforcement Learning with AI Feedback
by: Zhang, Wenbo, et al.
Published: (2024)
A Critical Evaluation of AI Feedback for Aligning Large Language Models
by: Sharma, Archit, et al.
Published: (2024)
Large Language Models are Highly Aligned with Human Ratings of Emotional Stimuli
by: Ogg, Mattson, et al.
Published: (2025)
Aligning Large Vision-Language Models by Deep Reinforcement Learning and Direct Preference Optimization
by: Nguyen, Thanh Thi, et al.
Published: (2025)
Task Abstention for Large Language Models in Code Generation
by: Zhou, Yanke, et al.
Published: (2026)
Reward Modeling with Ordinal Feedback: Wisdom of the Crowd
by: Liu, Shang, et al.
Published: (2024)
Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models?
by: Karkevandi, Mohammad Bahrami, et al.
Published: (2024)
Aligning Large Language Model Behavior with Human Citation Preferences
by: Ando, Kenichiro, et al.
Published: (2026)
Efficiently Aligning Language Models with Online Natural Language Feedback
by: Ye, Christine, et al.
Published: (2026)
Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework
by: Wang, Yuchen, et al.
Published: (2024)
Post-Training Large Language Models via Reinforcement Learning from Self-Feedback
by: van Niekerk, Carel, et al.
Published: (2025)
DrugGen: Advancing Drug Discovery with Large Language Models and Reinforcement Learning Feedback
by: Sheikholeslami, Mahsa, et al.
Published: (2024)
Towards Aligning Language Models with Textual Feedback
by: Lloret, Saüc Abadal, et al.
Published: (2024)
Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models
by: Bansal, Hritik, et al.
Published: (2023)
Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion
by: Liu, Dengfeng, et al.
Published: (2025)
Reinforcement Learning with Token-level Feedback for Controllable Text Generation
by: Li, Wendi, et al.
Published: (2024)
FedDRL: A Trustworthy Federated Learning Model Fusion Method Based on Staged Reinforcement Learning
by: Chen, Leiming, et al.
Published: (2023)
Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a Pretrained Language Model
by: Prange, Jakob, et al.
Published: (2023)
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards
by: Shen, Wei, et al.
Published: (2024)
Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling
by: Li, Zhaoyan, et al.
Published: (2026)
GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding
by: Zhang, Ziyin, et al.
Published: (2024)
The Real, the Better: Aligning Large Language Models with Online Human Behaviors
by: Jiang, Guanying, et al.
Published: (2024)
Aligning to Illusions: Choice Blindness in Human and AI Feedback
by: Wu, Wenbin
Published: (2026)