:: Library Catalog

Bibliographic Details
Main Authors:	Cai, Yuang; Yuan, Yuyu; Shi, Jinsheng; Lin, Qinhong
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2411.09341

Similar Items

Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
by: Sun, Hao, et al.
Published: (2024)

$\mathcal{X}$-KD: General Experiential Knowledge Distillation for Large Language Models
by: Cai, Yuang, et al.
Published: (2026)

Kernel Density Bayesian Inverse Reinforcement Learning
by: Mandyam, Aishwarya, et al.
Published: (2023)

Walking the Values in Bayesian Inverse Reinforcement Learning
by: Bajgar, Ondrej, et al.
Published: (2024)

Variational Linearized Laplace Approximation for Bayesian Deep Learning
by: Ortega, Luis A., et al.
Published: (2023)

Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards
by: Topper, Noah, et al.
Published: (2024)

A Bayesian Approach to Robust Inverse Reinforcement Learning
by: Wei, Ran, et al.
Published: (2023)

PAC Apprenticeship Learning with Bayesian Active Inverse Reinforcement Learning
by: Bajgar, Ondrej, et al.
Published: (2025)

Offline Regularised Reinforcement Learning for Large Language Models Alignment
by: Richemond, Pierre Harvey, et al.
Published: (2024)

Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
by: Zhou, Weichao, et al.
Published: (2024)

Score-Based Variational Inference for Inverse Problems
by: Xue, Zhipeng, et al.
Published: (2024)

Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning
by: Trinh, Tu, et al.
Published: (2022)

SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models
by: Li, Ziwei, et al.
Published: (2026)

Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks
by: de Vries, Joery A., et al.
Published: (2025)

Interactionless Inverse Reinforcement Learning: A Data-Centric Framework for Durable Alignment
by: Malomgré, Elias, et al.
Published: (2026)

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
by: Mukherjee, Sagnik, et al.
Published: (2025)

Label-Confidence-Aware Uncertainty Estimation in Natural Language Generation
by: Lin, Qinhong, et al.
Published: (2024)

Uncertainty Quantification of Large Language Models using Approximate Bayesian Computation
by: Sharma, Mridul, et al.
Published: (2025)

SALMON: Self-Alignment with Instructable Reward Models
by: Sun, Zhiqing, et al.
Published: (2023)

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)

Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation
by: Yuan, Jinsheng, et al.
Published: (2024)

Bayesian Reward Models for LLM Alignment
by: Yang, Adam X., et al.
Published: (2024)

Inverse Reinforcement Learning without Reinforcement Learning
by: Swamy, Gokul, et al.
Published: (2023)

On Predictability of Reinforcement Learning Dynamics for Large Language Models
by: Cai, Yuchen, et al.
Published: (2025)

EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models
by: Tan, Zheyue, et al.
Published: (2025)

RAG-based User Profiling for Precision Planning in Mixed-precision Over-the-Air Federated Learning
by: Yuan, Jinsheng, et al.
Published: (2025)

GRACE: A Language Model Framework for Explainable Inverse Reinforcement Learning
by: Sapora, Silvia, et al.
Published: (2025)

Reinforced Collaboration in Multi-Agent Flow Networks
by: Wang, Zheng, et al.
Published: (2026)

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
by: Sun, Hao, et al.
Published: (2025)

Doubly Robust Alignment for Large Language Models
by: Xu, Erhan, et al.
Published: (2025)

Distributional Inverse Reinforcement Learning
by: Wu, Feiyang, et al.
Published: (2025)

Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning
by: Huo, Yingxiao, et al.
Published: (2026)

Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models
by: Huang, Shiting, et al.
Published: (2026)

Model-Free Approximate Bayesian Learning for Large-Scale Conversion Funnel Optimization
by: Iyengar, Garud, et al.
Published: (2024)

Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators
by: Chen, Youguang, et al.
Published: (2026)

Accelerated Preference Optimization for Large Language Model Alignment
by: He, Jiafan, et al.
Published: (2024)

Goal-Guided Efficient Exploration via Large Language Model in Reinforcement Learning
by: Qi, Yajie, et al.
Published: (2025)

Learning To Sample From Diffusion Models Via Inverse Reinforcement Learning
by: Bourdrez, Constant, et al.
Published: (2026)

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models
by: Deng, Wenlong, et al.
Published: (2026)

Towards Generalized Inverse Reinforcement Learning
by: Dong, Chaosheng, et al.
Published: (2024)