Library Catalog

Bibliographic Details
Main Authors:	Qi, Han; Yang, Haochen; Zhang, Qiaosheng; Yang, Zhuoran
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.05434

Similar Items

Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
by: Zhang, Qiaosheng, et al.
Published: (2024)

Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
by: Cai, Qi, et al.
Published: (2022)

Graph Feedback Bandits on Similar Arms: With and Without Graph Structures
by: Qi, Han, et al.
Published: (2025)

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
by: He, Jianliang, et al.
Published: (2024)

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
by: Wang, Lingxiao, et al.
Published: (2022)

On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
by: Altabaa, Awni, et al.
Published: (2024)

Sample-Efficient Policy Constraint Offline Deep Reinforcement Learning based on Sample Filtering
by: Chen, Yuanhao, et al.
Published: (2025)

Sample Efficient Reinforcement Learning by Automatically Learning to Compose Subtasks
by: Han, Shuai, et al.
Published: (2024)

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
by: Shen, Han, et al.
Published: (2024)

GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning
by: Lei, Xing, et al.
Published: (2025)

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
by: Ishfaq, Haque, et al.
Published: (2024)

Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following
by: Zhang, Kongcheng, et al.
Published: (2025)

The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
by: Hu, Jiachen, et al.
Published: (2025)

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2022)

Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference
by: Cercola, Matteo, et al.
Published: (2025)

Parameter Efficient Reinforcement Learning from Human Feedback
by: Sidahmed, Hakim, et al.
Published: (2024)

Provably Sample-Efficient Robust Reinforcement Learning with Average Reward
by: Roch, Zachary, et al.
Published: (2025)

Community Detection for Contextual-LSBM: Theoretical Limitations of Misclassification Rate and Efficient Algorithms
by: Jin, Dian, et al.
Published: (2025)

Optimistic Information Directed Sampling
by: Neu, Gergely, et al.
Published: (2024)

Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
by: Zhao, Boxin, et al.
Published: (2021)

Reinforcement Learning from Human Feedback
by: Lambert, Nathan
Published: (2025)

SEAR: Sample Efficient Action Chunking Reinforcement Learning
by: Nagy, C. F. Maximilian, et al.
Published: (2026)

Robust Reinforcement Learning from Corrupted Human Feedback
by: Bukharin, Alexander, et al.
Published: (2024)

Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
by: Lee, Donghoon, et al.
Published: (2025)

Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
by: Zhao, Runze, et al.
Published: (2025)

On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
by: Nguyen-Tang, Thanh, et al.
Published: (2024)

Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning
by: Li, Na, et al.
Published: (2025)

CoPRIS: Efficient and Stable Reinforcement Learning via Concurrency-Controlled Partial Rollout with Importance Sampling
by: Qu, Zekai, et al.
Published: (2025)

Sparse Optimistic Information Directed Sampling
by: Schwartz, Ludovic, et al.
Published: (2025)

Reinforcing Human Behavior Simulation via Verbal Feedback
by: Sun, Weiwei, et al.
Published: (2026)

TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback
by: Pang, Lei, et al.
Published: (2025)

Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use
by: Zhang, Yabo, et al.
Published: (2025)

Aligning AI Agents via Information-Directed Sampling
by: Jeon, Hong Jun, et al.
Published: (2024)

Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)

Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback
by: Yang, Yutao, et al.
Published: (2025)

Strategyproof Reinforcement Learning from Human Feedback
by: Buening, Thomas Kleine, et al.
Published: (2025)

Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
by: Hong, Ilgee, et al.
Published: (2024)

From Generative to Episodic: Sample-Efficient Replicable Reinforcement Learning
by: Hopkins, Max, et al.
Published: (2025)

Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
by: Wang, Changhong, et al.
Published: (2024)

Truncated Rectified Flow Policy for Reinforcement Learning with One-Step Sampling
by: Zhou, Xubin, et al.
Published: (2026)