| Main Authors: | Qi, Han, Yang, Haochen, Zhang, Qiaosheng, Yang, Zhuoran |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.05434 |
Similar Items
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
by: Zhang, Qiaosheng, et al.
Published: (2024)
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
by: Cai, Qi, et al.
Published: (2022)
Graph Feedback Bandits on Similar Arms: With and Without Graph Structures
by: Qi, Han, et al.
Published: (2025)
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
by: He, Jianliang, et al.
Published: (2024)
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
by: Wang, Lingxiao, et al.
Published: (2022)
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
by: Altabaa, Awni, et al.
Published: (2024)
Sample-Efficient Policy Constraint Offline Deep Reinforcement Learning based on Sample Filtering
by: Chen, Yuanhao, et al.
Published: (2025)
Sample Efficient Reinforcement Learning by Automatically Learning to Compose Subtasks
by: Han, Shuai, et al.
Published: (2024)
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
by: Shen, Han, et al.
Published: (2024)
GCHR : Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning
by: Lei, Xing, et al.
Published: (2025)
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
by: Ishfaq, Haque, et al.
Published: (2024)
Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following
by: Zhang, Kongcheng, et al.
Published: (2025)
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
by: Hu, Jiachen, et al.
Published: (2025)
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2022)
Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference
by: Cercola, Matteo, et al.
Published: (2025)
Parameter Efficient Reinforcement Learning from Human Feedback
by: Sidahmed, Hakim, et al.
Published: (2024)
Provably Sample-Efficient Robust Reinforcement Learning with Average Reward
by: Roch, Zachary, et al.
Published: (2025)
Community Detection for Contextual-LSBM: Theoretical Limitations of Misclassification Rate and Efficient Algorithms
by: Jin, Dian, et al.
Published: (2025)
Optimistic Information Directed Sampling
by: Neu, Gergely, et al.
Published: (2024)
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
by: Zhao, Boxin, et al.
Published: (2021)
Reinforcement Learning from Human Feedback
by: Lambert, Nathan
Published: (2025)
SEAR: Sample Efficient Action Chunking Reinforcement Learning
by: Nagy, C. F. Maximilian, et al.
Published: (2026)
Robust Reinforcement Learning from Corrupted Human Feedback
by: Bukharin, Alexander, et al.
Published: (2024)
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
by: Lee, Donghoon, et al.
Published: (2025)
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
by: Zhao, Runze, et al.
Published: (2025)
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
by: Nguyen-Tang, Thanh, et al.
Published: (2024)
Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning
by: Li, Na, et al.
Published: (2025)
CoPRIS: Efficient and Stable Reinforcement Learning via Concurrency-Controlled Partial Rollout with Importance Sampling
by: Qu, Zekai, et al.
Published: (2025)
Sparse Optimistic Information Directed Sampling
by: Schwartz, Ludovic, et al.
Published: (2025)
Reinforcing Human Behavior Simulation via Verbal Feedback
by: Sun, Weiwei, et al.
Published: (2026)
TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback
by: Pang, Lei, et al.
Published: (2025)
Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use
by: Zhang, Yabo, et al.
Published: (2025)
Aligning AI Agents via Information-Directed Sampling
by: Jeon, Hong Jun, et al.
Published: (2024)
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)
Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback
by: Yang, Yutao, et al.
Published: (2025)
Strategyproof Reinforcement Learning from Human Feedback
by: Buening, Thomas Kleine, et al.
Published: (2025)
Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
by: Hong, Ilgee, et al.
Published: (2024)
From Generative to Episodic: Sample-Efficient Replicable Reinforcement Learning
by: Hopkins, Max, et al.
Published: (2025)
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
by: Wang, Changhong, et al.
Published: (2024)
Truncated Rectified Flow Policy for Reinforcement Learning with One-Step Sampling
by: Zhou, Xubin, et al.
Published: (2026)