Bibliographic Details
Main Authors:	Tan, Kevin; Xu, Ziping
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2403.09701

Similar Items

Online learning in bandits with predicted context
by: Guo, Yongyi, et al.
Published: (2023)

Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis
by: Huang, Ruiquan, et al.
Published: (2025)

Online Algorithms with Limited Data Retention
by: Immorlica, Nicole, et al.
Published: (2024)

RL Token: Bootstrapping Online RL with Vision-Language-Action Models
by: Xu, Charles, et al.
Published: (2026)

Bayesian Online Natural Gradient (BONG)
by: Jones, Matt, et al.
Published: (2024)

A Review of Online Diffusion Policy RL Algorithms for Scalable Robotic Control
by: Choi, Wonhyeok, et al.
Published: (2026)

Accelerating Transformers in Online RL
by: Zelezetsky, Daniil, et al.
Published: (2025)

A Benchmark Study of Deep-RL Methods for Maximum Coverage Problems over Graphs
by: Liang, Zhicheng, et al.
Published: (2024)

The Fallacy of Minimizing Cumulative Regret in the Sequential Task Setting
by: Xu, Ziping, et al.
Published: (2024)

Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption
by: He, Longxiang, et al.
Published: (2025)

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
by: Xu, Ziping, et al.
Published: (2024)

$π_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
by: Chen, Kang, et al.
Published: (2025)

Coverage-Validity-Aware Algorithmic Recourse
by: Bui, Ngoc, et al.
Published: (2023)

H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps
by: Niu, Haoyi, et al.
Published: (2023)

Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
by: Mark, Max Sobol, et al.
Published: (2024)

Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms
by: Su, Xuerui, et al.
Published: (2025)

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
by: Zhou, Runlong, et al.
Published: (2024)

An Empirical Study on the Effectiveness of Incorporating Offline RL As Online RL Subroutines
by: Su, Jianhai, et al.
Published: (2025)

Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
by: Tan, Kevin, et al.
Published: (2024)

Generalized Linear Markov Decision Process
by: Zhang, Sinian, et al.
Published: (2025)

Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
by: Chen, Fan, et al.
Published: (2025)

Training-Conditional Coverage Bounds for Uniformly Stable Learning Algorithms
by: Pournaderi, Mehrdad, et al.
Published: (2024)

Enhancing Adversarial Example Detection Through Model Explanation
by: Ma, Qian, et al.
Published: (2025)

On Entropy Control in LLM-RL Algorithms
by: Shen, Han
Published: (2025)

SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling
by: Zhang, Yiqi, et al.
Published: (2026)

Active Measuring in Reinforcement Learning With Delayed Negative Effects
by: Gao, Daiqi, et al.
Published: (2025)

On the Limits of Tabular Hardness Metrics for Deep RL: A Study with the Pharos Benchmark
by: Conserva, Michelangelo, et al.
Published: (2025)

RL's Razor: Why Online Reinforcement Learning Forgets Less
by: Shenfeld, Idan, et al.
Published: (2025)

Algorithmic Guarantees for Distilling Supervised and Offline RL Datasets
by: Gupta, Aaryan, et al.
Published: (2025)

Behavior-Adaptive Q-Learning: A Unifying Framework for Offline-to-Online RL
by: Zu, Lipeng, et al.
Published: (2025)

MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents
by: Xu, Yifan, et al.
Published: (2025)

Natural Policy Gradient for Average Reward Non-Stationary RL
by: Jali, Neharika, et al.
Published: (2025)

Curiosity-driven RL for symbolic equation solving
by: O'Keeffe, Kevin P.
Published: (2025)

Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL
by: Zurek, Matthew, et al.
Published: (2025)

A Theoretical Lens for RL-Tuned Language Models via Energy-Based Models
by: Tan, Zhiquan, et al.
Published: (2025)

Hybrid Deep Learning Modeling Approach to Predict Natural Gas Consumption of Home Subscribers on Limited Data
by: Firoozeh, Milad, et al.
Published: (2025)

RL Grokking Recipe: How Does RL Unlock and Transfer New Algorithms in LLMs?
by: Sun, Yiyou, et al.
Published: (2025)

Online Finetuning Decision Transformers with Pure RL Gradients
by: Luo, Junkai, et al.
Published: (2026)

Accelerating Goal-Conditioned RL Algorithms and Research
by: Bortkiewicz, Michał, et al.
Published: (2024)

Scalable Policy-Based RL Algorithms for POMDPs
by: Anjarlekar, Ameya, et al.
Published: (2025)