:: Library Catalog

Bibliographic Details
Main Authors:	Ding, Zihan; Jin, Chi
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2309.16984

Similar Items

Generative Diffusion Modeling: A Practical Handbook
by: Ding, Zihan, et al.
Published: (2024)

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
by: Li, Wenzhe, et al.
Published: (2024)

Feasibility Consistent Representation Learning for Safe Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2024)

Single-stream Policy Optimization
by: Xu, Zhongwen, et al.
Published: (2025)

Pure Exploration for a Good Policy in Reinforcement Learning with Bandit Feedback
by: Li, Zitian, et al.
Published: (2026)

Reinforcement Learning in High-frequency Market Making
by: Zheng, Yuheng, et al.
Published: (2024)

Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
by: Ding, Zihan, et al.
Published: (2024)

ReinforceGen: Hybrid Skill Policies with Automated Data Generation and Reinforcement Learning
by: Zhou, Zihan, et al.
Published: (2025)

Efficient Multi-Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)

Efficient Online Reinforcement Learning for Diffusion Policy
by: Ma, Haitong, et al.
Published: (2025)

Provably Efficient Exploration in Policy Optimization
by: Cai, Qi, et al.
Published: (2019)

FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
by: Ding, Yuyang, et al.
Published: (2025)

Pausing Policy Learning in Non-stationary Reinforcement Learning
by: Lee, Hyunin, et al.
Published: (2024)

LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
by: Karten, Seth, et al.
Published: (2025)

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
by: Chen, Claire, et al.
Published: (2024)

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
by: Ding, Shutong, et al.
Published: (2025)

Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
by: Zhang, Jing, et al.
Published: (2023)

SAC Flow: Sample-Efficient Reinforcement Learning of Flow-Based Policies via Velocity-Reparameterized Sequential Modeling
by: Zhang, Yixian, et al.
Published: (2025)

Tighter Regret Bounds for Contextual Action-Set Reinforcement Learning
by: Chen, Zijun, et al.
Published: (2026)

Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
by: Sun, Mingyang, et al.
Published: (2025)

Provably Sample-Efficient Robust Reinforcement Learning with Average Reward
by: Roch, Zachary, et al.
Published: (2025)

Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
by: Woo, Jiin, et al.
Published: (2024)

Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)

Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
by: Song, Yuda, et al.
Published: (2024)

Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
by: Khattar, Vanshaj, et al.
Published: (2024)

Frozen Policy Iteration: Computationally Efficient RL under Linear $Q^π$ Realizability for Deterministic Dynamics
by: Ke, Yijing, et al.
Published: (2026)

Class-Balanced and Reinforced Active Learning on Graphs
by: Yu, Chengcheng, et al.
Published: (2024)

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
by: Liu, Vincent, et al.
Published: (2023)

On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
by: Corrado, Nicholas E., et al.
Published: (2023)

Policy Improvement Reinforcement Learning
by: Wang, Huaiyang, et al.
Published: (2026)

Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
by: Zhou, Zihan, et al.
Published: (2023)

Personalized Reinforcement Learning with a Budget of Policies
by: Ivanov, Dmitry, et al.
Published: (2024)

Behavior-Consistent Deep Reinforcement Learning
by: Hussing, Marcel, et al.
Published: (2026)

Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning
by: Wang, Guanquan, et al.
Published: (2025)

Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
by: Papoudakis, Georgios, et al.
Published: (2025)

Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
by: Shi, Chengshuai, et al.
Published: (2026)

Settling the Sample Complexity of Online Reinforcement Learning
by: Zhang, Zihan, et al.
Published: (2023)

Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning
by: Yang, Tong, et al.
Published: (2023)

Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)

Consistent Estimation of a Class of Distances Between Covariance Matrices
by: Pereira, Roberto, et al.
Published: (2024)