| Main Authors: | Ding, Zihan; Jin, Chi |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.16984 |
Similar Items
Generative Diffusion Modeling: A Practical Handbook
by: Ding, Zihan, et al.
Published: (2024)
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
by: Li, Wenzhe, et al.
Published: (2024)
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2024)
Single-stream Policy Optimization
by: Xu, Zhongwen, et al.
Published: (2025)
Pure Exploration for a Good Policy in Reinforcement Learning with Bandit Feedback
by: Li, Zitian, et al.
Published: (2026)
Reinforcement Learning in High-frequency Market Making
by: Zheng, Yuheng, et al.
Published: (2024)
Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
by: Ding, Zihan, et al.
Published: (2024)
ReinforceGen: Hybrid Skill Policies with Automated Data Generation and Reinforcement Learning
by: Zhou, Zihan, et al.
Published: (2025)
Efficient Multi-Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)
Efficient Online Reinforcement Learning for Diffusion Policy
by: Ma, Haitong, et al.
Published: (2025)
Provably Efficient Exploration in Policy Optimization
by: Cai, Qi, et al.
Published: (2019)
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
by: Ding, Yuyang, et al.
Published: (2025)
Pausing Policy Learning in Non-stationary Reinforcement Learning
by: Lee, Hyunin, et al.
Published: (2024)
LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
by: Karten, Seth, et al.
Published: (2025)
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
by: Chen, Claire, et al.
Published: (2024)
GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
by: Ding, Shutong, et al.
Published: (2025)
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
by: Zhang, Jing, et al.
Published: (2023)
SAC Flow: Sample-Efficient Reinforcement Learning of Flow-Based Policies via Velocity-Reparameterized Sequential Modeling
by: Zhang, Yixian, et al.
Published: (2025)
Tighter Regret Bounds for Contextual Action-Set Reinforcement Learning
by: Chen, Zijun, et al.
Published: (2026)
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
by: Sun, Mingyang, et al.
Published: (2025)
Provably Sample-Efficient Robust Reinforcement Learning with Average Reward
by: Roch, Zachary, et al.
Published: (2025)
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
by: Woo, Jiin, et al.
Published: (2024)
Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
by: Song, Yuda, et al.
Published: (2024)
Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
by: Khattar, Vanshaj, et al.
Published: (2024)
Frozen Policy Iteration: Computationally Efficient RL under Linear $Q^π$ Realizability for Deterministic Dynamics
by: Ke, Yijing, et al.
Published: (2026)
Class-Balanced and Reinforced Active Learning on Graphs
by: Yu, Chengcheng, et al.
Published: (2024)
When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
by: Liu, Vincent, et al.
Published: (2023)
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
by: Corrado, Nicholas E., et al.
Published: (2023)
Policy Improvement Reinforcement Learning
by: Wang, Huaiyang, et al.
Published: (2026)
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
by: Zhou, Zihan, et al.
Published: (2023)
Personalized Reinforcement Learning with a Budget of Policies
by: Ivanov, Dmitry, et al.
Published: (2024)
Behavior-Consistent Deep Reinforcement Learning
by: Hussing, Marcel, et al.
Published: (2026)
Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning
by: Wang, Guanquan, et al.
Published: (2025)
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
by: Papoudakis, Georgios, et al.
Published: (2025)
Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
by: Shi, Chengshuai, et al.
Published: (2026)
Settling the Sample Complexity of Online Reinforcement Learning
by: Zhang, Zihan, et al.
Published: (2023)
Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning
by: Yang, Tong, et al.
Published: (2023)
Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)
Consistent Estimation of a Class of Distances Between Covariance Matrices
by: Pereira, Roberto, et al.
Published: (2024)