| Main Authors: | Ding, Shutong, Hu, Ke, Zhong, Shan, Luo, Haoyang, Zhang, Weinan, Wang, Jingya, Wang, Jun, Shi, Ye |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.18763 |
Similar Items
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization
by: Ding, Shutong, et al.
Published: (2024)
Distributional Reinforcement Learning with Diffusion Bridge Critics
by: Ding, Shutong, et al.
Published: (2026)
Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance
by: Ding, Shutong, et al.
Published: (2026)
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
by: Yang, Lingxiao, et al.
Published: (2024)
DreamPolicy: A Unified World-model Policy for Scalable Humanoid Locomotion
by: Fan, Yahao, et al.
Published: (2025)
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
by: Sun, Mingyang, et al.
Published: (2025)
Path-Space Mirror Descent for On-Policy Reinforcement Learning under the Generalized Schrödinger Bridge
by: Gong, Yuehu, et al.
Published: (2026)
FlowCritic: Bridging Value Estimation with Flow Matching in Reinforcement Learning
by: Zhong, Shan, et al.
Published: (2025)
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
by: Sun, Mingyang, et al.
Published: (2025)
Diffusion-based learning framework for Constrained Nonconvex Optimization with Weighted Bootstrapped Refinement
by: Ding, Shutong, et al.
Published: (2025)
Harmonizing Generalization and Personalization in Federated Prompt Learning
by: Cui, Tianyu, et al.
Published: (2024)
A Review of Online Diffusion Policy RL Algorithms for Scalable Robotic Control
by: Choi, Wonhyeok, et al.
Published: (2026)
Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation
by: Liu, Zhaoyang, et al.
Published: (2025)
Diffusion Models for Reinforcement Learning: A Survey
by: Zhu, Zhengbang, et al.
Published: (2023)
Reinforcing Language Agents via Policy Optimization with Action Decomposition
by: Wen, Muning, et al.
Published: (2024)
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
by: Li, Guanghe, et al.
Published: (2024)
Stabilizing Reinforcement Learning for Diffusion Language Models
by: Zhong, Jianyuan, et al.
Published: (2026)
CausalGDP: Causality-Guided Diffusion Policies for Reinforcement Learning
by: Xiao, Xiaofeng, et al.
Published: (2026)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
by: Wen, Muning, et al.
Published: (2024)
Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning
by: Shan, Zikang, et al.
Published: (2026)
Global and Local Prompts Cooperation via Optimal Transport for Federated Learning
by: Li, Hongxia, et al.
Published: (2024)
ReinforceGen: Hybrid Skill Policies with Automated Data Generation and Reinforcement Learning
by: Zhou, Zihan, et al.
Published: (2025)
Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects
by: Wang, Xihuai, et al.
Published: (2022)
DriveGen: Towards Infinite Diverse Traffic Scenarios with Large Models
by: Zhang, Shenyu, et al.
Published: (2025)
RIDER: 3D RNA Inverse Design with Reinforcement Learning-Guided Diffusion
by: Hu, Tianmeng, et al.
Published: (2026)
Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization
by: Mao, Wenyu, et al.
Published: (2024)
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning
by: Mao, Liyuan, et al.
Published: (2024)
One-Shot Federated Learning with Classifier-Free Diffusion Models
by: Zaland, Obaidullah, et al.
Published: (2025)
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
by: Zhang, Wenhao, et al.
Published: (2025)
Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
by: Wu, Shutong, et al.
Published: (2025)
Efficient Online Reinforcement Learning for Diffusion Policy
by: Ma, Haitong, et al.
Published: (2025)
Diffusion Policies for Risk-Averse Behavior Modeling in Offline Reinforcement Learning
by: Chen, Xiaocong, et al.
Published: (2024)
Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning
by: Luo, Yifu, et al.
Published: (2025)
Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)
MARFT: Multi-Agent Reinforcement Fine-Tuning
by: Liao, Junwei, et al.
Published: (2025)
DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning
by: Hu, Xuemin, et al.
Published: (2024)
From Human Labels to Literature: Semi-Supervised Learning of NMR Chemical Shifts at Scale
by: Jin, Yongqi, et al.
Published: (2026)
DPO Meets PPO: Reinforced Token Optimization for RLHF
by: Zhong, Han, et al.
Published: (2024)
Query-Policy Misalignment in Preference-Based Reinforcement Learning
by: Hu, Xiao, et al.
Published: (2023)
EvolveGen: Algorithmic Level Hardware Model Checking Benchmark Generation through Reinforcement Learning
by: Hu, Guangyu, et al.
Published: (2026)