Inhaltsangabe: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Ding, Shutong, Zhong, Zejia, Wang, Zhongyi, Hu, Ke, Pan, Bikang, Wang, Jingya, Shi, Ye
Format:	Preprint
Veröffentlicht:	2026
Schlagworte:	Robotics Machine Learning
Online-Zugang:	https://arxiv.org/abs/2605.30056
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Inhaltsangabe:

Recent advances in reinforcement learning (RL) have achieved great successes by leveraging the multimodality and exploration capability of diffusion policies. Among these approaches, one representative branch focuses on the sampling-based policy optimization. This design enables better exploration capability of the diffusion model, particularly at the beginning of training, but suffer from low exploitation in Q-value information, resulting in a slow policy convergence. Another branch pays attention to gradient-based policy optimization, which sufficiently exploits the gradient of the Q function yet tends to collapse into a unimodal policy with low diversity. To address this issue, we propose CGPO, \textbf{C}ritic-\textbf{G}uided diffusion \textbf{P}olicy \textbf{O}ptimization, which effectively balances exploration and exploitation with the training-free guidance technique integrated into the denoising process of diffusion policy. Concretely, CGPO steers action generation toward high-value regions defined by the critic network and uses the guided actions as regression objectives. In this manner, CGPO reduces the time required to obtain high-quality actions and improves final performance with better balance between the exploration-exploitation tradeoff. We validate the effectiveness of CGPO on 5 MuJoCo locomotion tasks, and CGPO achieves state-of-the-art performance compared with existing diffusion-based RL methods. Notably, CGPO is the first success to incorporate diffusion policy into real-world RL, with its superior performance on Franka robot arm grasping tasks. Our official page is released at https://dingsht.tech/cgpo-webpage.

Ähnliche Einträge