Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Author:	Cao, Wenjun
Format:	Preprint
Published:	2025
Subjects:	Machine Learning Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.18766
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866915259848065024
author	Cao, Wenjun
author_facet	Cao, Wenjun
contents	Reinforcement learning (RL) suffers from severe sample inefficiency, especially during early training, requiring extensive environmental interactions to perform competently. Existing methods tend to solve this by incorporating prior knowledge, but introduce significant architectural and implementation complexity. We propose Dynamic Action Interpolation (DAI), a universal yet straightforward framework that interpolates expert and RL actions via a time-varying weight $α(t)$, integrating into any Actor-Critic algorithm with just a few lines of code and without auxiliary networks or additional losses. Our theoretical analysis shows that DAI reshapes state visitation distributions to accelerate value function learning while preserving convergence guarantees. Empirical evaluations across MuJoCo continuous control tasks demonstrate that DAI improves early-stage performance by over 160\% on average and final performance by more than 50\%, with the Humanoid task showing a 4$\times$ improvement early on and a 2$\times$ gain at convergence. These results challenge the assumption that complex architectural modifications are necessary for sample-efficient reinforcement learning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_18766
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance Cao, Wenjun Machine Learning Artificial Intelligence Reinforcement learning (RL) suffers from severe sample inefficiency, especially during early training, requiring extensive environmental interactions to perform competently. Existing methods tend to solve this by incorporating prior knowledge, but introduce significant architectural and implementation complexity. We propose Dynamic Action Interpolation (DAI), a universal yet straightforward framework that interpolates expert and RL actions via a time-varying weight $α(t)$, integrating into any Actor-Critic algorithm with just a few lines of code and without auxiliary networks or additional losses. Our theoretical analysis shows that DAI reshapes state visitation distributions to accelerate value function learning while preserving convergence guarantees. Empirical evaluations across MuJoCo continuous control tasks demonstrate that DAI improves early-stage performance by over 160\% on average and final performance by more than 50\%, with the Humanoid task showing a 4$\times$ improvement early on and a 2$\times$ gain at convergence. These results challenge the assumption that complex architectural modifications are necessary for sample-efficient reinforcement learning.
title	Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
topic	Machine Learning Artificial Intelligence
url	https://arxiv.org/abs/2504.18766

Similar Items