Salvato in:
Dettagli Bibliografici
Autori principali: Liu, Wenzhang, Jin, Lianjun, Ren, Lu, Mu, Chaoxu, Sun, Changyin
Natura: Preprint
Pubblicazione: 2025
Soggetti:
Accesso online:https://arxiv.org/abs/2501.14543
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
Sommario:
  • Intelligent decision-making within large and redundant action spaces remains challenging in deep reinforcement learning. Considering similar but ineffective actions at each step can lead to repetitive and unproductive trials. Existing methods attempt to improve agent exploration by reducing or penalizing redundant actions, yet they fail to provide quantitative and reliable evidence to determine redundancy. In this paper, we propose a method to improve exploration efficiency by estimating the causal effects of actions. Unlike prior methods, our approach offers quantitative results regarding the causality of actions for one-step transitions. We first pre-train an inverse dynamics model to serve as prior knowledge of the environment. Subsequently, we classify actions across the entire action space at each time step and estimate the causal effect of each action to suppress redundant actions during exploration. We provide a theoretical analysis to demonstrate the effectiveness of our method and present empirical results from simulations in environments with redundant actions to evaluate its performance. Our implementation is available at https://github.com/agi-brain/cee.git.