Staff View: :: Library Catalog

Bibliographic Details
Main Author:	Li, Jiamian
Format:	Preprint
Published:	2024
Subjects:	Machine Learning Artificial Intelligence Multiagent Systems
Online Access:	https://arxiv.org/abs/2410.11642

_version_	1866913560687280128
author	Li, Jiamian
author_facet	Li, Jiamian
contents	Reinforcement learning has achieved remarkable success in perfect information games such as Go and Atari, enabling agents to compete at the highest levels against human players. However, research on reinforcement learning for imperfect information games has been relatively limited because of their more complex game structures and randomness. Traditional methods are difficult to train and improve in imperfect information games because of issues such as inaccurate Q value estimation and reward sparsity. In this paper, we focus on Uno, an imperfect information game, and aim to address these problems by reducing Q value overestimation and reshaping the reward function. We propose a novel algorithm that uses Monte Carlo Tree Search to average the value estimates in the Q function. Although we choose Double Deep Q Learning as the foundational framework in this paper, our method can be generalized to any algorithm that needs Q value estimation, such as Actor-Critic. Additionally, we employ Monte Carlo Tree Search to reshape the reward structure in the game environment. We compare our algorithm with several traditional methods applied to games, such as Double Deep Q Learning, Deep Monte Carlo, and Neural Fictitious Self Play. The experiments demonstrate that our algorithm consistently outperforms these approaches, especially as the number of players in Uno increases and the game becomes more difficult.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_11642
institution	arXiv
publishDate	2024
record_format	arxiv
title	Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search
topic	Machine Learning Artificial Intelligence Multiagent Systems
url	https://arxiv.org/abs/2410.11642

Similar Items