Bibliographic Details
Main Authors: Song, Zhenyu; Zheng, Ronghao; Zhang, Senlin; Liu, Meiqin
Format: Preprint
Published: 2024
Subjects: Artificial Intelligence; Robotics
Online Access: https://arxiv.org/abs/2407.10403
_version_ 1866916327593082880
author Song, Zhenyu
Zheng, Ronghao
Zhang, Senlin
Liu, Meiqin
author_facet Song, Zhenyu
Zheng, Ronghao
Zhang, Senlin
Liu, Meiqin
contents The primary objective of Multi-Agent Pathfinding (MAPF) is to plan efficient, conflict-free paths for all agents. Traditional multi-agent path planning algorithms struggle to achieve efficient distributed path planning for multiple agents. In contrast, Multi-Agent Reinforcement Learning (MARL) has been demonstrated as an effective approach to this objective. By modeling the MAPF problem as a MARL problem, agents can achieve efficient path planning and collision avoidance through distributed strategies under partial observation. However, MARL strategies often lack cooperation among agents because global information is absent, which reduces MAPF efficiency. To address this challenge, this letter introduces a reward shaping technique based on Independent Q-Learning (IQL). The method evaluates the influence of one agent on its neighbors and integrates this interaction into the reward function, leading to active cooperation among agents while still operating in a distributed manner. The proposed approach has been evaluated through experiments across various scenarios with different scales and agent counts, and the results are compared with those of other state-of-the-art (SOTA) planners. The evidence suggests that the proposed approach performs comparably to other planners in many respects and outperforms them in scenarios featuring a large number of agents.
format Preprint
id arxiv_https___arxiv_org_abs_2407_10403
institution arXiv
publishDate 2024
record_format arxiv
title Cooperative Reward Shaping for Multi-Agent Pathfinding
topic Artificial Intelligence
Robotics
url https://arxiv.org/abs/2407.10403
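
The abstract in this record describes folding an agent's influence on its neighbors into its own reward while each agent still learns independently (IQL). The record does not give the paper's actual shaping formula, so the following is only a minimal sketch under stated assumptions: the mixing rule in `shaped_reward`, the coefficient `alpha`, and the class `IndependentQLearner` are hypothetical names and choices made for illustration, not the authors' method.

```python
from collections import defaultdict


def shaped_reward(own_reward, neighbor_rewards, alpha=0.5):
    """Blend an agent's own reward with the mean reward of its neighbors.

    ASSUMPTION: this mixing rule and `alpha` are illustrative only; the
    catalog record does not state the paper's actual shaping formula.
    """
    if not neighbor_rewards:
        # No neighbors in view: fall back to the agent's own reward.
        return own_reward
    cooperative_term = sum(neighbor_rewards) / len(neighbor_rewards)
    return (1.0 - alpha) * own_reward + alpha * cooperative_term


class IndependentQLearner:
    """Tabular Q-learner; each agent owns one instance (the IQL setting)."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.lr = lr
        self.gamma = gamma
        # Unseen states start with all-zero action values.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update, applied to the shaped reward."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])
        return self.q[state][action]


# Usage: one agent with two visible neighbors, one of which was penalized.
agent = IndependentQLearner(n_actions=5)
r = shaped_reward(own_reward=-0.1, neighbor_rewards=[-0.1, -1.0])
agent.update(state=(0, 0), action=1, reward=r, next_state=(0, 1))
```

Because the shaping term only needs the rewards of agents in the local neighborhood, each learner can still be trained and executed without global information, which is the property the abstract emphasizes.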