MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Aydeniz, Ayhan Alp; Loftin, Robert; Tumer, Kagan
Format:	Preprint
Published:	2026
Subjects:	Multiagent Systems; Robotics
Online Access:	https://arxiv.org/abs/2602.11740

_version_	1866911442606751744
author	Aydeniz, Ayhan Alp; Loftin, Robert; Tumer, Kagan
author_facet	Aydeniz, Ayhan Alp; Loftin, Robert; Tumer, Kagan
contents	Efficient exploration is critical for multiagent systems to discover coordinated strategies, particularly in open-ended domains such as search and rescue or planetary surveying. However, when exploration is encouraged only at the individual agent level, it often leads to redundancy, as agents act without awareness of how their teammates are exploring. In this work, we introduce Counterfactual Conditional Likelihood (CCL) rewards, which score each agent's exploration by isolating its unique contribution to team exploration. Unlike prior methods that reward agents solely for the novelty of their individual observations, CCL emphasizes observations that are informative with respect to the joint exploration of the team. Experiments in continuous multiagent domains show that CCL rewards accelerate learning for domains with sparse team rewards, where most joint actions yield zero rewards, and are particularly effective in tasks that require tight coordination among agents.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_11740
institution	arXiv
publishDate	2026
record_format	arxiv
title	Counterfactual Conditional Likelihood Rewards for Multiagent Exploration
topic	Multiagent Systems; Robotics
url	https://arxiv.org/abs/2602.11740