MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	Weis, Marissa A.; Wołczyk, Maciej; Nasser, Rajai; Saurous, Rif A.; Arcas, Blaise Agüera y; Sacramento, João; Meulemans, Alexander
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.16301

_version_	1866910025682780160
author	Weis, Marissa A.; Wołczyk, Maciej; Nasser, Rajai; Saurous, Rif A.; Arcas, Blaise Agüera y; Sacramento, João; Meulemans, Alexander
author_facet	Weis, Marissa A.; Wołczyk, Maciej; Nasser, Rajai; Saurous, Rif A.; Arcas, Blaise Agüera y; Sacramento, João; Meulemans, Alexander
contents	Achieving cooperation among self-interested agents remains a fundamental challenge in multi-agent reinforcement learning. Recent work showed that mutual cooperation can be induced between "learning-aware" agents that account for and shape the learning dynamics of their co-players. However, existing approaches typically rely on hardcoded, often inconsistent, assumptions about co-player learning rules or enforce a strict separation between "naive learners" updating on fast timescales and "meta-learners" observing these updates. Here, we demonstrate that the in-context learning capabilities of sequence models allow for co-player learning awareness without requiring hardcoded assumptions or explicit timescale separation. We show that training sequence model agents against a diverse distribution of co-players naturally induces in-context best-response strategies, effectively functioning as learning algorithms on the fast intra-episode timescale. We find that the cooperative mechanism identified in prior work, where vulnerability to extortion drives mutual shaping, emerges naturally in this setting: in-context adaptation renders agents vulnerable to extortion, and the resulting mutual pressure to shape the opponent's in-context learning dynamics resolves into the learning of cooperative behavior. Our results suggest that standard decentralized reinforcement learning on sequence models combined with co-player diversity provides a scalable path to learning cooperative behaviors.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_16301
institution	arXiv
publishDate	2026
record_format	arxiv
title	Multi-agent cooperation through in-context co-player inference
topic	Artificial Intelligence
url	https://arxiv.org/abs/2602.16301
