Bibliographic Details
Main Authors: Taha, Feras Al; Rokade, Kiran; Parise, Francesca
Format: Preprint
Published: 2024
Subjects: Computer Science and Game Theory; Systems and Control; Dynamical Systems
Online Access: https://arxiv.org/abs/2408.06253
author Taha, Feras Al
Rokade, Kiran
Parise, Francesca
contents In this paper, we present a framework for multi-agent learning in a nonstationary dynamic network environment. More specifically, we examine projected gradient play in smooth monotone repeated network games in which the agents' participation and connectivity vary over time. We model this changing system with a stochastic network which takes a new independent realization at each repetition. We show that the strategy profile learned by the agents through projected gradient dynamics over the sequence of network realizations converges to a Nash equilibrium of the game in which players minimize their expected cost, almost surely and in the mean-square sense. We then show that the learned strategy profile is an almost Nash equilibrium of the game played by the agents at each stage of the repeated game with high probability. Using these two results, we derive non-asymptotic bounds on the regret incurred by the agents.
format Preprint
id arxiv_https___arxiv_org_abs_2408_06253
institution arXiv
publishDate 2024
record_format arxiv
title Learning in Time-Varying Monotone Network Games with Dynamic Populations
topic Computer Science and Game Theory
Systems and Control
Dynamical Systems
url https://arxiv.org/abs/2408.06253
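The abstract describes projected gradient play over a sequence of independent network realizations. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a linear-quadratic network game with cost J_i(x) = 0.5 x_i^2 - x_i (b + delta * sum_j G_ij x_j), a box strategy set [0, x_max], edges drawn independently with probability p at each stage, and a diminishing step size 1/(k+1). All function names and parameter values are hypothetical, and time-varying participation of agents is not modeled here.

```python
import random


def sample_network(n, p, rng):
    """Draw one symmetric 0/1 adjacency matrix; each edge appears independently with probability p."""
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                g[i][j] = g[j][i] = 1.0
    return g


def projected_gradient_play(n=5, p=0.5, delta=0.1, b=1.0, x_max=5.0,
                            steps=20000, seed=0):
    """Run projected gradient play against a fresh network realization at every stage."""
    rng = random.Random(seed)
    x = [0.0] * n
    for k in range(steps):
        g = sample_network(n, p, rng)   # new independent realization
        tau = 1.0 / (k + 1)             # assumed diminishing step size
        # Partial gradient of each player's assumed cost with respect to its own strategy.
        grad = [
            x[i] - b - delta * sum(g[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        # Gradient step followed by projection onto the box [0, x_max].
        x = [min(x_max, max(0.0, x[i] - tau * grad[i])) for i in range(n)]
    return x


def expected_game_equilibrium(n=5, p=0.5, delta=0.1, b=1.0):
    """Symmetric interior equilibrium of the game played on the expected network.

    The expected adjacency matrix has p in every off-diagonal entry, so by
    symmetry each x_i equals c, where c = b + delta * p * (n - 1) * c.
    """
    return b / (1.0 - delta * p * (n - 1))


if __name__ == "__main__":
    learned = projected_gradient_play()
    target = expected_game_equilibrium()
    print("learned profile:", [round(v, 3) for v in learned])
    print("expected-game equilibrium (each player):", target)
```

Under these assumptions the cost is linear in the network, so the expected-cost game is the same game played on the expected network, which makes the limit point easy to compute in closed form and compare against the learned profile.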