Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Chatterjee, Krishnendu, Lurie, David, Saona, Raimundo, Ziliotto, Bruno
Formato:	Preprint
Publicado:	2026
Materias:	Optimization and Control
Acceso en línea:	https://arxiv.org/abs/2602.06480
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866911427448537088
author	Chatterjee, Krishnendu Lurie, David Saona, Raimundo Ziliotto, Bruno
author_facet	Chatterjee, Krishnendu Lurie, David Saona, Raimundo Ziliotto, Bruno
contents	In \emph{zero-sum two-player hidden stochastic games}, players observe partial information about the state. We address: $(i)$ the existence of the \emph{uniform value}, i.e., a limiting average payoff that both players can guarantee for sufficiently long durations, and $(ii)$ the existence of an algorithm to approximate it. Previous work shows that, in the general case, the uniform value may fail to exist, and, even when it does, there need not exist an algorithm to compute or approximate it. Therefore, we consider the \emph{Doeblin condition} in hidden stochastic games, requiring that, after a sufficiently long time, the posterior beliefs have a uniformly positive probability of resetting to one of finitely many neighborhoods in the belief space. We prove the existence of the uniform value and provide an algorithm to approximate it. We identify sufficient conditions, namely \emph{ergodicity} in the blind setting (when the signal is uninformative) and \emph{primitivity} in the hidden setting (when there are multiple signals). Moreover, we show that, in the hidden setting, ergodicity does not guarantee the Doeblin condition. Our results are new even for the one-player setting, i.e., partially observable Markov decision processes.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_06480
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Approximating the Uniform Value in Hidden Stochastic Games with Doeblin Conditions Chatterjee, Krishnendu Lurie, David Saona, Raimundo Ziliotto, Bruno Optimization and Control In \emph{zero-sum two-player hidden stochastic games}, players observe partial information about the state. We address: $(i)$ the existence of the \emph{uniform value}, i.e., a limiting average payoff that both players can guarantee for sufficiently long durations, and $(ii)$ the existence of an algorithm to approximate it. Previous work shows that, in the general case, the uniform value may fail to exist, and, even when it does, there need not exist an algorithm to compute or approximate it. Therefore, we consider the \emph{Doeblin condition} in hidden stochastic games, requiring that, after a sufficiently long time, the posterior beliefs have a uniformly positive probability of resetting to one of finitely many neighborhoods in the belief space. We prove the existence of the uniform value and provide an algorithm to approximate it. We identify sufficient conditions, namely \emph{ergodicity} in the blind setting (when the signal is uninformative) and \emph{primitivity} in the hidden setting (when there are multiple signals). Moreover, we show that, in the hidden setting, ergodicity does not guarantee the Doeblin condition. Our results are new even for the one-player setting, i.e., partially observable Markov decision processes.
title	Approximating the Uniform Value in Hidden Stochastic Games with Doeblin Conditions
topic	Optimization and Control
url	https://arxiv.org/abs/2602.06480

Ejemplares similares