Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Kamel, Adam; Rastogi, Tanish; Ma, Michael; Ranganathan, Kailash; Zhu, Kevin
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2512.23722

_version_	1866911344549167104
author	Kamel, Adam; Rastogi, Tanish; Ma, Michael; Ranganathan, Kailash; Zhu, Kevin
author_facet	Kamel, Adam; Rastogi, Tanish; Ma, Michael; Ranganathan, Kailash; Zhu, Kevin
contents	Transformer-based large language models (LLMs) have demonstrated strong reasoning abilities across diverse fields, from solving programming challenges to competing in strategy-intensive games such as chess. Prior work has shown that LLMs can develop emergent world models in games of perfect information, where internal representations correspond to latent states of the environment. In this paper, we extend this line of investigation to domains of incomplete information, focusing on poker as a canonical partially observable Markov decision process (POMDP). We pretrain a GPT-style model on Poker Hand History (PHH) data and probe its internal activations. Our results demonstrate that the model learns both deterministic structure, such as hand ranks, and stochastic features, such as equity, without explicit instruction. Furthermore, using primarily nonlinear probes, we show that these representations are decodable and correlate with theoretical belief states, suggesting that LLMs are learning their own representation of the stochastic environment of Texas Hold'em Poker.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_23722
institution	arXiv
publishDate	2025
record_format	arxiv
title	Emergent World Beliefs: Exploring Transformers in Stochastic Games
topic	Computation and Language
url	https://arxiv.org/abs/2512.23722
