Bibliographic Details
Main Authors:	Pechon-Elkins, Mateo; Chun, Jon
Format:	Preprint
Published:	2026
Subjects:	Computer Science and Game Theory; I.2.1; J.4
Online Access:	https://arxiv.org/abs/2603.10029

Table of Contents:

Theory of Mind benchmarks for large language models typically produce aggregate scores without theoretical grounding, making it unclear whether high performance reflects strategic reasoning or surface-level heuristics. We introduce a game-theoretic evaluation framework grounded in quantal response equilibrium (QRE). We derive closed-form equilibria for four strategic games, each targeting a distinct cognitive capability. We estimate QRE rationality parameters lambda that place model behavior on a continuous scale calibrated against human data (lambda_human in [1.0, 2.5]), and establish finite-sample convergence bounds via martingale concentration. Validation across 1,855 games with seven frontier models (plus four expansion models) confirms predictions: bluff rates converge to within 4% of equilibrium, lambda estimates range from 0.05 to 1.10 across games and models with substantial cross-model variation, and capability profiles differ across cognitive axes. Robustness analyses reveal high sensitivity to prompt framing and version instability in QRE rankings, highlighting the need for standardized protocols.
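
For readers unfamiliar with the rationality parameter lambda, the sketch below illustrates one common way such a parameter can be fit. It assumes the standard logit form of quantal response, in which an action is chosen with probability proportional to exp(lambda * utility), and it treats the expected utilities as fixed inputs rather than solving for the equilibrium fixed point. The function names, the grid search, and the example counts are hypothetical illustrations and are not taken from the preprint, which may use a different estimator.

```python
import math


def logit_response(utilities, lam):
    """Logit choice probabilities: P(a) is proportional to exp(lam * u(a)).

    lam = 0 gives uniform random play; larger lam concentrates
    probability on the highest-utility action.
    """
    scaled = [lam * u for u in utilities]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(lam, utilities, counts):
    """Log-likelihood of observed action counts under the logit model."""
    probs = logit_response(utilities, lam)
    return sum(c * math.log(p) for c, p in zip(counts, probs))


def estimate_lambda(utilities, counts, grid=None):
    """Grid-search maximum-likelihood estimate of lam (illustrative only)."""
    if grid is None:
        # Hypothetical search range: 0.00 to 5.00 in steps of 0.01.
        grid = [i / 100 for i in range(0, 501)]
    return max(grid, key=lambda lam: log_likelihood(lam, utilities, counts))


# Hypothetical data: two actions with utilities 1 and 0,
# the better action chosen 73 times out of 100.
example_lambda = estimate_lambda([1.0, 0.0], [73, 27])
```

In this two-action case the estimate is close to ln(73/27), since the fitted choice probability must match the observed frequency. A value near 0 would indicate near-random play, while values in the abstract's human range of [1.0, 2.5] would indicate a stronger tendency to choose higher-payoff actions.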

Similar Items