Bibliographic Details
Main Authors: Pechon-Elkins, Mateo, Chun, Jon
Format: Preprint
Published: 2026
Subjects: Computer Science and Game Theory; I.2.1; J.4
Online Access: https://arxiv.org/abs/2603.10029
_version_ 1866915851587813376
author Pechon-Elkins, Mateo
Chun, Jon
author_facet Pechon-Elkins, Mateo
Chun, Jon
contents Theory of Mind benchmarks for large language models typically produce aggregate scores without theoretical grounding, making it unclear whether high performance reflects strategic reasoning or surface-level heuristics. We introduce a game-theoretic evaluation framework grounded in quantal response equilibrium (QRE). We derive closed-form equilibria for four strategic games, each targeting a distinct cognitive capability. We estimate QRE rationality parameters lambda that place model behavior on a continuous scale calibrated against human data (lambda_human in [1.0, 2.5]), and establish finite-sample convergence bounds via martingale concentration. Validation across 1,855 games with seven frontier models (plus four expansion models) confirms predictions: bluff rates converge to within 4% of equilibrium, lambda estimates range from 0.05 to 1.10 across games and models with substantial cross-model variation, and capability profiles differ across cognitive axes. Robustness analyses reveal high sensitivity to prompt framing and version instability in QRE rankings, highlighting the need for standardized protocols.
format Preprint
id arxiv_https___arxiv_org_abs_2603_10029
institution arXiv
publishDate 2026
record_format arxiv
title Quantal Response Equilibrium as a Measure of Strategic Sophistication: Theory and Validation for LLM Evaluation
topic Computer Science and Game Theory
I.2.1; J.4
url https://arxiv.org/abs/2603.10029