Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.10029 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866915851587813376 |
|---|---|
| author | Pechon-Elkins, Mateo Chun, Jon |
| author_facet | Pechon-Elkins, Mateo Chun, Jon |
| contents | Theory of Mind benchmarks for large language models typically produce aggregate scores without theoretical grounding, making it unclear whether high performance reflects strategic reasoning or surface-level heuristics. We introduce a game-theoretic evaluation framework grounded in quantal response equilibrium (QRE). We derive closed-form equilibria for four strategic games, each targeting a distinct cognitive capability. We estimate QRE rationality parameters lambda that place model behavior on a continuous scale calibrated against human data (lambda_human in [1.0, 2.5]), and establish finite-sample convergence bounds via martingale concentration. Validation across 1,855 games with seven frontier models (plus four expansion models) confirms predictions: bluff rates converge to within 4% of equilibrium, lambda estimates range from 0.05 to 1.10 across games and models with substantial cross-model variation, and capability profiles differ across cognitive axes. Robustness analyses reveal high sensitivity to prompt framing and version instability in QRE rankings, highlighting the need for standardized protocols. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2603_10029 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| spellingShingle | Quantal Response Equilibrium as a Measure of Strategic Sophistication: Theory and Validation for LLM Evaluation Pechon-Elkins, Mateo Chun, Jon Computer Science and Game Theory I.2.1; J.4 Theory of Mind benchmarks for large language models typically produce aggregate scores without theoretical grounding, making it unclear whether high performance reflects strategic reasoning or surface-level heuristics. We introduce a game-theoretic evaluation framework grounded in quantal response equilibrium (QRE). We derive closed-form equilibria for four strategic games, each targeting a distinct cognitive capability. We estimate QRE rationality parameters lambda that place model behavior on a continuous scale calibrated against human data (lambda_human in [1.0, 2.5]), and establish finite-sample convergence bounds via martingale concentration. Validation across 1,855 games with seven frontier models (plus four expansion models) confirms predictions: bluff rates converge to within 4% of equilibrium, lambda estimates range from 0.05 to 1.10 across games and models with substantial cross-model variation, and capability profiles differ across cognitive axes. Robustness analyses reveal high sensitivity to prompt framing and version instability in QRE rankings, highlighting the need for standardized protocols. |
| title | Quantal Response Equilibrium as a Measure of Strategic Sophistication: Theory and Validation for LLM Evaluation |
| topic | Computer Science and Game Theory I.2.1; J.4 |
| url | https://arxiv.org/abs/2603.10029 |