UNIMARC/MARC :: Library Catalog

Bibliographic Details
Main Author:	Nowickij (Navitski), Kirill Vladimirovich
Format:	Digital resource
Language:	English
Published:	Zenodo 2026
Subjects:	theatrical compliance; large language models; AI reasoning quality; cognitive process evaluation; prompt engineering; metacognitive systems; AI alignment; chain-of-thought prompting; AI reasoning; language model failure modes; LLM reasoning evaluation; pseudo-reasoning; cognitive emptiness; AI safety; model interpretability; faithfulness of reasoning; evaluation frameworks; reasoning quality metrics; LLM auditing; cognitive bias in LLMs; reasoning traces; deep learning failures; language model reliability; AI risk assessment; human-AI interaction
Online Access:	https://doi.org/10.5281/zenodo.19628186

_version_	1866901272691474432
author	Nowickij (Navitski), Kirill Vladimirovich
author_facet	Nowickij (Navitski), Kirill Vladimirovich
contents	<p>Abstract<br>There is a failure mode in large language models that we do not have a good name for, and that we therefore tend not to treat seriously enough. It is not hallucination: the model is not asserting something false. It is not refusal: the model answers at length. It is the production of responses that carry the complete outward form of careful reasoning while the cognitive work that reasoning is supposed to represent has not, in any meaningful sense, occurred. We call this theatrical compliance, and we argue that it is, in practical terms, more dangerous than either of the failure modes that currently dominate alignment research. This paper identifies the phenomenon, characterizes its five principal forms, explains the asymmetry that makes it particularly costly in high-stakes settings, and outlines the design requirements for systems intended to resist it. We do not describe such a system in detail here. Our goal is to establish theatrical compliance as a research problem in its own right and to argue that addressing it requires instruments operating at a fundamentally different level of abstraction than task-level prompting frameworks.<br>Keywords: theatrical compliance, large language models, AI reasoning quality, cognitive process evaluation, prompt engineering, metacognitive systems.</p>
format	Recurso digital
id	zenodo_https___doi_org_10_5281_zenodo_19628186
institution	Zenodo
language	eng
publishDate	2026
publisher	Zenodo
record_format	zenodo
title	Theatrical Compliance: A Failure Mode in Large Language Models
topic	theatrical compliance large language models AI reasoning quality cognitive process evaluation prompt engineering metacognitive systems AI alignment chain-of-thought prompting AI reasoning language model failure modes LLM reasoning evaluation pseudo-reasoning cognitive emptiness AI safety model interpretability faithfulness of reasoning evaluation frameworks reasoning quality metrics LLM auditing cognitive bias in LLMs reasoning traces deep learning failures language model reliability AI risk assessment human-AI interaction
url	https://doi.org/10.5281/zenodo.19628186
