Bibliographic Details
Main Authors:	Krastev, Sekoul; Sweatman, Hilary; Sternisko, Anni; Rathje, Steve
Format:	Preprint
Published:	2025
Subjects:	Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2511.22746

_version_	1866915643039678464
author	Krastev, Sekoul; Sweatman, Hilary; Sternisko, Anni; Rathje, Steve
author_facet	Krastev, Sekoul; Sweatman, Hilary; Sternisko, Anni; Rathje, Steve
contents	As large language models (LLMs) rapidly displace traditional expertise, their capacity to correct misinformation has become a core concern. We investigate the idea that prompt framing systematically modulates misinformation correction - something we term 'epistemic fragility'. We manipulated prompts by open-mindedness, user intent, user role, and complexity. Across ten misinformation domains, we generated 320 prompts and elicited 2,560 responses from four frontier LLMs, which were coded for strength of misinformation correction and rectification strategy use. Analyses showed that creative intent, expert role, and closed framing led to a significant reduction in correction likelihood and effectiveness of used strategy. We also found striking model differences: Gemini 2.5 Pro had 74% lower odds of strong correction than Claude Sonnet 4.5. These findings highlight epistemic fragility as an important structural property of LLMs, challenging current guardrails and underscoring the need for alignment strategies that prioritize epistemic integrity over conversational compliance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_22746
institution	arXiv
publishDate	2025
record_format	arxiv
title	Epistemic Fragility in Large Language Models: Prompt Framing Systematically Modulates Misinformation Correction
topic	Human-Computer Interaction
url	https://arxiv.org/abs/2511.22746