Bibliographic Details
Main authors: Krastev, Sekoul, Sweatman, Hilary, Sternisko, Anni, Rathje, Steve
Format: Preprint
Published: 2025
Subjects: Human-Computer Interaction
Online access: https://arxiv.org/abs/2511.22746
_version_ 1866915643039678464
author Krastev, Sekoul
Sweatman, Hilary
Sternisko, Anni
Rathje, Steve
author_facet Krastev, Sekoul
Sweatman, Hilary
Sternisko, Anni
Rathje, Steve
contents As large language models (LLMs) rapidly displace traditional expertise, their capacity to correct misinformation has become a core concern. We investigate the idea that prompt framing systematically modulates misinformation correction - something we term 'epistemic fragility'. We manipulated prompts by open-mindedness, user intent, user role, and complexity. Across ten misinformation domains, we generated 320 prompts and elicited 2,560 responses from four frontier LLMs, which were coded for strength of misinformation correction and rectification strategy use. Analyses showed that creative intent, expert role, and closed framing led to a significant reduction in correction likelihood and in the effectiveness of the strategies used. We also found striking model differences: Gemini 2.5 Pro had 74% lower odds of strong correction than Claude Sonnet 4.5. These findings highlight epistemic fragility as an important structural property of LLMs, challenging current guardrails and underscoring the need for alignment strategies that prioritize epistemic integrity over conversational compliance.
format Preprint
id arxiv_https___arxiv_org_abs_2511_22746
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Epistemic Fragility in Large Language Models: Prompt Framing Systematically Modulates Misinformation Correction
Krastev, Sekoul
Sweatman, Hilary
Sternisko, Anni
Rathje, Steve
Human-Computer Interaction
title Epistemic Fragility in Large Language Models: Prompt Framing Systematically Modulates Misinformation Correction
topic Human-Computer Interaction
url https://arxiv.org/abs/2511.22746