| Main Authors: | Yang, Yuqing; Jia, Robin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.16170 |
Similar Items
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
by: Kamoi, Ryo, et al.
Published: (2024)
Retrieval Augmented Question Answering: When Should LLMs Admit Ignorance?
by: Wang, Dingmin, et al.
Published: (2025)
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns
by: Liu, Naiming, et al.
Published: (2025)
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
by: Chang, Ting-Yun, et al.
Published: (2023)
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning
by: Tong, Yongqi, et al.
Published: (2024)
Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts
by: Oğuz, Metehan, et al.
Published: (2024)
Do LLMs Truly Understand When a Precedent Is Overruled?
by: Zhang, Li, et al.
Published: (2025)
PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking
by: Zhu, Wang Bill, et al.
Published: (2026)
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
by: Jiang, Zhuoxuan, et al.
Published: (2024)
When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
by: Xiao, Boyu, et al.
Published: (2026)
When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models
by: Chang, Ting-Yun, et al.
Published: (2024)
Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models
by: Madhusudhan, Nishanth, et al.
Published: (2024)
In-Context Principle Learning from Mistakes
by: Zhang, Tianjun, et al.
Published: (2024)
Towards Reward Modeling for AI Tutors in Math Mistake Remediation
by: Petukhova, Kseniia, et al.
Published: (2026)
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation
by: Ni, Shiyu, et al.
Published: (2024)
Metacognitive Prompting Improves Understanding in Large Language Models
by: Wang, Yuqing, et al.
Published: (2023)
Probing the Lack of Stable Internal Beliefs in LLMs
by: Luo, Yifan, et al.
Published: (2026)
Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency
by: Birhane, Abeba, et al.
Published: (2024)
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
by: Singh, Joykirat, et al.
Published: (2024)
Retrieved In-Context Principles from Previous Mistakes
by: Sun, Hao, et al.
Published: (2024)
Causal Understanding by LLMs: The Role of Uncertainty
by: Lithgow-Serrano, Oscar, et al.
Published: (2025)
When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
by: Li, Yinghui, et al.
Published: (2024)
Self-Evolving LLM Memory Extraction Across Heterogeneous Tasks
by: Yang, Yuqing, et al.
Published: (2026)
Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions
by: Sobotka, Jan, et al.
Published: (2026)
How Well Do LLMs Understand Tunisian Arabic?
by: Mahdi, Mohamed
Published: (2025)
Transparent and Coherent Procedural Mistake Detection
by: Storks, Shane, et al.
Published: (2024)
Vulnerability of LLMs' Stated Beliefs? LLMs Belief Resistance Check Through Strategic Persuasive Conversation Interventions
by: Huang, Fan, et al.
Published: (2026)
Do LLMs Signal When They're Right? Evidence from Neuron Agreement
by: Chen, Kang, et al.
Published: (2025)
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
by: Sun, Zhongxiang, et al.
Published: (2026)
Learning-From-Mistakes Prompting for Indigenous Language Translation
by: Liao, You-Cheng, et al.
Published: (2024)
Teaching Models to Understand (but not Generate) High-risk Data
by: Wang, Ryan, et al.
Published: (2025)
Understanding the Role of LLMs in Multimodal Evaluation Benchmarks
by: Jiang, Botian, et al.
Published: (2024)
Zero-Shot Belief: A Hard Problem for LLMs
by: Murzaku, John, et al.
Published: (2025)
Using LLMs to Model the Beliefs and Preferences of Targeted Populations
by: Namikoshi, Keiichi, et al.
Published: (2024)
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
by: Bai, Jun, et al.
Published: (2025)
Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
by: Bortoletto, Matteo, et al.
Published: (2024)
Do LLMs Play Dice? Exploring Probability Distribution Sampling in Large Language Models for Behavioral Simulation
by: Gu, Jia, et al.
Published: (2024)
When More is Less: Understanding Chain-of-Thought Length in LLMs
by: Wu, Yuyang, et al.
Published: (2025)
When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
by: Xu, Haoming, et al.
Published: (2026)
Rectifying Belief Space via Unlearning to Harness LLMs' Reasoning
by: Niwa, Ayana, et al.
Published: (2025)