| Main Author: | Pandey, Manav |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.19117 |
Similar Items
Do Androids Know They're Only Dreaming of Electric Sheep?
by: CH-Wang, Sky, et al.
Published: (2023)
Know When You're Wrong: Aligning Confidence with Correctness for LLM Error Detection
by: Xiaohu, Xie, et al.
Published: (2026)
Letting Others Know How They're Doing.
by: Hartzell, Gary
Published: (1993)
AgreeMate: Teaching LLMs to Haggle
by: Chatterjee, Ainesh, et al.
Published: (2024)
They're All Doctors: Synthesizing Diverse Counterfactuals to Mitigate Associative Bias
by: Magid, Salma Abdel, et al.
Published: (2024)
Can AI Agents Agree?
by: Berdoz, Frédéric, et al.
Published: (2026)
Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment
by: Burleigh, Tyler
Published: (2026)
Do Two AI Scientists Agree?
by: Fu, Xinghong, et al.
Published: (2025)
They're Back! Invite Them In!
by: Barron, Daniel D.
Published: (1992)
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
by: Petrov, Ivo, et al.
Published: (2025)
Easy Problems That LLMs Get Wrong
by: Williams, Sean, et al.
Published: (2024)
Do Language Models Know When They're Hallucinating References?
by: Agrawal, Ayush, et al.
Published: (2023)
Europe. They're
Published: (2002)
Consistency Training Helps Stop Sycophancy and Jailbreaks
by: Irpan, Alex, et al.
Published: (2025)
A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy
by: O'Brien, Claire, et al.
Published: (2026)
Adapt, Agree, Aggregate: Semi-Supervised Ensemble Labeling for Graph Convolutional Networks
by: Abdolali, Maryam, et al.
Published: (2025)
LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling
by: Cao, Qi, et al.
Published: (2026)
"Patriarchy Hurts Men Too." Does Your Model Agree? A Discussion on Fairness Assumptions
by: Favier, Marco, et al.
Published: (2024)
Bayesian Mixture-of-Experts: Towards Making LLMs Know What They Don't Know
by: Li, Albus Yizhuo
Published: (2025)
Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs
by: Sahoo, Subramanyam
Published: (2026)
Towards Understanding Sycophancy in Language Models
by: Sharma, Mrinank, et al.
Published: (2023)
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications
by: Zhao, Zhenyu, et al.
Published: (2026)
They're happy and they know it
by: Anscombe, Nadya
Published: (2005)
They're redesigning the airplane
Published: (1981)
What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering
by: Errica, Federico, et al.
Published: (2024)
Physicists Don't Know What They're Talking About When They Say 'Order'
by: Arafat Gaspar Jiménez Gaistardo
Published: (2025)
Thinking Out Loud: Do Reasoning Models Know When They're Right?
by: Zeng, Qingcheng, et al.
Published: (2025)
PARROT: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs
by: Çelebi, Yusuf, et al.
Published: (2025)
Extending Beacon to Hindi: Cultural Adaptation Drives Cross-Lingual Sycophancy
by: Sattigeri, Sarthak
Published: (2026)
Capacity-Aware Planning and Scheduling in Budget-Constrained Multi-Agent MDPs: A Meta-RL Approach
by: Vora, Manav, et al.
Published: (2024)
Knowing but Not Correcting: Routine Task Requests Suppress Factual Correction in LLMs
by: Chen, Zixuan, et al.
Published: (2026)
To Agree or To Be Right? The Grounding-Sycophancy Tradeoff in Medical Vision-Language Models
by: Aranya, OFM Riaz Rahman, et al.
Published: (2026)
Sycophancy as compositions of Atomic Psychometric Traits
by: Jain, Shreyans, et al.
Published: (2025)
Data Unlearning in Diffusion Models
by: Alberti, Silas, et al.
Published: (2025)
Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy
by: Kumarappan, Adarsh, et al.
Published: (2026)
When LLMs Learn to Be Consistently Wrong: A Multi-Model Study of Linear Representations of Synthetic Deception
by: Zolfaghari, Vahideh
Published: (2026)
It's Not Always Sycophancy: Measuring LLM Conformity as a Function of Epistemic Uncertainty
by: Guo, Kevin H., et al.
Published: (2026)
Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification
by: Zhang, Anqi, et al.
Published: (2025)
Out-of-Distribution Detection Methods Answer the Wrong Questions
by: Li, Yucen Lily, et al.
Published: (2025)
Your Assumed DAG is Wrong and Here's How To Deal With It
by: Padh, Kirtan, et al.
Published: (2025)