| Main Authors: | Zhao, Yunpu, Zhang, Rui, Xiao, Junbin, Ke, Changxin, Hou, Ruibo, Hao, Yifan, Li, Ling |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.11261 |
Similar Items
Object-Level Verbalized Confidence Calibration in Vision-Language Models via Semantic Perturbation
by: Zhao, Yunpu, et al.
Published: (2025)
Sycophancy in Large Language Models: Causes and Mitigations
by: Malmqvist, Lars
Published: (2024)
Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy in Large Language Models
by: Pandey, Sanskar, et al.
Published: (2025)
Assessing and Understanding Creativity in Large Language Models
by: Zhao, Yunpu, et al.
Published: (2024)
Not Your Typical Sycophant: The Elusive Nature of Sycophancy in Large Language Models
by: Natan, Shahar Ben, et al.
Published: (2026)
Self-Blinding and Counterfactual Self-Simulation Mitigate Biases and Sycophancy in Large Language Models
by: Christian, Brian, et al.
Published: (2026)
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
by: Denison, Carson, et al.
Published: (2024)
Benchmarking and Mitigating Sycophancy in Medical Vision Language Models
by: Xu, Juangui, et al.
Published: (2025)
Accounting for Sycophancy in Language Model Uncertainty Estimation
by: Sicilia, Anthony, et al.
Published: (2024)
Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation
by: Zhang, Yichi, et al.
Published: (2025)
Towards Understanding Sycophancy in Language Models
by: Sharma, Mrinank, et al.
Published: (2023)
MONICA: Real-Time Monitoring and Calibration of Chain-of-Thought Sycophancy in Large Reasoning Models
by: Hu, Jingyu, et al.
Published: (2025)
Moral Sycophancy in Vision Language Models
by: Rabby, Shadman, et al.
Published: (2026)
Improving LLM Reasoning through Interpretable Role-Playing Steering
by: Wang, Anyi, et al.
Published: (2025)
Internal Reasoning vs. External Control: A Thermodynamic Analysis of Sycophancy in Large Language Models
by: Chang, Edward Y.
Published: (2025)
BASIL: Bayesian Assessment of Sycophancy in LLMs
by: Atwell, Katherine, et al.
Published: (2025)
Sycophancy Hides Linearly in the Attention Heads
by: Genadi, Rifo, et al.
Published: (2026)
Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models
by: Shah, Arya, et al.
Published: (2026)
Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy
by: Jiang, Eric Hanchen, et al.
Published: (2025)
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
by: Liu, Qin, et al.
Published: (2024)
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models
by: Jin, Ruihan, et al.
Published: (2024)
A Systematic Analysis of Biases in Large Language Models
by: Zhang, Xulang, et al.
Published: (2025)
Can Vision-Language Models Solve Visual Math Equations?
by: Choudhury, Monjoy Narayan, et al.
Published: (2025)
Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference
by: Shen, Ke, et al.
Published: (2024)
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
by: Van, Minh-Hao, et al.
Published: (2025)
TimeSense: Making Large Language Models Proficient in Time-Series Analysis
by: Zhang, Zhirui, et al.
Published: (2025)
A Systematic Study of Compositional Syntactic Transformer Language Models
by: Zhao, Yida, et al.
Published: (2025)
GenTKG: Generative Forecasting on Temporal Knowledge Graph with Large Language Models
by: Liao, Ruotong, et al.
Published: (2023)
TDBench: A Benchmark for Top-Down Image Understanding with Reliability Analysis of Vision-Language Models
by: Hou, Kaiyuan, et al.
Published: (2025)
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
by: Li, Mingda, et al.
Published: (2024)
Augmented Vision-Language Models: A Systematic Review
by: Davis, Anthony C, et al.
Published: (2025)
A Markov Categorical Framework for Language Modeling
by: Zhang, Yifan
Published: (2025)
High-Quality Data Augmentation for Low-Resource NMT: Combining a Translation Memory, a GAN Generator, and Filtering
by: Liu, Hengjie, et al.
Published: (2024)
Reasoning Isn't Enough: Examining Truth-Bias and Sycophancy in LLMs
by: Barkett, Emilio, et al.
Published: (2025)
CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process
by: Bi, Jinhe, et al.
Published: (2025)
TrueBrief: Faithful Summarization through Small Language Models
by: Lakara, Kumud, et al.
Published: (2025)
CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models
by: Le, Yifan, et al.
Published: (2026)
dInfer: An Efficient Inference Framework for Diffusion Language Models
by: Ma, Yuxin, et al.
Published: (2025)
From Sycophancy to Sensemaking: Premise Governance for Human-AI Decision Making
by: Jain, Raunak
Published: (2026)
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
by: Abbasli, Toghrul, et al.
Published: (2025)