| Main Authors: | Zhang, Siyuan; Zhang, Yichi; Dong, Yinpeng; Su, Hang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.19127 |
Similar Items
Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models
by: Zhang, Siyuan, et al.
Published: (2026)
Mitigating Overthinking in Large Reasoning Models via Manifold Steering
by: Huang, Yao, et al.
Published: (2025)
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
by: Zhang, Yichi, et al.
Published: (2024)
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
by: Zhang, Xiaoying, et al.
Published: (2024)
Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space
by: Huang, Yao, et al.
Published: (2025)
Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
by: Yu, Lei, et al.
Published: (2024)
Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention
by: Zhang, Yichi, et al.
Published: (2025)
Evil Geniuses: Delving into the Safety of LLM-based Agents
by: Tian, Yu, et al.
Published: (2023)
Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation
by: Zhang, Yichi, et al.
Published: (2025)
STAIR: Improving Safety Alignment with Introspective Reasoning
by: Zhang, Yichi, et al.
Published: (2025)
Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis and Interpretation
by: Dang, Renfei, et al.
Published: (2025)
Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
by: Wang, Shengyuan, et al.
Published: (2025)
BSPA: Exploring Black-box Stealthy Prompt Attacks against Image Generators
by: Tian, Yu, et al.
Published: (2024)
DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
by: Huang, Yao, et al.
Published: (2025)
RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability
by: Zhang, Yichi, et al.
Published: (2025)
KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking
by: Zhang, Jiawei, et al.
Published: (2024)
One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
by: Wang, Pengbo, et al.
Published: (2025)
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation
by: Chang, Yurui, et al.
Published: (2025)
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
by: Liu, Xiaoze, et al.
Published: (2024)
UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions
by: Tan, Chuanyuan, et al.
Published: (2025)
Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases
by: Li, Jiarui, et al.
Published: (2024)
PretrainRL: Alleviating Factuality Hallucination of Large Language Models at the Beginning
by: Liu, Langming, et al.
Published: (2026)
Exploring and Mitigating Fawning Hallucinations in Large Language Models
by: Shangguan, Zixuan, et al.
Published: (2025)
Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification
by: Zhang, Yichi, et al.
Published: (2026)
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
by: Gu, Yuzhe, et al.
Published: (2025)
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
by: Zhang, Yichi, et al.
Published: (2024)
Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning
by: Chaduvula, Sindhuja, et al.
Published: (2026)
JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
by: Xu, Fan, et al.
Published: (2025)
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
by: Ren, Baochang, et al.
Published: (2025)
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
by: Chen, Huanran, et al.
Published: (2023)
HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models
by: Jiang, Xinyan, et al.
Published: (2025)
Mitigating Hallucination on Hallucination in RAG via Ensemble Voting
by: Xie, Zequn, et al.
Published: (2026)
Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning
by: Wang, Jianing, et al.
Published: (2023)
On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation
by: Wen, Xueru, et al.
Published: (2024)
Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate
by: Lu, Zhixiang, et al.
Published: (2026)
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
by: Nguyen, Hieu, et al.
Published: (2025)
On Early Detection of Hallucinations in Factual Question Answering
by: Snyder, Ben, et al.
Published: (2023)
Multilingual Knowledge Editing with Language-Agnostic Factual Neurons
by: Zhang, Xue, et al.
Published: (2024)
Mitigating Multimodal Hallucination via Phase-wise Self-reward
by: Zhang, Yu, et al.
Published: (2026)