:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Siyuan; Zhang, Yichi; Dong, Yinpeng; Su, Hang
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.19127

Similar Items

Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models
by: Zhang, Siyuan, et al.
Published: (2026)

Mitigating Overthinking in Large Reasoning Models via Manifold Steering
by: Huang, Yao, et al.
Published: (2025)

Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
by: Zhang, Yichi, et al.
Published: (2024)

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
by: Zhang, Xiaoying, et al.
Published: (2024)

Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space
by: Huang, Yao, et al.
Published: (2025)

Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
by: Yu, Lei, et al.
Published: (2024)

Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention
by: Zhang, Yichi, et al.
Published: (2025)

Evil Geniuses: Delving into the Safety of LLM-based Agents
by: Tian, Yu, et al.
Published: (2023)

Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation
by: Zhang, Yichi, et al.
Published: (2025)

STAIR: Improving Safety Alignment with Introspective Reasoning
by: Zhang, Yichi, et al.
Published: (2025)

Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis and Interpretation
by: Dang, Renfei, et al.
Published: (2025)

Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
by: Wang, Shengyuan, et al.
Published: (2025)

BSPA: Exploring Black-box Stealthy Prompt Attacks against Image Generators
by: Tian, Yu, et al.
Published: (2024)

DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
by: Huang, Yao, et al.
Published: (2025)

RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability
by: Zhang, Yichi, et al.
Published: (2025)

KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking
by: Zhang, Jiawei, et al.
Published: (2024)

One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
by: Wang, Pengbo, et al.
Published: (2025)

Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation
by: Chang, Yurui, et al.
Published: (2025)

Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
by: Liu, Xiaoze, et al.
Published: (2024)

UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions
by: Tan, Chuanyuan, et al.
Published: (2025)

Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)

Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases
by: Li, Jiarui, et al.
Published: (2024)

PretrainRL: Alleviating Factuality Hallucination of Large Language Models at the Beginning
by: Liu, Langming, et al.
Published: (2026)

Exploring and Mitigating Fawning Hallucinations in Large Language Models
by: Shangguan, Zixuan, et al.
Published: (2025)

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification
by: Zhang, Yichi, et al.
Published: (2026)

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
by: Gu, Yuzhe, et al.
Published: (2025)

Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
by: Zhang, Yichi, et al.
Published: (2024)

Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning
by: Chaduvula, Sindhuja, et al.
Published: (2026)

JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
by: Xu, Fan, et al.
Published: (2025)

KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
by: Ren, Baochang, et al.
Published: (2025)

Rethinking Model Ensemble in Transfer-based Adversarial Attacks
by: Chen, Huanran, et al.
Published: (2023)

HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models
by: Jiang, Xinyan, et al.
Published: (2025)

Mitigating Hallucination on Hallucination in RAG via Ensemble Voting
by: Xie, Zequn, et al.
Published: (2026)

Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning
by: Wang, Jianing, et al.
Published: (2023)

On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation
by: Wen, Xueru, et al.
Published: (2024)

Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate
by: Lu, Zhixiang, et al.
Published: (2026)

Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
by: Nguyen, Hieu, et al.
Published: (2025)

On Early Detection of Hallucinations in Factual Question Answering
by: Snyder, Ben, et al.
Published: (2023)

Multilingual Knowledge Editing with Language-Agnostic Factual Neurons
by: Zhang, Xue, et al.
Published: (2024)

Mitigating Multimodal Hallucination via Phase-wise Self-reward
by: Zhang, Yu, et al.
Published: (2026)