:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Fujie; Yu, Peiqi; Yi, Biao; Zhang, Baolei; Li, Tong; Liu, Zheli
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2411.04847

Similar Items

CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning
by: Yi, Biao, et al.
Published: (2025)

BadActs: A Universal Backdoor Defense in the Activation Space
by: Yi, Biao, et al.
Published: (2024)

Gradient Surgery for Safe LLM Fine-Tuning
by: Yi, Biao, et al.
Published: (2025)

Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
by: Yi, Biao, et al.
Published: (2025)

BadReasoner: Planting Tunable Overthinking Backdoors into Large Reasoning Models for Fun or Profit
by: Yi, Biao, et al.
Published: (2025)

Hallucination Detection via Internal States and Structured Reasoning Consistency in Large Language Models
by: Song, Yusheng, et al.
Published: (2025)

Detection Method for Prompt Injection by Integrating Pre-trained Model and Heuristic Feature Engineering
by: Ji, Yi, et al.
Published: (2025)

Confabulation: The Surprising Value of Large Language Model Hallucinations
by: Sui, Peiqi, et al.
Published: (2024)

Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
by: Su, Weihang, et al.
Published: (2024)

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
by: Chen, Chao, et al.
Published: (2024)

Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals
by: van Dijk, Gijs
Published: (2026)

Attention Sinks as Internal Signals for Hallucination Detection in Large Language Models
by: Binkowski, Jakub, et al.
Published: (2026)

Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models
by: Zhang, Yuji, et al.
Published: (2024)

Bolster Hallucination Detection via Prompt-Guided Data Augmentation
by: Li, Wenyun, et al.
Published: (2025)

Hallucination Detection and Evaluation of Large Language Model
by: Zhang, Chenggong, et al.
Published: (2025)

Practical Framework for Privacy-Preserving and Byzantine-robust Federated Learning
by: Zhang, Baolei, et al.
Published: (2025)

Enhancing Robustness in Large Language Models: Prompting for Mitigating the Impact of Irrelevant Information
by: Jiang, Ming, et al.
Published: (2024)

HaluNet: Learning Hallucination Risk from Internal Signals in LLM Question Answering
by: Tong, Chaodong, et al.
Published: (2025)

Active Prompting with Chain-of-Thought for Large Language Models
by: Diao, Shizhe, et al.
Published: (2023)

Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
by: Cheng, Jiahao, et al.
Published: (2025)

Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
by: Chen, Yuyan, et al.
Published: (2024)

Mitigating Prompt-Induced Hallucinations in Large Language Models via Structured Reasoning
by: Hao, Jinbo, et al.
Published: (2026)

Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations
by: Lu, Yifan, et al.
Published: (2025)

PretrainRL: Alleviating Factuality Hallucination of Large Language Models at the Beginning
by: Liu, Langming, et al.
Published: (2026)

PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning
by: Zou, Jiaru, et al.
Published: (2024)

Efficient Detection of Toxic Prompts in Large Language Models
by: Liu, Yi, et al.
Published: (2024)

Loki's Dance of Illusions: A Comprehensive Survey of Hallucination in Large Language Models
by: Li, Chaozhuo, et al.
Published: (2025)

Preference Orchestrator: Prompt-Aware Multi-Objective Alignment for Large Language Models
by: Liu, Biao, et al.
Published: (2025)

Traceback of Poisoning Attacks to Retrieval-Augmented Generation
by: Zhang, Baolei, et al.
Published: (2025)

Hallucination Detection with the Internal Layers of LLMs
by: Preiß, Martin
Published: (2025)

Alleviating Hallucinations of Large Language Models through Induced Hallucinations
by: Zhang, Yue, et al.
Published: (2023)

Calibrating Reasoning in Language Models with Internal Consistency
by: Xie, Zhihui, et al.
Published: (2024)

Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models
by: Sato, Makoto
Published: (2025)

Critical Confabulation: Can LLMs Hallucinate for Social Good?
by: Sui, Peiqi, et al.
Published: (2025)

Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding
by: Wang, Chao, et al.
Published: (2025)

HIVE: Hidden-Evidence Verification for Hallucination Detection in Diffusion Large Language Models
by: Zhao, Guoshenghui, et al.
Published: (2026)

Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
by: Huang, Yanwen, et al.
Published: (2025)

Scalable Token-Level Hallucination Detection in Large Language Models
by: Min, Rui, et al.
Published: (2026)

HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy
by: Xu, Fan, et al.
Published: (2025)

Prompt-Response Semantic Divergence Metrics for Faithfulness Hallucination and Misalignment Detection in Large Language Models
by: Halperin, Igor
Published: (2025)