Staff View: Neural Uncertainty Principle: A Unified View of Adversarial Fragility and LLM Hallucination :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Dong-Xiao; Lou, Hu; Zhang, Jun-Jie; Zhu, Jun; Meng, Deyu
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Information Theory; Computational Physics
Online Access:	https://arxiv.org/abs/2603.19562

_version_	1866908916712996864
author	Zhang, Dong-Xiao; Lou, Hu; Zhang, Jun-Jie; Zhu, Jun; Meng, Deyu
author_facet	Zhang, Dong-Xiao; Lou, Hu; Zhang, Jun-Jie; Zhu, Jun; Meng, Deyu
contents	Adversarial vulnerability in vision and hallucination in large language models are conventionally viewed as separate problems, each addressed with modality-specific patches. This study first reveals that they share a common geometric origin: the input and its loss gradient are conjugate observables subject to an irreducible uncertainty bound. Formalizing a Neural Uncertainty Principle (NUP) under a loss-induced state, we find that in near-bound regimes, further compression must be accompanied by increased sensitivity dispersion (adversarial fragility), while weak prompt-gradient coupling leaves generation under-constrained (hallucination). Crucially, this bound is modulated by an input-gradient correlation channel, captured by a specifically designed single-backward probe. In vision, masking highly coupled components improves robustness without costly adversarial training; in language, the same prefill-stage probe detects hallucination risk before generating any answer tokens. NUP thus turns two seemingly separate failure taxonomies into a shared uncertainty-budget view and provides a principled lens for reliability analysis. Guided by this NUP theory, we propose ConjMask (masking high-contribution input components) and LogitReg (logit-side regularization) to improve robustness without adversarial training, and use the probe as a decoding-free risk signal for LLMs, enabling hallucination detection and prompt selection. NUP thus provides a unified, practical framework for diagnosing and mitigating boundary anomalies across perception and generation tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_19562
institution	arXiv
publishDate	2026
record_format	arxiv
title	Neural Uncertainty Principle: A Unified View of Adversarial Fragility and LLM Hallucination
topic	Machine Learning; Information Theory; Computational Physics
url	https://arxiv.org/abs/2603.19562

Similar Items