Staff View: Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation :: Library Catalog

Bibliographic Details
Main Authors:	Oorloff, Trevine; Yacoob, Yaser; Shrivastava, Abhinav
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.16872

_version_	1866929728695304192
author	Oorloff, Trevine; Yacoob, Yaser; Shrivastava, Abhinav
author_facet	Oorloff, Trevine; Yacoob, Yaser; Shrivastava, Abhinav
contents	Diffusion models, while increasingly adept at generating realistic images, are notably hindered by hallucinations -- unrealistic or incorrect features inconsistent with the trained data distribution. In this work, we propose Adaptive Attention Modulation (AAM), a novel approach to mitigate hallucinations by analyzing and modulating the self-attention mechanism in diffusion models. We hypothesize that self-attention during early denoising steps may inadvertently amplify or suppress features, contributing to hallucinations. To counter this, AAM introduces a temperature scaling mechanism within the softmax operation of the self-attention layers, dynamically modulating the attention distribution during inference. Additionally, AAM employs a masked perturbation technique to disrupt early-stage noise that may otherwise propagate into later stages as hallucinations. Extensive experiments demonstrate that AAM effectively reduces hallucinatory artifacts, enhancing both the fidelity and reliability of generated images. For instance, the proposed approach improves the FID score by 20.8% and reduces the percentage of hallucinated images by 12.9% (in absolute terms) on the Hands dataset.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_16872
institution	arXiv
publishDate	2025
record_format	arxiv
title	Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2502.16872
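
Illustrative note (not part of the catalog record): the abstract describes a temperature scaling mechanism inside the softmax of the self-attention layers. The following is a minimal sketch of that general idea in plain Python, assuming the common form softmax(scores / T). It is not the authors' implementation: the function names and the fixed `temperature` argument are hypothetical, and the abstract does not say how AAM selects or adapts the temperature, nor how its masked perturbation works, so neither is modeled here.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Return softmax(scores / temperature) as a list of weights.

    Assumption: temperature enters as a plain divisor of the logits.
    T > 1 flattens the distribution, T < 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values, temperature=1.0):
    """Single-head scaled dot-product attention with a softmax temperature.

    queries, keys, values are lists of equal-length float vectors.
    Hypothetical helper for illustration only.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Standard 1/sqrt(d) scaling of the query-key dot products.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax_with_temperature(scores, temperature)
        # Each output row is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def entropy(probabilities):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

In this sketch, raising `temperature` spreads attention more evenly across positions and lowering it concentrates attention on the highest-scoring positions; calling `entropy` on the returned weights is one simple way to observe that effect.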

Similar Items