Staff View: Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew :: Library Catalog

Bibliographic Details
Main Authors:	Serrà, Joan; Goswami, Dipam; Morreale, Fabio; Liao, Wei-Hsiang; Mitsufuji, Yuki
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.17938

_version_	1866914577347772416
author	Serrà, Joan Goswami, Dipam Morreale, Fabio Liao, Wei-Hsiang Mitsufuji, Yuki
author_facet	Serrà, Joan Goswami, Dipam Morreale, Fabio Liao, Wei-Hsiang Mitsufuji, Yuki
contents	Training data attribution (TDA) should enable generative model interpretability and foster a variety of related downstream tasks. Nonetheless, current TDA approaches lack reliability and robustness, preventing their adoption in real-world setups. In this paper, we take a decisive step towards more reliable and robust TDA for diffusion models. We propose to perform TDA with mirrored unlearning and noise-consistent skew (MUCS). The idea is to fine-tune a second model with bounded mirrored gradient ascent, and to measure the normalized skew of this model with respect to the original one using consistent noise samples. We show that, while being conceptually simple and generic, MUCS systematically outperforms existing methods on three different datasets by a large margin. We additionally study the effect that core design choices have on final performance, and analyze novel aspects regarding the overlap of influential instances across generated items and the potential of ensembling TDA approaches. We believe that our findings may have broader implications for more general unlearning setups, as well as for tasks requiring the comparison of diffusion losses.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_17938
institution	arXiv
publishDate	2026
record_format	arxiv
title	Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew
topic	Machine Learning Artificial Intelligence
url	https://arxiv.org/abs/2605.17938

Similar Items