Bibliographic Details
Main Authors: Serrà, Joan; Goswami, Dipam; Morreale, Fabio; Liao, Wei-Hsiang; Mitsufuji, Yuki
Format: Preprint
Published: 2026
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2605.17938
author Serrà, Joan
Goswami, Dipam
Morreale, Fabio
Liao, Wei-Hsiang
Mitsufuji, Yuki
contents Training data attribution (TDA) should enable generative model interpretability and foster a variety of related downstream tasks. Nonetheless, current TDA approaches lack reliability and robustness, preventing their adoption in real-world setups. In this paper, we take a decisive step towards more reliable and robust TDA for diffusion models. We propose to perform TDA with mirrored unlearning and noise-consistent skew (MUCS). The idea is to fine-tune a second model with bounded mirrored gradient ascent, and to measure the normalized skew of this model with respect to the original one using consistent noise samples. We show that, while being conceptually simple and generic, MUCS systematically outperforms existing methods on three different datasets by a large margin. We additionally study the effect that core design choices have on final performance, and analyze novel aspects regarding the overlap of influential instances across generated items and the potential of ensembling TDA approaches. We believe that our findings may have broader implications for more general unlearning setups, as well as for tasks requiring the comparison of diffusion losses.
format Preprint
id arxiv_https___arxiv_org_abs_2605_17938
institution arXiv
publishDate 2026
record_format arxiv
title Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2605.17938