Bibliographic Details
Main Authors: Shah, Kulin; Kalavasis, Alkis; Klivans, Adam R.; Daras, Giannis
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2502.21278
author Shah, Kulin
Kalavasis, Alkis
Klivans, Adam R.
Daras, Giannis
contents There is strong empirical evidence that the state-of-the-art diffusion modeling paradigm leads to models that memorize the training set, especially when the training set is small. Prior methods to mitigate the memorization problem often lead to a decrease in image quality. Is it possible to obtain strong and creative generative models, i.e., models that achieve high generation quality and low memorization? Despite the current pessimistic landscape of results, we make significant progress in pushing the trade-off between fidelity and memorization. We first provide theoretical evidence that memorization in diffusion models is only necessary for denoising problems at low noise scales (usually used in generating high-frequency details). Using this theoretical insight, we propose a simple, principled method to train the diffusion models using noisy data at large noise scales. We show that our method significantly reduces memorization without decreasing the image quality, for both text-conditional and unconditional models and for a variety of data availability settings.
format Preprint
id arxiv_https___arxiv_org_abs_2502_21278
institution arXiv
publishDate 2025
record_format arxiv
title Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion
topic Machine Learning
url https://arxiv.org/abs/2502.21278