Bibliographic Details
Main Authors: Hagemann, Paul, Mildenberger, Sophie, Ruthotto, Lars, Steidl, Gabriele, Yang, Nicole Tianjiao
Format: Preprint
Published: 2023
Online Access: https://arxiv.org/abs/2303.04772
author Hagemann, Paul
Mildenberger, Sophie
Ruthotto, Lars
Steidl, Gabriele
Yang, Nicole Tianjiao
contents Score-based diffusion models (SBDM) have recently emerged as state-of-the-art approaches for image generation. Existing SBDMs are typically formulated in a finite-dimensional setting, where images are considered as tensors of finite size. This paper develops SBDMs in the infinite-dimensional setting, that is, we model the training data as functions supported on a rectangular domain. In addition to the quest for generating images at ever-higher resolutions, our primary motivation is to create a well-posed infinite-dimensional learning problem that we can discretize consistently on multiple resolution levels. We thereby intend to obtain diffusion models that generalize across different resolution levels and improve the efficiency of the training process. We demonstrate how to overcome two shortcomings of current SBDM approaches in the infinite-dimensional setting. First, we modify the forward process using trace class operators to ensure that the latent distribution is well-defined in the infinite-dimensional setting and derive the reverse processes for finite-dimensional approximations. Second, we illustrate that approximating the score function with an operator network is beneficial for multilevel training. After deriving the convergence of the discretization and the approximation of multilevel training, we demonstrate some practical benefits of our infinite-dimensional SBDM approach on a synthetic Gaussian mixture example, the MNIST dataset, and a dataset generated from a nonlinear 2D reaction-diffusion equation.
format Preprint
id arxiv_https___arxiv_org_abs_2303_04772
institution arXiv
publishDate 2023
record_format arxiv
title Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation
topic Machine Learning
Computer Vision and Pattern Recognition
Probability
60H10, 65D18
url https://arxiv.org/abs/2303.04772