Staff View: What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models :: Library Catalog

Bibliographic Details
Main Authors:	Humayun, Ahmed Imtiaz; Amara, Ibtihel; Vasconcelos, Cristina; Ramachandran, Deepak; Schumann, Candice; He, Junfeng; Heller, Katherine; Farnadi, Golnoosh; Rostamzadeh, Negar; Havaei, Mohammad
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.08307

_version_	1866913680435707904
author	Humayun, Ahmed Imtiaz; Amara, Ibtihel; Vasconcelos, Cristina; Ramachandran, Deepak; Schumann, Candice; He, Junfeng; Heller, Katherine; Farnadi, Golnoosh; Rostamzadeh, Negar; Havaei, Mohammad
author_facet	Humayun, Ahmed Imtiaz; Amara, Ibtihel; Vasconcelos, Cristina; Ramachandran, Deepak; Schumann, Candice; He, Junfeng; Heller, Katherine; Farnadi, Golnoosh; Rostamzadeh, Negar; Havaei, Mohammad
contents	Deep generative models are frequently used to learn continuous representations of complex data distributions from a finite number of samples. For any generative model, including pre-trained foundation models with diffusion or Transformer architectures, generation performance can vary significantly across the learned data manifold. In this paper we study the local geometry of the learned manifold and its relationship to generation outcomes for a wide range of generative models, including DDPM, Diffusion Transformer (DiT), and Stable Diffusion 1.4. Building on the theory of continuous piecewise-linear (CPWL) generators, we characterize the local geometry in terms of three geometric descriptors: scaling ($ψ$), rank ($ν$), and complexity/un-smoothness ($δ$). We provide quantitative and qualitative evidence showing that, for a given latent-image pair, the local descriptors are indicative of generation aesthetics, diversity, and memorization by the generative model. Finally, we demonstrate that by training a reward model on the local scaling for Stable Diffusion, we can self-improve both generation aesthetics and diversity using 'geometry reward'-based guidance during denoising.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_08307
institution	arXiv
publishDate	2024
record_format	arxiv
title	What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
topic	Machine Learning; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2408.08307

Similar Items