Staff View: Image Generation with a Sphere Encoder :: Library Catalog

Bibliographic Details
Main Authors:	Yue, Kaiyu; Jia, Menglin; Hou, Ji; Goldstein, Tom
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.15030

_version_	1866914333186850816
author	Yue, Kaiyu; Jia, Menglin; Hou, Ji; Goldstein, Tom
author_facet	Yue, Kaiyu; Jia, Menglin; Hou, Ji; Goldstein, Tom
contents	We introduce the Sphere Encoder, an efficient generative framework capable of producing images in a single forward pass and competing with many-step diffusion models using fewer than five steps. Our approach works by learning an encoder that maps natural images uniformly onto a spherical latent space, and a decoder that maps random latent vectors back to the image space. Trained solely through image reconstruction losses, the model generates an image by simply decoding a random point on the sphere. Our architecture naturally supports conditional generation, and looping the encoder/decoder a few times can further enhance image quality. Across several datasets, the Sphere Encoder approach yields performance competitive with state-of-the-art diffusion models, but at a small fraction of the inference cost. The project page is available at https://sphere-encoder.github.io.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_15030
institution	arXiv
publishDate	2026
record_format	arxiv
title	Image Generation with a Sphere Encoder
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2602.15030
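
The sampling procedure described in the contents field (decode a random point on the sphere, then optionally loop the encoder/decoder) can be illustrated with a minimal sketch. This is not the authors' implementation: `sample_on_sphere`, `generate`, and the `encoder`/`decoder` callables are hypothetical names introduced here, and the latent dimension and number of loops are arbitrary. The only technique the sketch relies on is the standard fact that normalizing an isotropic Gaussian vector yields a point distributed uniformly on the unit sphere.

```python
import math
import random


def sample_on_sphere(dim, rng):
    """Draw a point uniformly from the unit sphere in `dim` dimensions."""
    # Normalizing an isotropic Gaussian vector gives a uniformly
    # distributed direction on the sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def generate(decoder, encoder, dim, loops=0, rng=None):
    """Decode a random spherical latent, then optionally refine it.

    `decoder` and `encoder` are hypothetical stand-ins for trained
    networks (assumptions, not the paper's API). Each extra loop
    re-encodes the current image and decodes it again.
    """
    rng = rng or random.Random()
    z = sample_on_sphere(dim, rng)
    image = decoder(z)  # single forward pass
    for _ in range(loops):
        z = encoder(image)
        image = decoder(z)
    return image
```

With `loops=0` the sketch performs one decoder pass; with `loops=k` it performs `k + 1` decoder passes and `k` encoder passes, which is one plausible reading of "looping the encoder/decoder a few times".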

Similar Items