MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	Burgert, Ryan; Xu, Yuancheng; Xian, Wenqi; Pilarski, Oliver; Clausen, Pascal; He, Mingming; Ma, Li; Deng, Yitong; Li, Lingxiao; Mousavi, Mohsen; Ryoo, Michael; Debevec, Paul; Yu, Ning
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2501.08331

_version_	1866916882856017920
author	Burgert, Ryan; Xu, Yuancheng; Xian, Wenqi; Pilarski, Oliver; Clausen, Pascal; He, Mingming; Ma, Li; Deng, Yitong; Li, Lingxiao; Mousavi, Mohsen; Ryoo, Michael; Debevec, Paul; Yu, Ning
author_facet	Burgert, Ryan; Xu, Yuancheng; Xian, Wenqi; Pilarski, Oliver; Clausen, Pascal; He, Mingming; Ma, Li; Deng, Yitong; Li, Lingxiao; Mousavi, Mohsen; Ryoo, Michael; Debevec, Paul; Yu, Ning
contents	Generative modeling aims to transform random noise into structured outputs. In this work, we enhance video diffusion models by allowing motion control via structured latent noise sampling. This is achieved by just a change in data: we pre-process training videos to yield structured noise. Consequently, our method is agnostic to diffusion model design, requiring no changes to model architectures or training pipelines. Specifically, we propose a novel noise warping algorithm, fast enough to run in real time, that replaces random temporal Gaussianity with correlated warped noise derived from optical flow fields, while preserving the spatial Gaussianity. The efficiency of our algorithm enables us to fine-tune modern video diffusion base models using warped noise with minimal overhead, and provide a one-stop solution for a wide range of user-friendly motion control: local object motion control, global camera movement control, and motion transfer. The harmonization between temporal coherence and spatial Gaussianity in our warped noise leads to effective motion control while maintaining per-frame pixel quality. Extensive experiments and user studies demonstrate the advantages of our method, making it a robust and scalable approach for controlling motion in video diffusion models. Video results are available on our webpage: https://eyeline-labs.github.io/Go-with-the-Flow. Source code and model checkpoints are available on GitHub: https://github.com/Eyeline-Labs/Go-with-the-Flow.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_08331
institution	arXiv
publishDate	2025
record_format	arxiv
title	Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2501.08331

Similar Items