Staff View: AnaMoDiff: 2D Analogical Motion Diffusion via Disentangled Denoising

Bibliographic Details
Main Authors:	Tanveer, Maham; Wang, Yizhi; Wang, Ruiqi; Zhao, Nanxuan; Mahdavi-Amiri, Ali; Zhang, Hao
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2402.03549

_version_	1866910319048130560
author	Tanveer, Maham; Wang, Yizhi; Wang, Ruiqi; Zhao, Nanxuan; Mahdavi-Amiri, Ali; Zhang, Hao
author_facet	Tanveer, Maham; Wang, Yizhi; Wang, Ruiqi; Zhao, Nanxuan; Mahdavi-Amiri, Ali; Zhang, Hao
contents	We present AnaMoDiff, a novel diffusion-based method for 2D motion analogies that is applied to raw, unannotated videos of articulated characters. Our goal is to accurately transfer motions from a 2D driving video onto a source character, with its identity, in terms of appearance and natural movement, well preserved, even when there may be significant discrepancies between the source and driving characters in their part proportions and movement speed and styles. Our diffusion model transfers the input motion via a latent optical flow (LOF) network operating in a noised latent space, which is spatially aware, efficient to process compared to the original RGB videos, and artifact-resistant through the diffusion denoising process even amid dense movements. To accomplish both motion analogy and identity preservation, we train our denoising model in a feature-disentangled manner, operating at two noise levels. While identity-revealing features of the source are learned via conventional noise injection, motion features are learned from LOF-warped videos by only injecting noise with large values, with the stipulation that motion properties involving pose and limbs are encoded by higher-level features. Experiments demonstrate that our method achieves the best trade-off between motion analogy and identity preservation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_03549
institution	arXiv
publishDate	2024
record_format	arxiv
title	AnaMoDiff: 2D Analogical Motion Diffusion via Disentangled Denoising
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2402.03549

Similar Items