Bibliographic Details
Main Authors: Tanveer, Maham, Wang, Yizhi, Wang, Ruiqi, Zhao, Nanxuan, Mahdavi-Amiri, Ali, Zhang, Hao
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2402.03549
_version_ 1866910319048130560
author Tanveer, Maham
Wang, Yizhi
Wang, Ruiqi
Zhao, Nanxuan
Mahdavi-Amiri, Ali
Zhang, Hao
contents We present AnaMoDiff, a novel diffusion-based method for 2D motion analogies that is applied to raw, unannotated videos of articulated characters. Our goal is to accurately transfer motions from a 2D driving video onto a source character, with its identity, in terms of appearance and natural movement, well preserved, even when there may be significant discrepancies between the source and driving characters in their part proportions and movement speed and styles. Our diffusion model transfers the input motion via a latent optical flow (LOF) network operating in a noised latent space, which is spatially aware, efficient to process compared to the original RGB videos, and artifact-resistant through the diffusion denoising process even amid dense movements. To accomplish both motion analogy and identity preservation, we train our denoising model in a feature-disentangled manner, operating at two noise levels. While identity-revealing features of the source are learned via conventional noise injection, motion features are learned from LOF-warped videos by only injecting noise with large values, with the stipulation that motion properties involving pose and limbs are encoded by higher-level features. Experiments demonstrate that our method achieves the best trade-off between motion analogy and identity preservation.
format Preprint
id arxiv_https___arxiv_org_abs_2402_03549
institution arXiv
publishDate 2024
record_format arxiv
title AnaMoDiff: 2D Analogical Motion Diffusion via Disentangled Denoising
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2402.03549