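Staff View: MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement :: Library Catalog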

Bibliographic Details
Main Authors:	He, Xu; Li, Xiaoyu; Kang, Di; Ye, Jiangnan; Zhang, Chaopeng; Chen, Liyang; Gao, Xiangjun; Zhang, Han; Wu, Zhiyong; Zhuang, Haolin
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.14211

_version_	1866910576471441408
author	He, Xu; Li, Xiaoyu; Kang, Di; Ye, Jiangnan; Zhang, Chaopeng; Chen, Liyang; Gao, Xiangjun; Zhang, Han; Wu, Zhiyong; Zhuang, Haolin
author_facet	He, Xu; Li, Xiaoyu; Kang, Di; Ye, Jiangnan; Zhang, Chaopeng; Chen, Liyang; Gao, Xiangjun; Zhang, Han; Wu, Zhiyong; Zhuang, Haolin
contents	Existing works in single-image human reconstruction suffer from weak generalizability due to insufficient training data, or from 3D inconsistencies due to a lack of comprehensive multi-view knowledge. In this paper, we introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel view images from a single reference image. At its core, we leverage a pre-trained 2D diffusion model as the generative prior for generalizability, with the parametric SMPL-X model as the 3D body prior to promote 3D awareness. To tackle the critical challenge of maintaining consistency while achieving dense multi-view generation for improved 3D human reconstruction, we first introduce hybrid multi-view attention to facilitate both efficient and thorough information interchange across different views. Additionally, we present a geometry-aware dual branch that generates images in the RGB and normal domains concurrently, further enhancing consistency via geometry cues. Finally, to address ill-shaped results arising from inaccurate SMPL-X estimates that conflict with the reference image, we propose a novel iterative refinement strategy, which progressively optimizes SMPL-X accuracy while enhancing the quality and consistency of the generated multi-view images. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in both novel view synthesis and subsequent 3D human reconstruction tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_14211
institution	arXiv
publishDate	2024
record_format	arxiv
title	MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2408.14211

Similar Items