Bibliographic Details
Main Authors: Song, Bingjie; Huang, Xin; Xie, Ruting; Wang, Xue; Wang, Qing
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2412.03571
author Song, Bingjie
Huang, Xin
Xie, Ruting
Wang, Xue
Wang, Qing
contents We present Style3D, a novel approach for generating stylized 3D objects from a content image and a style image. Unlike most previous methods that require case- or style-specific training, Style3D supports instant 3D object stylization. Our key insight is that 3D object stylization can be decomposed into two interconnected processes: multi-view dual-feature alignment and sparse-view spatial reconstruction. We introduce MultiFusion Attention, an attention-guided technique to achieve multi-view stylization from the content-style pair. Specifically, the query features from the content image preserve geometric consistency across multiple views, while the key and value features from the style image are used to guide the stylistic transfer. This dual-feature alignment ensures that spatial coherence and stylistic fidelity are maintained across multi-view images. Finally, a large 3D reconstruction model is introduced to generate coherent stylized 3D objects. By establishing an interplay between structural and stylistic features across multiple views, our approach enables a holistic 3D stylization process. Extensive experiments demonstrate that Style3D offers a more flexible and scalable solution for generating style-consistent 3D assets, surpassing existing methods in both computational efficiency and visual quality.
format Preprint
id arxiv_https___arxiv_org_abs_2412_03571
institution arXiv
publishDate 2024
record_format arxiv
title Style3D: Attention-guided Multi-view Style Transfer for 3D Object Generation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2412.03571
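
The abstract describes attention in which query features come from the content image while key and value features come from the style image. The sketch below illustrates that general idea with plain scaled dot-product cross-attention applied independently to each view. It is not the authors' MultiFusion Attention: the function names (`softmax`, `cross_attention`, `stylize_views`), the feature shapes, and the per-view loop are all assumptions made for illustration, and real features would come from a learned encoder rather than hand-written lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(content_queries, style_keys, style_values):
    """Scaled dot-product attention with queries from content, keys/values from style.

    content_queries: list of query vectors (one per content token).
    style_keys:      list of key vectors (one per style token).
    style_values:    list of value vectors (one per style token).
    Returns one output vector per content token.
    """
    scale = 1.0 / math.sqrt(len(style_keys[0]))
    value_dim = len(style_values[0])
    outputs = []
    for query in content_queries:
        # Similarity of this content token to every style token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in style_keys]
        weights = softmax(scores)
        # Output is a weighted average of style values.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, style_values))
            for j in range(value_dim)
        ])
    return outputs


def stylize_views(per_view_queries, style_keys, style_values):
    """Hypothetical helper: apply the same style keys/values to every view."""
    return [cross_attention(queries, style_keys, style_values) for queries in per_view_queries]


if __name__ == "__main__":
    # Two views with toy 2-D content queries; two style tokens with 3-D values.
    views = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
    for view_index, tokens in enumerate(stylize_views(views, keys, values)):
        print(view_index, tokens)
```

Because every view attends to the same style keys and values, each output token is a weighted average of the same style features, which is one simple way to keep the style consistent across views while the queries carry view-specific structure.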