| Main Authors: | Zhang, Xuanqi; Lee, Jieun; Joslin, Chris; Lee, Wonsook |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2503.11601 |
| _version_ | 1866917957281513472 |
|---|---|
| author | Zhang, Xuanqi; Lee, Jieun; Joslin, Chris; Lee, Wonsook |
| author_facet | Zhang, Xuanqi; Lee, Jieun; Joslin, Chris; Lee, Wonsook |
| contents | We present a novel framework for enhancing the visual fidelity and consistency of text-guided 3D Gaussian Splatting (3DGS) editing. Existing editing approaches face two critical challenges: inconsistent geometric reconstructions across multiple viewpoints, particularly in challenging camera positions, and ineffective utilization of depth information during image manipulation, resulting in over-texture artifacts and degraded object boundaries. To address these limitations, we introduce: 1) A complementary information mutual learning network that enhances depth map estimation from 3DGS, enabling precise depth-conditioned 3D editing while preserving geometric structures. 2) A wavelet consensus attention mechanism that effectively aligns latent codes during the diffusion denoising process, ensuring multi-view consistency in the edited results. Through extensive experimentation, our method demonstrates superior performance in rendering quality and view consistency compared to state-of-the-art approaches. The results validate our framework as an effective solution for text-guided editing of 3D scenes. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2503_11601 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2503.11601 |