Bibliographic Details
Main Authors: Wang, You; Fang, Li; Zhu, Hao; Hu, Fei; Ye, Long; Ma, Zhan
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2505.19813
_version_ 1866913859533537280
author Wang, You
Fang, Li
Zhu, Hao
Hu, Fei
Ye, Long
Ma, Zhan
contents Neural Radiance Fields (NeRF) have transformed novel view synthesis by modeling scene-specific volumetric representations directly from images. While generalizable NeRF models can generate novel views across unknown scenes by learning latent ray representations, their performance heavily depends on a large number of multi-view observations. However, with limited input views, these methods experience significant degradation in rendering quality. To address this limitation, we propose GoLF-NRT: a Global and Local feature Fusion-based Neural Rendering Transformer. GoLF-NRT enhances generalizable neural rendering from few input views by leveraging a 3D transformer with efficient sparse attention to capture global scene context. In parallel, it integrates local geometric features extracted along the epipolar line, enabling high-quality scene reconstruction from as few as 1 to 3 input views. Furthermore, we introduce an adaptive sampling strategy based on attention weights and kernel regression, improving the accuracy of transformer-based neural rendering. Extensive experiments on public datasets show that GoLF-NRT achieves state-of-the-art performance across varying numbers of input views, highlighting the effectiveness and superiority of our approach. Code is available at https://github.com/KLMAV-CUC/GoLF-NRT.
format Preprint
id arxiv_https___arxiv_org_abs_2505_19813
institution arXiv
publishDate 2025
record_format arxiv
title GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2505.19813