Bibliographic Details
Main Authors: Yu, Fengcheng, Xu, Haoran, Xia, Canming, Zong, Ziyang, Tan, Guang
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2502.15438
contents Vision-based occupancy networks (VONs) provide an end-to-end solution for reconstructing 3D environments in autonomous driving. However, existing methods often suffer from temporal inconsistencies, manifesting as flickering effects that degrade temporal coherence and adversely affect downstream decision-making. While recent approaches incorporate historical information to alleviate this issue, they often incur high computational costs and may introduce misaligned or redundant features that interfere with object detection. We propose OccLinker, a novel plugin framework that can be easily integrated into existing VONs to improve performance. Our method efficiently consolidates historical static and motion cues, learns sparse latent correlations with current features through a dual cross-attention mechanism, and generates correction occupancy components to refine the base network predictions. In addition, we introduce a new temporal consistency metric to quantitatively measure flickering effects. Extensive experiments on two benchmark datasets demonstrate that our method achieves superior performance with minimal computational overhead while effectively reducing flickering artifacts.
format Preprint
id arxiv_https___arxiv_org_abs_2502_15438
institution arXiv
publishDate 2025
record_format arxiv
title Deflickering Vision-Based Occupancy Networks through Lightweight Spatio-Temporal Correlation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2502.15438
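
The abstract mentions a temporal consistency metric for measuring flickering but does not define it in this record. As a rough illustration of the idea only, the sketch below counts how often a voxel's predicted class changes between consecutive frames. The function name `flicker_rate`, the flattened-label input format, and the assumption that all frames are already aligned to a common coordinate frame are hypothetical choices made for this example; this is not the metric proposed in the paper.

```python
from typing import Sequence


def flicker_rate(frames: Sequence[Sequence[int]]) -> float:
    """Mean fraction of voxels whose class label changes between consecutive frames.

    Hypothetical illustration, not the paper's metric. Each frame is assumed to be
    a flattened voxel grid of integer class labels, already aligned to a common
    coordinate frame so that index i refers to the same location in every frame.
    """
    if len(frames) < 2:
        # With fewer than two frames there is nothing to compare.
        return 0.0

    rates = []
    for prev, curr in zip(frames, frames[1:]):
        if len(prev) != len(curr):
            raise ValueError("all frames must have the same number of voxels")
        if len(prev) == 0:
            raise ValueError("frames must contain at least one voxel")
        # Count voxels whose label differs from the previous frame.
        changed = sum(1 for a, b in zip(prev, curr) if a != b)
        rates.append(changed / len(prev))

    # Average the per-transition change rates over the whole sequence.
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # One voxel toggles off and back on: a quarter of the grid changes per step.
    sequence = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
    print(flicker_rate(sequence))
```

A naive measure like this cannot tell flicker apart from genuine scene changes such as moving objects, so a practical metric would likely need to account for motion; how the paper handles that is not stated here.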