Bibliographic Details
Main Authors: Tian, Chengchang; Ma, Jianwei; Huang, Yan; Chen, Zhanye; Wei, Honghao; Zhang, Hui; Hong, Wei
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2507.18237
_version_ 1866916962946252800
author Tian, Chengchang
Ma, Jianwei
Huang, Yan
Chen, Zhanye
Wei, Honghao
Zhang, Hui
Hong, Wei
contents Feature-level fusion shows promise in collaborative perception (CP) by offering a balanced trade-off between performance and communication bandwidth. However, its effectiveness critically relies on input feature quality. Acquiring high-quality features is hindered by domain gaps arising from hardware diversity and deployment conditions, as well as temporal misalignment caused by transmission delays. These challenges degrade feature quality, with cumulative effects throughout the collaborative network. In this paper, we present the Domain-And-Time Alignment (DATA) network, designed to systematically align features while maximizing their semantic representations for fusion. Specifically, we propose a Consistency-preserving Domain Alignment Module (CDAM) that reduces domain gaps through proximal-region hierarchical downsampling and an observability-constrained discriminator. We further propose a Progressive Temporal Alignment Module (PTAM) to handle transmission delays via multi-scale motion modeling and two-stage compensation. Building upon the aligned features, an Instance-focused Feature Aggregation Module (IFAM) is developed to enhance semantic representations. Extensive experiments demonstrate that DATA achieves state-of-the-art performance on three typical datasets and remains robust under severe communication delays and pose errors. The code will be released at https://github.com/ChengchangTian/DATA.
format Preprint
id arxiv_https___arxiv_org_abs_2507_18237
institution arXiv
publishDate 2025
record_format arxiv
title DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2507.18237