Staff View: DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception :: Library Catalog

Bibliographic Details
Main Authors:	Tian, Chengchang; Ma, Jianwei; Huang, Yan; Chen, Zhanye; Wei, Honghao; Zhang, Hui; Hong, Wei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.18237

_version_	1866916962946252800
author	Tian, Chengchang; Ma, Jianwei; Huang, Yan; Chen, Zhanye; Wei, Honghao; Zhang, Hui; Hong, Wei
author_facet	Tian, Chengchang; Ma, Jianwei; Huang, Yan; Chen, Zhanye; Wei, Honghao; Zhang, Hui; Hong, Wei
contents	Feature-level fusion shows promise in collaborative perception (CP) because it balances performance against communication bandwidth. However, its effectiveness depends critically on input feature quality. Acquiring high-quality features is hindered by domain gaps arising from hardware diversity and deployment conditions, and by temporal misalignment arising from transmission delays. These challenges degrade feature quality, with cumulative effects throughout the collaborative network. In this paper, we present the Domain-And-Time Alignment (DATA) network, designed to systematically align features while maximizing their semantic representations for fusion. Specifically, we propose a Consistency-preserving Domain Alignment Module (CDAM) that reduces domain gaps through proximal-region hierarchical downsampling and an observability-constrained discriminator. We further propose a Progressive Temporal Alignment Module (PTAM) to handle transmission delays via multi-scale motion modeling and two-stage compensation. Building upon the aligned features, an Instance-focused Feature Aggregation Module (IFAM) is developed to enhance semantic representations. Extensive experiments demonstrate that DATA achieves state-of-the-art performance on three typical datasets and remains robust under severe communication delays and pose errors. The code will be released at https://github.com/ChengchangTian/DATA.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_18237
institution	arXiv
publishDate	2025
record_format	arxiv
title	DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2507.18237