Bibliographic Details
Main Authors: Jiakun Li, Qingqing Wang, Hongbin Dong, Kexin Li
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2412.01859
contents Current state-of-the-art vision models often utilize feature pyramids to extract multi-scale information, with the Feature Pyramid Network (FPN) being one of the most widely used classic architectures. However, traditional FPNs and their variants (e.g., AUGFPN, PAFPN) fail to fully address spatial misalignment on a global scale, leading to suboptimal performance in high-precision localization of objects. In this paper, we propose a novel Bidirectional Alignment Feature Pyramid Network (BAFPN), which aligns misaligned features globally through a Spatial Feature Alignment Module (SPAM) during the bottom-up information propagation phase. Subsequently, it further mitigates aliasing effects caused by cross-scale feature fusion via a fine-grained Semantic Alignment Module (SEAM) in the top-down phase. On the DOTAv1.5 dataset, BAFPN improves the baseline model's AP75, AP50, and mAP by 1.68%, 1.45%, and 1.34%, respectively. Additionally, BAFPN demonstrates significant performance gains when applied to various other advanced detectors.
format Preprint
id arxiv_https___arxiv_org_abs_2412_01859
institution arXiv
publishDate 2024
record_format arxiv
title BAFPN: Bidirectional alignment of features to improve localization accuracy
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2412.01859