Bibliographic Details
Main Authors: Zhang, Zhongxuan; Zeng, Bi; Ni, Xinyu; Du, Yimin
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2501.03616
author Zhang, Zhongxuan
Zeng, Bi
Ni, Xinyu
Du, Yimin
contents RGB-T tracking leverages the complementary strengths of the RGB and thermal infrared (TIR) modalities to address challenging scenarios such as low illumination and adverse weather. However, existing methods often fail to integrate temporal information effectively or to perform efficient cross-modal interaction, which constrains their adaptability to dynamic targets. In this paper, we propose BTMTrack, a novel framework for RGB-T tracking. The core of our approach lies in the dual-template backbone network and the Temporal-Modal Candidate Elimination (TMCE) strategy. The dual-template backbone integrates temporal information, while the TMCE strategy focuses the model on target-relevant tokens by evaluating temporal and modal correlations, reducing computational overhead and suppressing irrelevant background noise. Building upon this foundation, we propose the Temporal Dual Template Bridging (TDTB) module, which facilitates precise cross-modal fusion through dynamically filtered tokens. This approach further strengthens the interaction between the templates and the search region. Extensive experiments conducted on three benchmark datasets demonstrate the effectiveness of BTMTrack. Our method achieves state-of-the-art performance, with a 72.3% precision rate on the LasHeR test set and competitive results on the RGBT210 and RGBT234 datasets.
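The token-filtering idea behind TMCE, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the weighted-sum scoring rule, the `alpha` weight, and the `keep_ratio` parameter are all assumptions made for illustration. The record only states that tokens are kept or discarded by evaluating temporal and modal correlations.

```python
import math
from typing import List


def eliminate_candidates(
    temporal_scores: List[float],
    modal_scores: List[float],
    keep_ratio: float = 0.7,
    alpha: float = 0.5,
) -> List[int]:
    """Return the sorted indices of search-region tokens to keep.

    Hypothetical simplification: each token has one temporal
    correlation score (similarity to the templates over time) and one
    modal correlation score (agreement between RGB and TIR). The two
    are blended with weight `alpha`, and the top `keep_ratio` fraction
    of tokens is retained. None of these choices come from the paper.
    """
    if len(temporal_scores) != len(modal_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n_tokens = len(temporal_scores)
    if n_tokens == 0:
        return []

    # Blend the two correlation scores into one relevance value per token.
    combined = [
        alpha * t + (1.0 - alpha) * m
        for t, m in zip(temporal_scores, modal_scores)
    ]

    # Keep at least one token so later stages always have some input.
    n_keep = max(1, math.ceil(keep_ratio * n_tokens))

    # Rank tokens by relevance, keep the best, restore original order.
    ranked = sorted(range(n_tokens), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:n_keep])


kept = eliminate_candidates(
    temporal_scores=[0.9, 0.1, 0.5, 0.2],
    modal_scores=[0.8, 0.2, 0.1, 0.7],
    keep_ratio=0.5,
)
print(kept)
```

In this sketch, discarding low-scoring tokens early is what would reduce the number of tokens passed to later layers; how BTMTrack actually computes and applies the scores is described in the preprint itself.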
format Preprint
id arxiv_https___arxiv_org_abs_2501_03616
institution arXiv
publishDate 2025
record_format arxiv
title BTMTrack: Robust RGB-T Tracking via Dual-template Bridging and Temporal-Modal Candidate Elimination
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2501.03616