| Main Authors: | Lu, Andong, Wen, Mai, Wang, Jinhu, Guo, Yuanzhi, Li, Chenglong, Tang, Jin, Luo, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.11218 |
Similar Items
Breaking Shallow Limits: Task-Driven Pixel Fusion for Gap-free RGBT Tracking
by: Lu, Andong, et al.
Published: (2025)
RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba
by: Lu, Andong, et al.
Published: (2024)
AFter: Attention-based Fusion Router for RGBT Tracking
by: Lu, Andong, et al.
Published: (2024)
Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks
by: Lu, Andong, et al.
Published: (2023)
Transformer RGBT Tracking with Spatio-Temporal Multimodal Tokens
by: Sun, Dengdi, et al.
Published: (2024)
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation
by: Lu, Andong, et al.
Published: (2024)
Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning
by: Lu, Andong, et al.
Published: (2023)
Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework
by: Zhao, Zhicheng, et al.
Published: (2025)
Cross-modulated Attention Transformer for RGBT Tracking
by: Xiao, Yun, et al.
Published: (2024)
Vehicle-centric Perception via Multimodal Structured Pre-training
by: Wu, Wentao, et al.
Published: (2025)
Dynamic Disentangled Fusion Network for RGBT Tracking
by: Li, Chenglong, et al.
Published: (2024)
ICPL-ReID: Identity-Conditional Prompt Learning for Multi-Spectral Object Re-Identification
by: Li, Shihao, et al.
Published: (2025)
UniModel: A Visual-Only Framework for Unified Multimodal Understanding and Generation
by: Zhang, Chi, et al.
Published: (2025)
Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach
by: Zhu, Yabin, et al.
Published: (2024)
Sparse-Dense Mixture of Experts Adapter for Multi-Modal Tracking
by: Zhu, Yabin, et al.
Published: (2026)
DeTrack: A Benchmark and Altitude-Aware Dual World Model for Drone-embodied Tracking
by: Hu, Guyue, et al.
Published: (2026)
Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking
by: Xu, Long, et al.
Published: (2025)
Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection
by: Wu, Wentao, et al.
Published: (2025)
Decoupled Hierarchical Distillation for Multimodal Emotion Recognition
by: Li, Yong, et al.
Published: (2026)
GLAD: Generative Language-Assisted Visual Tracking for Low-Semantic Templates
by: Luo, Xingyu, et al.
Published: (2026)
Unified Multimodal Visual Tracking with Dual Mixture-of-Experts
by: Hong, Lingyi, et al.
Published: (2026)
UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking
by: Ren, Qionglin, et al.
Published: (2025)
Omni Survey for Multimodality Analysis in Visual Object Tracking
by: Tang, Zhangyong, et al.
Published: (2025)
Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
by: Wang, Shiao, et al.
Published: (2025)
SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing
by: Liang, Sen, et al.
Published: (2026)
Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
by: Ni, Ziqi, et al.
Published: (2025)
Reward-Aware Trajectory Shaping for Few-step Visual Generation
by: Li, Rui, et al.
Published: (2026)
Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering
by: Zhao, Zhicheng, et al.
Published: (2024)
Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery
by: Zhao, Zhicheng, et al.
Published: (2025)
Semantic Change Detection of Roads and Bridges: A Fine-grained Dataset and Multimodal Frequency-driven Detector
by: Shu, Qingling, et al.
Published: (2025)
CM3AE: A Unified RGB Frame and Event-Voxel/-Frame Pre-training Framework
by: Wu, Wentao, et al.
Published: (2025)
Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection
by: Wang, Kunpeng, et al.
Published: (2024)
Alignment-Free RGBT Salient Object Detection: Semantics-guided Asymmetric Correlation Network and A Unified Benchmark
by: Wang, Kunpeng, et al.
Published: (2024)
Unified-modal Salient Object Detection via Adaptive Prompt Learning
by: Wang, Kunpeng, et al.
Published: (2023)
Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance
by: Wang, Kunpeng, et al.
Published: (2024)
Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network
by: Wang, Kunpeng, et al.
Published: (2024)
Compositional-Degradation UAV Image Restoration: Conditional Decoupled MoE Network and A Benchmark
by: Yan, Jinquan, et al.
Published: (2026)
Visual Text Generation in the Wild
by: Zhu, Yuanzhi, et al.
Published: (2024)
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
by: Nai, Ruiqian, et al.
Published: (2024)
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
by: Wang, Yuqing, et al.
Published: (2025)