| Main Authors: | Shi, Junze; Yu, Yang; Shi, Jian; Luo, Haibo |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.09078 |
Similar Items
Adaptive Perception for Unified Visual Multi-modal Object Tracking
by: Hu, Xiantao, et al.
Published: (2025)
UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network
by: Yao, Siyuan, et al.
Published: (2025)
Explicit Visual Prompts for Visual Object Tracking
by: Shi, Liangtao, et al.
Published: (2024)
Exploring Dynamic Transformer for Efficient Object Tracking
by: Zhu, Jiawen, et al.
Published: (2024)
SpikeTrack: A Spike-driven Framework for Efficient Visual Tracking
by: Zhang, Qiuyang, et al.
Published: (2026)
Exploring Reliable PPG Authentication on Smartwatches in Daily Scenarios
by: Tang, Jiankai, et al.
Published: (2025)
Explicit Context Reasoning with Supervision for Visual Tracking
by: Zeng, Fansheng, et al.
Published: (2025)
Visual Prompt-Agnostic Evolution
by: Wang, Junze, et al.
Published: (2026)
SMTrack: State-Aware Mamba for Efficient Temporal Modeling in Visual Tracking
by: Ma, Yinchao, et al.
Published: (2026)
AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation
by: Wang, Zili, et al.
Published: (2024)
Radar and Camera Fusion for Object Detection and Tracking: A Comprehensive Survey
by: Shi, Kun, et al.
Published: (2024)
Exploring Spatiotemporal Feature Propagation for Video-Level Compressive Spectral Reconstruction: Dataset, Model and Benchmark
by: Cai, Lijing, et al.
Published: (2026)
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
by: Yang, Xiangyang, et al.
Published: (2024)
Highly Efficient 3D Human Pose Tracking from Events with Spiking Spatiotemporal Transformer
by: Zou, Shihao, et al.
Published: (2023)
SRRT: Exploring Search Region Regulation for Visual Object Tracking
by: Zhu, Jiawen, et al.
Published: (2022)
CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception
by: Zhong, Jiaru, et al.
Published: (2025)
Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review
by: Shi, Yining, et al.
Published: (2023)
Efficient Motion Prompt Learning for Robust Visual Tracking
by: Zhao, Jie, et al.
Published: (2025)
Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization
by: Hu, Jiajun, et al.
Published: (2024)
Towards General Multimodal Visual Tracking
by: Lu, Andong, et al.
Published: (2025)
Self-Creative Text-to-Object Generation using Semantic-Aware Spatial Weighting
by: Yu, Yue, et al.
Published: (2026)
Improving Accuracy and Generalization for Efficient Visual Tracking
by: Zaveri, Ram, et al.
Published: (2024)
OpenECAD: An Efficient Visual Language Model for Editable 3D-CAD Design
by: Yuan, Zhe, et al.
Published: (2024)
Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation
by: Ma, Qinghe, et al.
Published: (2024)
Region Matters: Efficient and Reliable Region-Aware Visual Place Recognition
by: Chen, Shunpeng, et al.
Published: (2026)
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
by: Jain, Jitesh, et al.
Published: (2024)
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
by: Pan, Yuwen, et al.
Published: (2024)
Fast Window-Based Event Denoising with Spatiotemporal Correlation Enhancement
by: Fang, Huachen, et al.
Published: (2024)
Dynamic Updates for Language Adaptation in Visual-Language Tracking
by: Li, Xiaohai, et al.
Published: (2025)
EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration
by: Wu, Daiqing, et al.
Published: (2025)
Context Cascade Compression: Exploring the Upper Limits of Text Compression
by: Liu, Fanfan, et al.
Published: (2025)
Frames2Residual: Spatiotemporal Decoupling for Self-Supervised Video Denoising
by: Ji, Mingjie, et al.
Published: (2026)
SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising
by: Zhou, Tao, et al.
Published: (2024)
Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection
by: Xu, Yuting, et al.
Published: (2024)
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations
by: Liang, Tianming, et al.
Published: (2025)
From Two-Stream to One-Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation
by: Luo, Yang, et al.
Published: (2024)
Can KAN Work? Exploring the Potential of Kolmogorov-Arnold Networks in Computer Vision
by: Cang, Yueyang, et al.
Published: (2024)
Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking
by: Ge, Jiawei, et al.
Published: (2023)
TrackVLA: Embodied Visual Tracking in the Wild
by: Wang, Shaoan, et al.
Published: (2025)
Information Coordination as a Bridge: A Neuro-Symbolic Architecture for Reliable Autonomous Driving Scene Understanding
by: Liu, Shuo, et al.
Published: (2026)