| Main Authors: | Du, Chengyi, Jin, Keyan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2504.10048 |
Similar Items
Siamese-DETR for Generic Multi-Object Tracking
by: Liu, Qiankun, et al.
Published: (2023)
CoTZero: Annotation-Free Human-Like Vision Reasoning via Hierarchical Synthetic CoT
by: Du, Chengyi, et al.
Published: (2026)
Contrastive Learning for Multi-Object Tracking with Transformers
by: De Plaen, Pierre-François, et al.
Published: (2023)
MotionGrounder: Grounded Multi-Object Motion Transfer via Diffusion Transformer
by: Teodoro, Samuel, et al.
Published: (2026)
SiamMo: Siamese Motion-Centric 3D Object Tracking
by: Yang, Yuxiang, et al.
Published: (2024)
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
by: Tan, Chaolei, et al.
Published: (2024)
MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer
by: Chen, Geng, et al.
Published: (2024)
Asymmetrical Siamese Network for Point Clouds Normal Estimation
by: Jin, Wei, et al.
Published: (2024)
Multi-Rationale Explainable Object Recognition via Contrastive Conditional Inference
by: Rasekh, Ali, et al.
Published: (2025)
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation
by: Wan, Zifu, et al.
Published: (2024)
Siamese Transformer Networks for Few-shot Image Classification
by: Jiang, Weihao, et al.
Published: (2024)
IDT: A Physically Grounded Transformer for Feed-Forward Multi-View Intrinsic Decomposition
by: Du, Kang, et al.
Published: (2025)
CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
by: Zhang, Hui, et al.
Published: (2024)
Multi-Scale Contrastive Learning for Video Temporal Grounding
by: Nguyen, Thong Thanh, et al.
Published: (2024)
Open-Vocabulary Indoor Object Grounding with 3D Hierarchical Scene Graph
by: Linok, Sergey, et al.
Published: (2025)
Object Hallucination-Free Reinforcement Unlearning for Vision-Language Models
by: Jia, Kaidi, et al.
Published: (2026)
T-SiamTPN: Temporal Siamese Transformer Pyramid Networks for Robust and Efficient UAV Tracking
by: Ardi, Hojat, et al.
Published: (2025)
Multi-Object Tracking by Hierarchical Visual Representations
by: Cao, Jinkun, et al.
Published: (2024)
Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images
by: Du, Zewen, et al.
Published: (2024)
Interactive Instance Annotation with Siamese Networks
by: Xu, Xiang, et al.
Published: (2025)
Lightweight Full-Convolutional Siamese Tracker
by: Li, Yunfeng, et al.
Published: (2023)
Representation Alignment Contrastive Regularization for Multi-Object Tracking
by: Liu, Zhonglin, et al.
Published: (2024)
Exploring Simple Siamese Network for High-Resolution Video Quality Assessment
by: Shen, Guotao, et al.
Published: (2025)
Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy
by: Gomez, Jorge Tapias, et al.
Published: (2025)
Object Affordance Recognition and Grounding via Multi-scale Cross-modal Representation Learning
by: Wan, Xinhang, et al.
Published: (2025)
ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection
by: Song, Ziying, et al.
Published: (2024)
SiamGM: Siamese Geometry-Aware and Motion-Guided Network for Real-Time Satellite Video Object Tracking
by: Wen, Zixiao, et al.
Published: (2026)
Hierarchical Neural Collapse Detection Transformer for Class Incremental Object Detection
by: Pham, Duc Thanh, et al.
Published: (2025)
SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment via Contrast Learning for Multimodal Object Detection
by: Dong, Shuhan, et al.
Published: (2024)
Siamese Vision Transformers are Scalable Audio-visual Learners
by: Lin, Yan-Bo, et al.
Published: (2024)
Learning to Balance: Decoupled Siamese Diffusion Transformer for Reference-Based Remote Sensing Image Super-Resolution
by: Luo, Bin, et al.
Published: (2026)
THCRL: Trusted Hierarchical Contrastive Representation Learning for Multi-View Clustering
by: Zhu, Jian
Published: (2025)
Smart Feature is What You Need
by: Hu, Zhaoxin, et al.
Published: (2024)
Hyperbolic Hierarchical Contrastive Hashing
by: Wei, Rukai, et al.
Published: (2022)
Balanced Hierarchical Contrastive Learning with Decoupled Queries for Fine-grained Object Detection in Remote Sensing Images
by: Chen, Jingzhou, et al.
Published: (2025)
Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection
by: Yao, Siyuan, et al.
Published: (2024)
Hierarchical Side-Tuning for Vision Transformers
by: Lin, Weifeng, et al.
Published: (2023)
Improving Object Detection via Local-global Contrastive Learning
by: Triantafyllidou, Danai, et al.
Published: (2024)
Self-Supervised Siamese Autoencoders
by: Baier, Friederike, et al.
Published: (2023)
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
by: Sun, Penglei, et al.
Published: (2024)