:: Library Catalog

Bibliographic Details
Main Authors:	Hu, Xixu; Zheng, Runkai; Wang, Jindong; Leung, Cheuk Hang; Wu, Qi; Xie, Xing
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2402.03317

Similar Items

Exploring Scale Shift in Crowd Localization under the Context of Domain Generalization
by: Wang, Juncheng, et al.
Published: (2025)

Proto-Former: Unified Facial Landmark Detection by Prototype Transformer
by: Hu, Shengkai, et al.
Published: (2025)

SpecGuard: Spectral Projection-based Advanced Invisible Watermarking
by: Alam, Inzamamul, et al.
Published: (2025)

Risk-Neutral Generative Networks
by: Xian, Zhonghao, et al.
Published: (2024)

Distributionally Robust Policy Evaluation and Learning for Continuous Treatment with Observational Data
by: Leung, Cheuk Hang, et al.
Published: (2025)

SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer
by: Li, Wenxi, et al.
Published: (2025)

Probabilistic Learning of Multivariate Time Series with Temporal Irregularity
by: Li, Yijun, et al.
Published: (2023)

Distribution-valued Causal Machine Learning: Implications of Credit on Spending Patterns
by: Leung, Cheuk Hang, et al.
Published: (2025)

PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)

SLAM-Former: Putting SLAM into One Transformer
by: Yuan, Yijun, et al.
Published: (2025)

FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction
by: Chen, Fangda, et al.
Published: (2026)

Boosting 3D Neuron Segmentation with 2D Vision Transformer Pre-trained on Natural Images
by: Cheng, Yik San, et al.
Published: (2024)

PlankFormer: Robust Plankton Instance Segmentation via MAE-Pretrained Vision Transformers and Pseudo Community Image Generation
by: Miyazaki, Masaharu, et al.
Published: (2026)

Finetune Like You Pretrain: Boosting Zero-shot Adversarial Robustness in Vision-language Models
by: Xing, Songlong, et al.
Published: (2026)

Unveiling the Potential of Robustness in Selecting Conditional Average Treatment Effect Estimators
by: Huang, Yiyan, et al.
Published: (2024)

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies
by: Nafez, Mojtaba, et al.
Published: (2025)

SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank Transformation
by: Elsayed, Abdelrahman, et al.
Published: (2025)

CountFormer: Multi-View Crowd Counting Transformer
by: Mo, Hong, et al.
Published: (2024)

HexFormer: Hyperbolic Vision Transformer with Exponential Map Aggregation
by: Alyoussef, Haya, et al.
Published: (2026)

TreeFormers -- An Exploration of Vision Transformers for Deforestation Driver Classification
by: Ochuba, Uche
Published: (2024)

CLIP-SVD: Efficient and Interpretable Vision-Language Adaptation via Singular Values
by: Koleilat, Taha, et al.
Published: (2025)

AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
by: Shan, Jiquan, et al.
Published: (2025)

SpecSAR-Former: A Lightweight Transformer-based Network for Global LULC Mapping Using Integrated Sentinel-1 and Sentinel-2
by: Yu, Hao, et al.
Published: (2024)

PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification
by: Tan, Lei, et al.
Published: (2024)

ImplantFormer: Vision Transformer based Implant Position Regression Using Dental CBCT Data
by: Yang, Xinquan, et al.
Published: (2022)

LoFormer: Local Frequency Transformer for Image Deblurring
by: Mao, Xintian, et al.
Published: (2024)

AuthGuard: Generalizable Deepfake Detection via Language Guidance
by: Shen, Guangyu, et al.
Published: (2025)

TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
by: Chen, Tianxiang, et al.
Published: (2024)

PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers
by: Aniraj, Ananthu, et al.
Published: (2024)

SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
by: Chen, Yutong, et al.
Published: (2024)

Bootstrapping SparseFormers from Vision Foundation Models
by: Gao, Ziteng, et al.
Published: (2023)

Latent Guard: a Safety Framework for Text-to-image Generation
by: Liu, Runtao, et al.
Published: (2024)

NeuroSeg Meets DINOv3: Transferring 2D Self-Supervised Visual Priors to 3D Neuron Segmentation via DINOv3 Initialization
by: Cheng, Yik San, et al.
Published: (2026)

TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection
by: Qi, Qiang, et al.
Published: (2025)

VistaFormer: Scalable Vision Transformers for Satellite Image Time Series Segmentation
by: MacDonald, Ezra, et al.
Published: (2024)

DuoFormer: Leveraging Hierarchical Representations by Local and Global Attention Vision Transformer
by: Tang, Xiaoya, et al.
Published: (2025)

Learning Visual Prompts for Guiding the Attention of Vision Transformers
by: Rezaei, Razieh, et al.
Published: (2024)

Beyond Retraining: Training-Free Unknown Class Filtering for Source-Free Open Set Domain Adaptation of Vision-Language Models
by: Li, Yongguang, et al.
Published: (2025)

iFormer: Integrating ConvNet and Transformer for Mobile Application
by: Zheng, Chuanyang
Published: (2025)

SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Human Action Segmentation
by: Liu, Qi, et al.
Published: (2023)