:: Library Catalog

Bibliographic Details
Main Authors:	Kaltampanidis, Yannis; Doumanoglou, Alexandros; Zarpalas, Dimitrios
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.15272

Similar Items

Learning Encoding-Decoding Direction Pairs to Unveil Concepts of Influence in Deep Vision Networks
by: Doumanoglou, Alexandros, et al.
Published: (2025)

Unsupervised Interpretable Basis Extraction for Concept-Based Visual Explanations
by: Doumanoglou, Alexandros, et al.
Published: (2023)

Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
by: Shah, Arya, et al.
Published: (2025)

Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey
by: Siméoni, Oriane, et al.
Published: (2023)

Pretrained ViTs Yield Versatile Representations For Medical Images
by: Matsoukas, Christos, et al.
Published: (2023)

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
by: Zhong, Yunshan, et al.
Published: (2023)

Token Cropr: Faster ViTs for Quite a Few Tasks
by: Bergner, Benjamin, et al.
Published: (2024)

VIVID-Med: LLM-Supervised Structured Pretraining for Deployable Medical ViTs
by: Wang, Xiyao, et al.
Published: (2026)

STRAP-ViT: Segregated Tokens with Randomized -- Transformations for Defense against Adversarial Patches in ViTs
by: Chattopadhyay, Nandish, et al.
Published: (2026)

U-REPA: Aligning Diffusion U-Nets to ViTs
by: Tian, Yuchuan, et al.
Published: (2025)

Elastic ViTs from Pretrained Models without Retraining
by: Simoncini, Walter, et al.
Published: (2025)

Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
by: Liu, Jiani, et al.
Published: (2025)

Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting
by: Bafghi, Reza Akbarian, et al.
Published: (2024)

EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
by: Liu, Longfei, et al.
Published: (2026)

Colinearity Decay: Training Quantization-Friendly ViTs with Outlier Decay
by: Tong, Jin, et al.
Published: (2026)

TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers
by: Wang, Zhibo, et al.
Published: (2026)

Intriguing Frequency Interpretation of Adversarial Robustness for CNNs and ViTs
by: Chen, Lu, et al.
Published: (2025)

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
by: Ramachandran, Akshat, et al.
Published: (2024)

UniRefiner: Teaching Pre-trained ViTs to Self-Dispose Dross via Contrastive Register
by: Qiu, Congpei, et al.
Published: (2026)

Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP
by: Balasubramanian, Sriram, et al.
Published: (2024)

Octic Vision Transformers: Quicker ViTs Through Equivariance
by: Nordström, David, et al.
Published: (2025)

Training-Free Acceleration of ViTs with Delayed Spatial Merging
by: Heo, Jung Hwan, et al.
Published: (2023)

Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
by: Hwang, Dongyoon, et al.
Published: (2024)

AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
by: Zheng, Yunling, et al.
Published: (2024)

ConcatPlexer: Additional Dim1 Batching for Faster ViTs
by: Han, Donghoon, et al.
Published: (2023)

Causality $\neq$ Decodability, and Vice Versa: Lessons from Interpreting Counting ViTs
by: Huang, Lianghuan, et al.
Published: (2025)

ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain
by: Mia, Md Sohag, et al.
Published: (2023)

Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey
by: Yunusa, Haruna, et al.
Published: (2024)

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
by: Kim, Donghyun, et al.
Published: (2024)

CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
by: Wang, Ao, et al.
Published: (2023)

ViTs for Action Classification in Videos: An Approach to Risky Tackle Detection in American Football Practice Videos
by: Zaidi, Syed Ahsan Masud, et al.
Published: (2026)

Communication Efficient Split Learning of ViTs with Attention-based Double Compression
by: Alvetreti, Federico, et al.
Published: (2025)

Language-Unlocked ViT (LUViT): Empowering Self-Supervised Vision Transformers with LLMs
by: Kuzucu, Selim, et al.
Published: (2025)

ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection
by: Ma, Yunsheng, et al.
Published: (2022)

Register and [CLS] tokens yield a decoupling of local and global features in large ViTs
by: Lappe, Alexander, et al.
Published: (2025)

Layer by layer, module by module: Choose both for optimal OOD probing of ViT
by: Odonnat, Ambroise, et al.
Published: (2026)

QID: Efficient Query-Informed ViTs in Data-Scarce Regimes for OCR-free Visual Document Understanding
by: Le, Binh M., et al.
Published: (2025)

Rethinking Random Masking in Self-Distillation on ViT
by: Seong, Jihyeon, et al.
Published: (2025)

SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
by: Muzeau, Max, et al.
Published: (2024)

ViT-Lens: Towards Omni-modal Representations
by: Lei, Weixian, et al.
Published: (2023)