| Main Authors: | Kaltampanidis, Yannis, Doumanoglou, Alexandros, Zarpalas, Dimitrios |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.15272 |
Similar Items
Learning Encoding-Decoding Direction Pairs to Unveil Concepts of Influence in Deep Vision Networks
by: Doumanoglou, Alexandros, et al.
Published: (2025)
Unsupervised Interpretable Basis Extraction for Concept-Based Visual Explanations
by: Doumanoglou, Alexandros, et al.
Published: (2023)
Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
by: Shah, Arya, et al.
Published: (2025)
Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey
by: Siméoni, Oriane, et al.
Published: (2023)
Pretrained ViTs Yield Versatile Representations For Medical Images
by: Matsoukas, Christos, et al.
Published: (2023)
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
by: Zhong, Yunshan, et al.
Published: (2023)
Token Cropr: Faster ViTs for Quite a Few Tasks
by: Bergner, Benjamin, et al.
Published: (2024)
VIVID-Med: LLM-Supervised Structured Pretraining for Deployable Medical ViTs
by: Wang, Xiyao, et al.
Published: (2026)
STRAP-ViT: Segregated Tokens with Randomized -- Transformations for Defense against Adversarial Patches in ViTs
by: Chattopadhyay, Nandish, et al.
Published: (2026)
U-REPA: Aligning Diffusion U-Nets to ViTs
by: Tian, Yuchuan, et al.
Published: (2025)
Elastic ViTs from Pretrained Models without Retraining
by: Simoncini, Walter, et al.
Published: (2025)
Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
by: Liu, Jiani, et al.
Published: (2025)
Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting
by: Bafghi, Reza Akbarian, et al.
Published: (2024)
EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
by: Liu, Longfei, et al.
Published: (2026)
Colinearity Decay: Training Quantization-Friendly ViTs with Outlier Decay
by: Tong, Jin, et al.
Published: (2026)
TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers
by: Wang, Zhibo, et al.
Published: (2026)
Intriguing Frequency Interpretation of Adversarial Robustness for CNNs and ViTs
by: Chen, Lu, et al.
Published: (2025)
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
by: Ramachandran, Akshat, et al.
Published: (2024)
UniRefiner: Teaching Pre-trained ViTs to Self-Dispose Dross via Contrastive Register
by: Qiu, Congpei, et al.
Published: (2026)
Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP
by: Balasubramanian, Sriram, et al.
Published: (2024)
Octic Vision Transformers: Quicker ViTs Through Equivariance
by: Nordström, David, et al.
Published: (2025)
Training-Free Acceleration of ViTs with Delayed Spatial Merging
by: Heo, Jung Hwan, et al.
Published: (2023)
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
by: Hwang, Dongyoon, et al.
Published: (2024)
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
by: Zheng, Yunling, et al.
Published: (2024)
ConcatPlexer: Additional Dim1 Batching for Faster ViTs
by: Han, Donghoon, et al.
Published: (2023)
Causality $\neq$ Decodability, and Vice Versa: Lessons from Interpreting Counting ViTs
by: Huang, Lianghuan, et al.
Published: (2025)
ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain
by: Mia, Md Sohag, et al.
Published: (2023)
Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey
by: Yunusa, Haruna, et al.
Published: (2024)
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
by: Kim, Donghyun, et al.
Published: (2024)
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
by: Wang, Ao, et al.
Published: (2023)
ViTs for Action Classification in Videos: An Approach to Risky Tackle Detection in American Football Practice Videos
by: Zaidi, Syed Ahsan Masud, et al.
Published: (2026)
Communication Efficient Split Learning of ViTs with Attention-based Double Compression
by: Alvetreti, Federico, et al.
Published: (2025)
Language-Unlocked ViT (LUViT): Empowering Self-Supervised Vision Transformers with LLMs
by: Kuzucu, Selim, et al.
Published: (2025)
ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection
by: Ma, Yunsheng, et al.
Published: (2022)
Register and [CLS] tokens yield a decoupling of local and global features in large ViTs
by: Lappe, Alexander, et al.
Published: (2025)
Layer by layer, module by module: Choose both for optimal OOD probing of ViT
by: Odonnat, Ambroise, et al.
Published: (2026)
QID: Efficient Query-Informed ViTs in Data-Scarce Regimes for OCR-free Visual Document Understanding
by: Le, Binh M., et al.
Published: (2025)
Rethinking Random Masking in Self-Distillation on ViT
by: Seong, Jihyeon, et al.
Published: (2025)
SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
by: Muzeau, Max, et al.
Published: (2024)
ViT-Lens: Towards Omni-modal Representations
by: Lei, Weixian, et al.
Published: (2023)