| Main Authors: | Nikzad, Nick, Liao, Yi, Gao, Yongsheng, Zhou, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.19850 |
Similar Items
CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks
by: Nikzad, Nick, et al.
Published: (2024)
TraNCE: Transformative Non-linear Concept Explainer for CNNs
by: Akpudo, Ugochukwu Ejike, et al.
Published: (2025)
Neuron Abandoning Attention Flow: Visual Explanation of Dynamics inside CNN Models
by: Liao, Yi, et al.
Published: (2024)
Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge
by: Eliopoulos, Nick John, et al.
Published: (2024)
SPoT: Subpixel Placement of Tokens in Vision Transformers
by: Hjelkrem-Tan, Martine, et al.
Published: (2025)
Uncertainty-DTW for Sequences and Visual Tokens
by: Wang, Lei, et al.
Published: (2026)
Token Turing Machines are Efficient Vision Models
by: Jajal, Purvish, et al.
Published: (2024)
Positional Encodings Anchor Spatial Structure in Vision Transformers: A Geometric Perspective on Robustness
by: Mannes, Mahmoud
Published: (2026)
DPL: Decoupled Prototype Learning for Enhancing Robustness of Vision-Language Transformers to Missing Modalities
by: Lu, Jueqing, et al.
Published: (2025)
AdaPerceiver: Transformers with Adaptive Width, Depth, and Tokens
by: Jajal, Purvish, et al.
Published: (2025)
ZACH-ViT: A Zero-Token Vision Transformer with ShuffleStrides Data Augmentation for Robust Lung Ultrasound Classification
by: Angelakis, Athanasios, et al.
Published: (2025)
Robustness Tokens: Towards Adversarial Robustness of Transformers
by: Pulfer, Brian, et al.
Published: (2025)
Energy-Regularized Spatial Masking: A Novel Approach to Enhancing Robustness and Interpretability in Vision Models
by: Devynck, Tom, et al.
Published: (2026)
Dynamic Accumulated Attention Map for Interpreting Evolution of Decision-Making in Vision Transformer
by: Liao, Yi, et al.
Published: (2025)
Leveraging Registers in Vision Transformers for Robust Adaptation
by: Yellapragada, Srikar, et al.
Published: (2025)
ToFe: Lagged Token Freezing and Reusing for Efficient Vision Transformer Inference
by: Zhang, Haoyue, et al.
Published: (2025)
Inducing Spatial Locality in Vision Transformers through the Training Protocol
by: Toledo, Eduardo Santiago, et al.
Published: (2026)
From Edges to Depth: Probing the Spatial Hierarchy in Vision Transformers
by: Sanghavi, Jainum
Published: (2026)
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
by: Liu, Haoyang, et al.
Published: (2024)
Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection
by: Wang, Lei, et al.
Published: (2026)
Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers
by: Islam, Chashi Mahiul, et al.
Published: (2025)
Trust-Aware Joint Feature-Prediction Discrepancy for Robust Domain Adaptation
by: Ding, Xi, et al.
Published: (2026)
Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning
by: Littwin, Etai, et al.
Published: (2024)
CMAViT: Integrating Climate, Managment, and Remote Sensing Data for Crop Yield Estimation with Multimodel Vision Transformers
by: Kamangir, Hamid, et al.
Published: (2024)
MABViT -- Modified Attention Block Enhances Vision Transformers
by: Ramesh, Mahesh, et al.
Published: (2023)
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
by: Chen, Yangyi, et al.
Published: (2025)
LayerShuffle: Enhancing Robustness in Vision Transformers by Randomizing Layer Execution Order
by: Freiberger, Matthias, et al.
Published: (2024)
Token Caching for Diffusion Transformer Acceleration
by: Lou, Jinming, et al.
Published: (2024)
TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration
by: Olszewski, Jan, et al.
Published: (2023)
Propensity-driven Uncertainty Learning for Sample Exploration in Source-Free Active Domain Adaptation
by: Pan, Zicheng, et al.
Published: (2025)
ELSA: Exact Linear-Scan Attention for Fast and Memory-Light Vision Transformers
by: Hsu, Chih-Chung, et al.
Published: (2026)
Split Adaptation for Pre-trained Vision Transformers
by: Wang, Lixu, et al.
Published: (2025)
Efficient Visual Transformer by Learnable Token Merging
by: Wang, Yancheng, et al.
Published: (2024)
DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers
by: Yehezkel, Oryan, et al.
Published: (2024)
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
by: He, Yang, et al.
Published: (2024)
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
by: Hu, Xixu, et al.
Published: (2024)
HQViT: Hybrid Quantum Vision Transformer for Image Classification
by: Zhang, Hui, et al.
Published: (2025)
Unlocking Noise-Resistant Vision: Key Architectural Secrets for Robust Models
by: Kim, Bum Jun, et al.
Published: (2025)
Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models
by: Yu, Yongsheng, et al.
Published: (2024)
CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones
by: Gong, Wenyi, et al.
Published: (2025)