Library Catalog

Bibliographic Details
Main Authors:	Naruko, Takahiro; Akutsu, Hiroaki
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2506.01519

Similar Items

Comprehensive Survey of Model Compression and Speed up for Vision Transformers
by: Chen, Feiyang, et al.
Published: (2024)

Patch Pruning Strategy Based on Robust Statistical Measures of Attention Weight Diversity in Vision Transformers
by: Igaue, Yuki, et al.
Published: (2025)

There is More to Attention: Statistical Filtering Enhances Explanations in Vision Transformers
by: Ayyar, Meghna P, et al.
Published: (2025)

ToSA: Token Selective Attention for Efficient Vision Transformers
by: Singh, Manish Kumar, et al.
Published: (2024)

Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
by: Chen, Junjie, et al.
Published: (2025)

Attention Debiasing for Token Pruning in Vision Language Models
by: Zhao, Kai, et al.
Published: (2025)

PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)

HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer
by: Uddin, Mohammad Helal, et al.
Published: (2025)

TCSAFormer: Efficient Vision Transformer with Token Compression and Sparse Attention for Medical Image Segmentation
by: Xia, Zunhui, et al.
Published: (2025)

Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
by: Lee, Sanghyeok, et al.
Published: (2024)

Vision Transformer with Super Token Sampling
by: Huang, Huaibo, et al.
Published: (2022)

Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association
by: Liu, Tingwei, et al.
Published: (2024)

Decorrelation Speeds Up Vision Transformers
by: Carrigg, Kieran, et al.
Published: (2025)

ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer
by: Sun, Shihua, et al.
Published: (2024)

SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection
by: Ataiefard, Foozhan, et al.
Published: (2024)

SPOT: Sparsification with Attention Dynamics via Token Relevance in Vision Transformers
by: Schlesinger, Oded, et al.
Published: (2025)

HiPrune: Hierarchical Attention for Efficient Token Pruning in Vision-Language Models
by: Liu, Jizhihui, et al.
Published: (2025)

LearnPruner: Rethinking Attention-based Token Pruning in Vision Language Models
by: Takezoe, Rinyoichi, et al.
Published: (2026)

Visual-Word Tokenizer: Beyond Fixed Sets of Tokens in Vision Transformers
by: Gee, Leonidas, et al.
Published: (2024)

Superpixel Tokenization for Vision Transformers: Preserving Semantic Integrity in Visual Tokens
by: Lew, Jaihyun, et al.
Published: (2024)

Representative Attention For Vision Transformers
by: Li, Yuntong, et al.
Published: (2026)

Vision Transformers with Hierarchical Attention
by: Liu, Yun, et al.
Published: (2021)

Fairness-aware Vision Transformer via Debiased Self-Attention
by: Qiang, Yao, et al.
Published: (2023)

Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration
by: Zeng, Fanhu, et al.
Published: (2025)

Wavelet-Based Image Tokenizer for Vision Transformers
by: Zhu, Zhenhai, et al.
Published: (2024)

Flatness-aware Curriculum Learning via Adversarial Difficulty
by: Aizawa, Hiroaki, et al.
Published: (2025)

Vision Transformers are Circulant Attention Learners
by: Han, Dongchen, et al.
Published: (2025)

FViT: A Focal Vision Transformer with Gabor Filter
by: Shi, Yulong, et al.
Published: (2024)

Structured Initialization for Attention in Vision Transformers
by: Zheng, Jianqiao, et al.
Published: (2024)

Multi-manifold Attention for Vision Transformers
by: Konstantinidis, Dimitrios, et al.
Published: (2022)

HAViT: Historical Attention Vision Transformer
by: Banik, Swarnendu, et al.
Published: (2026)

PPT: Token Pruning and Pooling for Efficient Vision Transformers
by: Wu, Xinjian, et al.
Published: (2023)

ViTOC: Vision Transformer and Object-aware Captioner
by: Huang, Feiyang
Published: (2024)

Stable at Any Speed: Speed-Driven Multi-Object Tracking with Learnable Kalman Filtering
by: Gong, Yan, et al.
Published: (2025)

PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model
by: Arif, Kazi Hasan Ibn, et al.
Published: (2025)

Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
by: Wu, Junyi, et al.
Published: (2024)

Multi-dimension Transformer with Attention-based Filtering for Medical Image Segmentation
by: Wang, Wentao, et al.
Published: (2024)

Context-Aware Token Pruning and Discriminative Selective Attention for Transformer Tracking
by: Kugarajeevan, Janani, et al.
Published: (2025)

Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction
by: Liu, Yao, et al.
Published: (2023)

Polyline Path Masked Attention for Vision Transformer
by: Zhao, Zhongchen, et al.
Published: (2025)