| Main Authors: | Polezhaev, Ignat; Goncharenko, Igor; Iurina, Natalya |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.19501 |
Similar Items
Case-Enhanced Vision Transformer: Improving Explanations of Image Similarity with a ViT-based Similarity Metric
by: Zhao, Ziwei, et al.
Published: (2024)
SAC-ViT: Semantic-Aware Clustering Vision Transformer with Early Exit
by: Hu, Youbing, et al.
Published: (2025)
ButterflyViT: 354$\times$ Expert Compression for Edge Vision Transformers
by: Karmore, Aryan
Published: (2026)
ViT-2SPN: Vision Transformer-based Dual-Stream Self-Supervised Pretraining Networks for Retinal OCT Classification
by: Saraei, Mohammadreza, et al.
Published: (2025)
Beyond Translation: Cross-Cultural Meme Transcreation with Vision-Language Models
by: Zhao, Yuming, et al.
Published: (2026)
LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition
by: Hu, Youbing, et al.
Published: (2024)
DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning
by: Tong, Yujia, et al.
Published: (2025)
bViT: Investigating Single-Block Recurrence in Vision Transformers for Image Recognition
by: Byra, Michal, et al.
Published: (2026)
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
by: Xu, Xuwei, et al.
Published: (2023)
FasterViT: Fast Vision Transformers with Hierarchical Attention
by: Hatamizadeh, Ali, et al.
Published: (2023)
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
by: Shi, Huihong, et al.
Published: (2024)
JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search
by: Zou, Dongyun, et al.
Published: (2026)
ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain
by: Mia, Md Sohag, et al.
Published: (2023)
CascadedViT: Cascaded Chunk-FeedForward and Cascaded Group Attention Vision Transformer
by: Sivakumar, Srivathsan, et al.
Published: (2025)
Tiny-ViT: A Compact Vision Transformer for Efficient and Explainable Potato Leaf Disease Classification
by: Mia, Shakil, et al.
Published: (2026)
IPTQ-ViT: Post-Training Quantization of Non-linear Functions for Integer-only Vision Transformers
by: Kim, Gihwan, et al.
Published: (2025)
VariViT: A Vision Transformer for Variable Image Sizes
by: Varma, Aswathi, et al.
Published: (2026)
ScriptViT: Vision Transformer-Based Personalized Handwriting Generation
by: Acharya, Sajjan, et al.
Published: (2025)
Octic Vision Transformers: Quicker ViTs Through Equivariance
by: Nordström, David, et al.
Published: (2025)
Event-Based Eye Tracking. 2025 Event-based Vision Workshop
by: Chen, Qinyu, et al.
Published: (2025)
CornViT: A Multi-Stage Convolutional Vision Transformer Framework for Hierarchical Corn Kernel Analysis
by: Erukude, Sai Teja, et al.
Published: (2025)
RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers
by: Xu, Xuwei, et al.
Published: (2025)
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
by: Yan, Siming, et al.
Published: (2024)
Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
by: Lee, Jemin, et al.
Published: (2023)
[Re] Improving Interpretation Faithfulness for Vision Transformers
by: Kurek, Izabela, et al.
Published: (2025)
Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers
by: Li, Zhengang, et al.
Published: (2024)
TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers
by: Wang, Zhibo, et al.
Published: (2026)
TinyViT-Batten: Few-Shot Vision Transformer with Explainable Attention for Early Batten-Disease Detection on Pediatric MRI
by: Uppalapati, Khartik, et al.
Published: (2025)
MAVEN: A Multi-Agent Framework for Multicultural Text-to-Video Generation
by: Li, Shuowei, et al.
Published: (2026)
When Cultures Meet: Multicultural Text-to-Image Generation
by: Bhalerao, Parth, et al.
Published: (2025)
Event-Based Eye Tracking. AIS 2024 Challenge Survey
by: Wang, Zuowen, et al.
Published: (2024)
Evaluating Image-Based Face and Eye Tracking with Event Cameras
by: Iddrisu, Khadija, et al.
Published: (2024)
Improved Ear Verification with Vision Transformers and Overlapping Patches
by: Arun, Deeksha, et al.
Published: (2025)
PEANO-ViT: Power-Efficient Approximations of Non-Linearities in Vision Transformers
by: Sadeghi, Mohammad Erfan, et al.
Published: (2024)
Equi-ViT: Rotational Equivariant Vision Transformer for Robust Histopathology Analysis
by: Chen, Fuyao, et al.
Published: (2026)
LoViT: Long Video Transformer for Surgical Phase Recognition
by: Liu, Yang, et al.
Published: (2023)
Sub-token ViT Embedding via Stochastic Resonance Transformers
by: Lao, Dong, et al.
Published: (2023)
SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection
by: Ataiefard, Foozhan, et al.
Published: (2024)
The Progression of Transformers from Language to Vision to MOT: A Literature Review on Multi-Object Tracking with Transformers
by: Kamboj, Abhi
Published: (2024)
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
by: Zhang, Chunhui, et al.
Published: (2025)