Library Catalog

Bibliographic Details
Main Authors:	Wen, Qishuai; Huang, Zhiyuan; Meng, Xianghan; He, Wei; Li, Chun-Guang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.01219

Similar Items

Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
by: Wen, Qishuai, et al.
Published: (2025)

Rethinking Decoders for Transformer-based Semantic Segmentation: A Compression Perspective
by: Wen, Qishuai, et al.
Published: (2024)

Exploring a Principled Framework for Deep Subspace Clustering
by: Meng, Xianghan, et al.
Published: (2025)

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning
by: Zhang, Jintao, et al.
Published: (2026)

Multi-Modal Representation Learning via Semi-Supervised Rate Reduction for Generalized Category Discovery
by: He, Wei, et al.
Published: (2026)

Jointly Learning Structured Representations and Stabilized Affinity for Human Motion Segmentation
by: Meng, Xianghan, et al.
Published: (2026)

Temporal Rate Reduction Clustering for Human Motion Segmentation
by: Meng, Xianghan, et al.
Published: (2025)

MoH: Multi-Head Attention as Mixture-of-Head Attention
by: Jin, Peng, et al.
Published: (2024)

Bootstrapping Top-down Information for Self-modulating Slot Attention
by: Kim, Dongwon, et al.
Published: (2024)

Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers
by: Liu, Yuxi, et al.
Published: (2026)

Elastic Attention Cores for Scalable Vision Transformers
by: Song, Alan Z., et al.
Published: (2026)

Learning Informative Attention Weights for Person Re-Identification
by: Wang, Yancheng, et al.
Published: (2025)

Clebsch-Gordan Transformer: Fast and Global Equivariant Attention
by: Howell, Owen Lewis, et al.
Published: (2025)

Reinforced Attention Learning
by: Li, Bangzheng, et al.
Published: (2026)

Memory Efficient Neural Processes via Constant Memory Attention Block
by: Feng, Leo, et al.
Published: (2023)

Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
by: Shen, Li, et al.
Published: (2024)

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
by: Tang, Anke, et al.
Published: (2024)

MonarchRT: Efficient Attention for Real-Time Video Generation
by: Agarwal, Krish, et al.
Published: (2026)

SageAttention2++: A More Efficient Implementation of SageAttention2
by: Zhang, Jintao, et al.
Published: (2025)

Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate
by: Lee, Byung Hyun, et al.
Published: (2025)

Uncertainty-Guided Attention and Entropy-Weighted Loss for Precise Plant Seedling Segmentation
by: Ehab, Mohamed, et al.
Published: (2026)

A Scalable Attention-Based Approach for Image-to-3D Texture Mapping
by: Rampini, Arianna, et al.
Published: (2025)

GD-FPS: Growth-Driven Feedforward Parameter Selection for Efficient Fine-Tuning
by: Yang, Kenneth, et al.
Published: (2025)

Attention in Geometry: Scalable Spatial Modeling via Adaptive Density Fields and FAISS-Accelerated Kernels
by: Fan, Zhaowen
Published: (2026)

ELSA: Exact Linear-Scan Attention for Fast and Memory-Light Vision Transformers
by: Hsu, Chih-Chung, et al.
Published: (2026)

STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference
by: Guo, Yichen, et al.
Published: (2025)

EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
by: Becker, Philipp, et al.
Published: (2025)

Attention Guided Alignment in Efficient Vision-Language Models
by: Mahajan, Shweta, et al.
Published: (2025)

DataDAM: Efficient Dataset Distillation with Attention Matching
by: Sajedi, Ahmad, et al.
Published: (2023)

Synthesizer Based Efficient Self-Attention for Vision Tasks
by: Zhu, Guangyang, et al.
Published: (2022)

FasterViT: Fast Vision Transformers with Hierarchical Attention
by: Hatamizadeh, Ali, et al.
Published: (2023)

Shiva-DiT: Residual-Based Differentiable Top-$k$ Selection for Efficient Diffusion Transformers
by: Zhang, Jiaji, et al.
Published: (2026)

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
by: Li, Sheng, et al.
Published: (2024)

Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers
by: Gabetni, Firas, et al.
Published: (2025)

ASAP: Attention-Shift-Aware Pruning for Efficient LVLM Inference
by: Pathak, Surendra, et al.
Published: (2026)

AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
by: Shan, Jiquan, et al.
Published: (2025)

Beyond Top Activations: Efficient and Reliable Crowdsourced Evaluation of Automated Interpretability
by: Oikarinen, Tuomas, et al.
Published: (2025)

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
by: Hassani, Ali, et al.
Published: (2024)

Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-Label Classification
by: Wang, Zitai, et al.
Published: (2024)

Reflecting Topology Consistency and Abnormality via Learnable Attentions for Airway Labeling
by: Li, Chenyu, et al.
Published: (2024)