| Main Author: | Huang, Yufeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.17334 |
Similar Items
Superlinear Multi-Step Attention
by: Huang, Yufeng
Published: (2026)
A Provable Expressiveness Hierarchy in Hybrid Linear-Full Attention
by: Ye, Xiaowei, et al.
Published: (2026)
Efficient Attention: Attention with Linear Complexities
by: Shen, Zhuoran, et al.
Published: (2018)
Log-Linear Attention
by: Guo, Han, et al.
Published: (2025)
Why Softmax Attention Outperforms Linear Attention
by: Deng, Yichuan, et al.
Published: (2023)
Exact Linear Attention
by: Ou, Weinuo
Published: (2026)
Kaczmarz Linear Attention
by: Zou, Jiaxuan, et al.
Published: (2026)
SEA: Sparse Linear Attention with Estimated Attention Mask
by: Lee, Heejun, et al.
Published: (2023)
Rethinking Transformer Connectivity: TLinFormer, A Path to Exact, Full Context-Aware Linear Attention
by: Tang, Zhongpan
Published: (2025)
Efficiently Dispatching Flash Attention For Partially Filled Attention Masks
by: Sharma, Agniv, et al.
Published: (2024)
An Analysis of Linear Complexity Attention Substitutes with BEST-RQ
by: Whetten, Ryan, et al.
Published: (2024)
Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
by: Zuo, Yifei, et al.
Published: (2025)
Transolver is a Linear Transformer: Revisiting Physics-Attention through the Lens of Linear Attention
by: Hu, Wenjie, et al.
Published: (2025)
Token Sample Complexity of Attention
by: Bohbot, Léa, et al.
Published: (2025)
Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)
Cottention: Linear Transformers With Cosine Attention
by: Mongaras, Gabriel, et al.
Published: (2024)
Linear Attention Sequence Parallelism
by: Sun, Weigao, et al.
Published: (2024)
The Key to State Reduction in Linear Attention: A Rank-based Perspective
by: Nazari, Philipp, et al.
Published: (2026)
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
by: Zhang, Yufeng, et al.
Published: (2022)
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention
by: Huang, Yulong, et al.
Published: (2026)
Quantum Complex-Valued Self-Attention Model
by: Chen, Fu, et al.
Published: (2025)
Linear Attention for Joint Power Optimization and User-Centric Clustering in Cell-Free Networks
by: Chafaa, Irched, et al.
Published: (2025)
InAttention: Linear Context Scaling for Transformers
by: Eisner, Joseph
Published: (2024)
Linear Memory SE(2) Invariant Attention
by: Pronovost, Ethan, et al.
Published: (2025)
Training Dynamics of In-Context Learning in Linear Attention
by: Zhang, Yedi, et al.
Published: (2025)
PowerAttention: Exponentially Scaling of Receptive Fields for Effective Sparse Attention
by: Chen, Lida, et al.
Published: (2025)
Global Attention with Linear Complexity for Exascale Generative Data Assimilation in Earth System Prediction
by: Wang, Xiao, et al.
Published: (2026)
Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective
by: Boursier, Etienne, et al.
Published: (2025)
Stochastic Attention: Connectome-Inspired Randomized Routing for Expressive Linear-Time Attention
by: Jin, Zehao, et al.
Published: (2026)
Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)
Enhancing Linear Attention with Residual Learning
by: Lai, Xunhao, et al.
Published: (2025)
Attention-based clustering
by: Maulen-Soto, Rodrigo, et al.
Published: (2025)
Kimi Linear: An Expressive, Efficient Attention Architecture
by: Kimi Team, et al.
Published: (2025)
Neural Attention Search Linear: Towards Adaptive Token-Level Hybrid Attention Models
by: Deng, Difan, et al.
Published: (2026)
Alleviating Forgetfulness of Linear Attention by Hybrid Sparse Attention and Contextualized Learnable Token Eviction
by: He, Mutian, et al.
Published: (2025)
Linear Predictability of Attention Heads in Large Language Models
by: Shaikh, Khalid, et al.
Published: (2026)
WildCat: Near-Linear Attention in Theory and Practice
by: Schröder, Tobias, et al.
Published: (2026)
Hybrid Focal and Full-Range Attention Based Graph Transformers
by: Zhu, Minhong, et al.
Published: (2023)
LUNA: Linear Universal Neural Attention with Generalization Guarantees
by: Shahbazi, Ashkan, et al.
Published: (2025)
Adaptive Memory Decay for Log-Linear Attention
by: Amin, Yaxita, et al.
Published: (2026)