Library Catalog

Bibliographic Details
Main Authors:	Zou, Jiaxuan; Ren, Ruifeng; Liu, Yong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.08587

Similar Items

Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity
by: Ren, Ruifeng, et al.
Published: (2025)

Exploring the Limitations of Mamba in COPY and CoT Reasoning
by: Ren, Ruifeng, et al.
Published: (2024)

T-SKM-Net: Trainable Neural Network Framework for Linear Constraint Satisfaction via Sampling Kaczmarz-Motzkin Method
by: Zhu, Haoyu, et al.
Published: (2025)

Transolver is a Linear Transformer: Revisiting Physics-Attention through the Lens of Linear Attention
by: Hu, Wenjie, et al.
Published: (2025)

Capabilities and Fundamental Limits of Latent Chain-of-Thought
by: Zou, Jiaxuan, et al.
Published: (2026)

KVBuffer: IO-aware Serving for Linear Attention
by: Zou, Longwei, et al.
Published: (2026)

Compositional Generalization from Learned Skills via CoT Training: A Theoretical and Structural Analysis for Reasoning
by: Yao, Xinhao, et al.
Published: (2025)

Effective Frontiers: A Unification of Neural Scaling Laws
by: Zou, Jiaxuan, et al.
Published: (2026)

Superiority of Multi-Head Attention in In-Context Linear Regression
by: Cui, Yingqian, et al.
Published: (2024)

Exact Linear Attention
by: Ou, Weinuo
Published: (2026)

Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
by: Zuo, Yifei, et al.
Published: (2025)

State Rank Dynamics in Linear Attention LLMs
by: Sun, Ao, et al.
Published: (2026)

Enhancing Linear Attention with Residual Learning
by: Lai, Xunhao, et al.
Published: (2025)

Adaptive Memory Decay for Log-Linear Attention
by: Amin, Yaxita, et al.
Published: (2026)

Linear Attention is Enough in Spatial-Temporal Forecasting
by: Ning, Xinyu
Published: (2024)

Linear Attention for Efficient Bidirectional Sequence Modeling
by: Afzal, Arshia, et al.
Published: (2025)

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
by: Beck, Maximilian, et al.
Published: (2025)

Employee Turnover Prediction: A Cross-component Attention Transformer with Consideration of Competitor Influence and Contagious Effect
by: Liu, Hao, et al.
Published: (2025)

Beyond Linearity in Attention Projections: The Case for Nonlinear Queries
by: Karbevski, Marko
Published: (2026)

ZeroS: Zero-Sum Linear Attention for Efficient Transformers
by: Lu, Jiecheng, et al.
Published: (2026)

RACE Attention: A Strictly Linear-Time Attention Layer for Training on Outrageously Large Contexts
by: Joshi, Sahil, et al.
Published: (2025)

Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)

Learning under Quantization for High-Dimensional Linear Regression
by: Zhang, Dechen, et al.
Published: (2025)

E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory
by: Huang, Lin, et al.
Published: (2026)

Learning Advanced Self-Attention for Linear Transformers in the Singular Value Domain
by: Wi, Hyowon, et al.
Published: (2025)

MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
by: Chou, Yuhong, et al.
Published: (2024)

LNUCB-TA: Linear-nonlinear Hybrid Bandit Learning with Temporal Attention
by: Khosravi, Hamed, et al.
Published: (2025)

Scaling Laws for Precision in High-Dimensional Linear Regression
by: Zhang, Dechen, et al.
Published: (2026)

Efficient Linear Attention for Multivariate Time Series Modeling via Entropy Equality
by: Zhang, Mingtao, et al.
Published: (2025)

Linear Transformers as VAR Models: Aligning Autoregressive Attention Mechanisms with Autoregressive Forecasting
by: Lu, Jiecheng, et al.
Published: (2025)

Geometry-Aware Contrastive Learning for Few-Shot Automatic Modulation Recognition
by: Zhao, Guanqun, et al.
Published: (2026)

Attention-Aided MMSE for OFDM Channel Estimation: Learning Linear Filters with Attention
by: Ha, TaeJun, et al.
Published: (2025)

In-Context Learning in Linear vs. Quadratic Attention Models: An Empirical Study on Regression Tasks
by: Goel, Ayush, et al.
Published: (2026)

Global Attention with Linear Complexity for Exascale Generative Data Assimilation in Earth System Prediction
by: Wang, Xiao, et al.
Published: (2026)

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling
by: MiniCPM Team, et al.
Published: (2026)

FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
by: Gao, Yifei, et al.
Published: (2025)

Towards Theoretical Understanding of Transformer Test-Time Computing: Investigation on In-Context Linear Regression
by: Chen, Xingwu, et al.
Published: (2025)

Parallax: Parameterized Local Linear Attention for Language Modeling
by: Zuo, Yifei, et al.
Published: (2026)

Human-like Cognitive Generalization for Large Models via Brain-in-the-loop Supervision
by: Chen, Jiaxuan, et al.
Published: (2025)

Pretrained battery transformer (PBT): A foundation model for universal battery life prediction
by: Tan, Ruifeng, et al.
Published: (2025)