:: Library Catalog

Bibliographic Details
Main Authors:	Meng, Weikang; Huo, Liangyu; Luo, Yadan; Guan, Jiawen; Zhang, Jingyi; Li, Yingjian; Zhang, Zheng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.02180

Similar Items

MirrorLA: Reflecting Feature Map for Vision Linear Attention
by: Meng, Weikang, et al.
Published: (2026)

Norm×Direction: Restoring the Missing Query Norm in Vision Linear Attention
by: Meng, Weikang, et al.
Published: (2025)

PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)

IntraSlice: Towards High-Performance Structural Pruning with Block-Intra PCA for LLMs
by: Li, Meng, et al.
Published: (2026)

Neural Attention Search Linear: Towards Adaptive Token-Level Hybrid Attention Models
by: Deng, Difan, et al.
Published: (2026)

Alleviating Forgetfulness of Linear Attention by Hybrid Sparse Attention and Contextualized Learnable Token Eviction
by: He, Mutian, et al.
Published: (2025)

A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
by: Peng, Yang, et al.
Published: (2025)

Implicit Bias in Deep Linear Discriminant Analysis
by: Li, Jiawen
Published: (2026)

Recurrent Attention-based Token Selection for Efficient Streaming Video-LLMs
by: Dorovatas, Vaggelis, et al.
Published: (2025)

Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference
by: Qiu, Quantong, et al.
Published: (2026)

Benign Overfitting in Token Selection of Attention Mechanism
by: Sakamoto, Keitaro, et al.
Published: (2024)

ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection
by: Peng, Bo, et al.
Published: (2024)

Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
by: Wu, Ziyang, et al.
Published: (2024)

Mixture of Layers with Hybrid Attention
by: Ternovtsii, Ivan, et al.
Published: (2026)

Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
by: Wang, Zixin, et al.
Published: (2024)

State Rank Dynamics in Linear Attention LLMs
by: Sun, Ao, et al.
Published: (2026)

Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores
by: Badash, Zvi N., et al.
Published: (2026)

CAOTE: KV Cache Selection for LLMs via Attention Output Error-Based Token Eviction
by: Goel, Raghavv, et al.
Published: (2025)

QTALE: Quantization-Robust Token-Adaptive Layer Execution for LLMs
by: Noh, Kanghyun, et al.
Published: (2026)

High-Dimensional Analysis of Single-Layer Attention for Sparse-Token Classification
by: Barnfield, Nicholas, et al.
Published: (2025)

LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
by: Souibgui, Mohamed Ali, et al.
Published: (2026)

Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
by: Jo, Dongwon, et al.
Published: (2026)

Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation
by: Ding, Fei, et al.
Published: (2026)

Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces
by: Peng, Yang, et al.
Published: (2024)

Federated Reinforcement Learning with Constraint Heterogeneity
by: Jin, Hao, et al.
Published: (2024)

Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
by: Khosravi, Hamed, et al.
Published: (2026)

MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
by: Chen, Zhuoxiao, et al.
Published: (2024)

Geometric Analysis of Token Selection in Multi-Head Attention
by: Mudarisov, Timur, et al.
Published: (2026)

Attention with Trained Embeddings Provably Selects Important Tokens
by: Wu, Diyuan, et al.
Published: (2025)

Adaptive Time Series Reasoning via Segment Selection
by: Messica, Shvat, et al.
Published: (2026)

Kimi Linear: An Expressive, Efficient Attention Architecture
by: Kimi Team, et al.
Published: (2025)

Robust Hallucination Detection in LLMs via Adaptive Token Selection
by: Niu, Mengjia, et al.
Published: (2025)

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling
by: MiniCPM Team, et al.
Published: (2026)

Adaptive Layer Selection for Layer-Wise Token Pruning in LLM Inference
by: Taniguchi, Rei, et al.
Published: (2026)

TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)

CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
by: Yang, Huitong, et al.
Published: (2025)

Beyond Semantic Manipulation: Token-Space Attacks on Reward Models
by: Zhang, Yuheng, et al.
Published: (2026)

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
by: Guo, Tianyu, et al.
Published: (2024)

Modern Neuromorphic AI: From Intra-Token to Inter-Token Processing
by: Simeone, Osvaldo
Published: (2026)

RAM-Net: Expressive Linear Attention with Selectively Addressable Memory
by: Xiao, Kaicheng, et al.
Published: (2026)