| Main Authors: | Meng, Weikang, Huo, Liangyu, Luo, Yadan, Wang, Yaowei, Li, Yingjian, Zhang, Zheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.04346 |
Similar Items
Norm$\times$Direction: Restoring the Missing Query Norm in Vision Linear Attention
by: Meng, Weikang, et al.
Published: (2025)
STILL: Selecting Tokens for Intra-Layer Hybrid Attention to Linearize LLMs
by: Meng, Weikang, et al.
Published: (2026)
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
by: Chou, Yuhong, et al.
Published: (2024)
PLS in the Mirror of Self-Attention
by: Jiangsheng, et al.
Published: (2026)
A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
by: Peng, Yang, et al.
Published: (2025)
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
by: Zhang, Zhizhen, et al.
Published: (2025)
Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction
by: Atandoh, Peter, et al.
Published: (2025)
Statistical Test for Attention Map in Vision Transformer
by: Shiraishi, Tomohiro, et al.
Published: (2024)
ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection
by: Peng, Bo, et al.
Published: (2024)
Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)
Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)
PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation
by: Zang, Zelin, et al.
Published: (2025)
A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features
by: Zeger, Emi, et al.
Published: (2024)
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
by: Chen, Zhuoxiao, et al.
Published: (2024)
Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
by: Lee, Joongkyu, et al.
Published: (2026)
EfficientECG: Cross-Attention with Feature Fusion for Efficient Electrocardiogram Classification
by: Deng, Hanhui, et al.
Published: (2025)
Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics
by: Li, Liangyu, et al.
Published: (2026)
CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
by: Yang, Huitong, et al.
Published: (2025)
RoPE Attention Can Be Trained in Almost Linear Time
by: Cao, Yang, et al.
Published: (2024)
Emotion Collider: Dual Hyperbolic Mirror Manifolds for Sentiment Recovery via Anti Emotion Reflection
by: Fu, Rong, et al.
Published: (2026)
Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
by: Zuo, Yifei, et al.
Published: (2025)
Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces
by: Peng, Yang, et al.
Published: (2024)
Federated Reinforcement Learning with Constraint Heterogeneity
by: Jin, Hao, et al.
Published: (2024)
Kimi Linear: An Expressive, Efficient Attention Architecture
by: Kimi Team, et al.
Published: (2025)
Dense Feature Learning via Linear Structure Preservation in Medical Data
by: Zhang, Yuanyun, et al.
Published: (2026)
Class-Discriminative Attention Maps for Vision Transformers
by: Brocki, Lennart, et al.
Published: (2023)
Hallucination Detection in LLMs Using Spectral Features of Attention Maps
by: Binkowski, Jakub, et al.
Published: (2025)
Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)
DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing
by: Wang, Liangyu, et al.
Published: (2025)
Log-Linear Attention
by: Guo, Han, et al.
Published: (2025)
KoopGen: Koopman Generator Networks for Representing and Predicting Dynamical Systems with Continuous Spectra
by: Su, Liangyu, et al.
Published: (2026)
Adaptive Memory Decay for Log-Linear Attention
by: Amin, Yaxita, et al.
Published: (2026)
Linearized Diffusion Map
by: Candanedo, Julio
Published: (2025)
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
by: Wang, Zixin, et al.
Published: (2024)
Iwin Transformer: Hierarchical Vision Transformer using Interleaved Windows
by: Huo, Simin, et al.
Published: (2025)
Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics
by: Colagrande, Alex, et al.
Published: (2025)
On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention
by: Ro, Yeonju, et al.
Published: (2025)
GNN-based Path-aware multi-view Circuit Learning for Technology Mapping
by: Jiang, Wentao, et al.
Published: (2026)
Implicit Bias of Mirror Flow in Homogeneous Neural Networks: Sparse and Dense Feature Learning
by: Jacobs, Tom, et al.
Published: (2026)