:: Library Catalog

Bibliographic Details
Main Authors:	Meng, Weikang; Huo, Liangyu; Luo, Yadan; Wang, Yaowei; Li, Yingjian; Zhang, Zheng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.04346

Similar Items

Norm×Direction: Restoring the Missing Query Norm in Vision Linear Attention
by: Meng, Weikang, et al.
Published: (2025)

STILL: Selecting Tokens for Intra-Layer Hybrid Attention to Linearize LLMs
by: Meng, Weikang, et al.
Published: (2026)

PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)

MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
by: Chou, Yuhong, et al.
Published: (2024)

PLS in the Mirror of Self-Attention
by: Jiangsheng, et al.
Published: (2026)

A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
by: Peng, Yang, et al.
Published: (2025)

Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
by: Zhang, Zhizhen, et al.
Published: (2025)

Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction
by: Atandoh, Peter, et al.
Published: (2025)

Statistical Test for Attention Map in Vision Transformer
by: Shiraishi, Tomohiro, et al.
Published: (2024)

ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection
by: Peng, Bo, et al.
Published: (2024)

Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)

Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)

PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation
by: Zang, Zelin, et al.
Published: (2025)

A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features
by: Zeger, Emi, et al.
Published: (2024)

MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
by: Chen, Zhuoxiao, et al.
Published: (2024)

Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
by: Lee, Joongkyu, et al.
Published: (2026)

EfficientECG: Cross-Attention with Feature Fusion for Efficient Electrocardiogram Classification
by: Deng, Hanhui, et al.
Published: (2025)

Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics
by: Li, Liangyu, et al.
Published: (2026)

CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
by: Yang, Huitong, et al.
Published: (2025)

RoPE Attention Can Be Trained in Almost Linear Time
by: Cao, Yang, et al.
Published: (2024)

Emotion Collider: Dual Hyperbolic Mirror Manifolds for Sentiment Recovery via Anti Emotion Reflection
by: Fu, Rong, et al.
Published: (2026)

Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
by: Zuo, Yifei, et al.
Published: (2025)

Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces
by: Peng, Yang, et al.
Published: (2024)

Federated Reinforcement Learning with Constraint Heterogeneity
by: Jin, Hao, et al.
Published: (2024)

Kimi Linear: An Expressive, Efficient Attention Architecture
by: Kimi Team, et al.
Published: (2025)

Dense Feature Learning via Linear Structure Preservation in Medical Data
by: Zhang, Yuanyun, et al.
Published: (2026)

Class-Discriminative Attention Maps for Vision Transformers
by: Brocki, Lennart, et al.
Published: (2023)

Hallucination Detection in LLMs Using Spectral Features of Attention Maps
by: Binkowski, Jakub, et al.
Published: (2025)

Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)

DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing
by: Wang, Liangyu, et al.
Published: (2025)

Log-Linear Attention
by: Guo, Han, et al.
Published: (2025)

KoopGen: Koopman Generator Networks for Representing and Predicting Dynamical Systems with Continuous Spectra
by: Su, Liangyu, et al.
Published: (2026)

Adaptive Memory Decay for Log-Linear Attention
by: Amin, Yaxita, et al.
Published: (2026)

Linearized Diffusion Map
by: Candanedo, Julio
Published: (2025)

Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
by: Wang, Zixin, et al.
Published: (2024)

Iwin Transformer: Hierarchical Vision Transformer using Interleaved Windows
by: Huo, Simin, et al.
Published: (2025)

Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics
by: Colagrande, Alex, et al.
Published: (2025)

On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention
by: Ro, Yeonju, et al.
Published: (2025)

GNN-based Path-aware multi-view Circuit Learning for Technology Mapping
by: Jiang, Wentao, et al.
Published: (2026)

Implicit Bias of Mirror Flow in Homogeneous Neural Networks: Sparse and Dense Feature Learning
by: Jacobs, Tom, et al.
Published: (2026)