| Main Author: | Chenebaux, Maixent |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.24809 |
Similar Items
MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning
by: Liu, Dong, et al.
Published: (2026)
When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models
by: Zhang, Nan, et al.
Published: (2025)
Pay Attention to Small Weights
by: Zhou, Chao, et al.
Published: (2025)
Paged Attention Meets FlexAttention: Unlocking Long-Context Efficiency in Deployed Inference
by: Joshi, Thomas, et al.
Published: (2025)
Probing the Limits of Compressive Memory: A Study of Infini-Attention in Small-Scale Pretraining
by: Huang, Ruizhe, et al.
Published: (2025)
Predicting LLM Reasoning Performance with Small Proxy Model
by: Koh, Woosung, et al.
Published: (2025)
Large Language Models Meet Graph Neural Networks for Text-Numeric Graph Reasoning
by: Song, Haoran, et al.
Published: (2025)
Enhancing Reasoning with Collaboration and Memory
by: Michelman, Julie, et al.
Published: (2025)
Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)
Adaptive Memory Decay for Log-Linear Attention
by: Amin, Yaxita, et al.
Published: (2026)
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
by: Gao, Yizhao, et al.
Published: (2025)
Interpretable Concept-Based Memory Reasoning
by: Debot, David, et al.
Published: (2024)
RaaS: Reasoning-Aware Attention Sparsity for Efficient LLM Reasoning
by: Hu, Junhao, et al.
Published: (2025)
Echo State Transformer: Attention Over Finite Memories
by: Bendi-Ouis, Yannis, et al.
Published: (2025)
Towards Reasoning Ability of Small Language Models
by: Srivastava, Gaurav, et al.
Published: (2025)
Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search
by: Han, Dongge, et al.
Published: (2025)
A Comprehensive Benchmark on Spectral GNNs: The Impact on Efficiency, Memory, and Effectiveness
by: Liao, Ningyi, et al.
Published: (2024)
Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting
by: Kang, Bong Gyun, et al.
Published: (2024)
Attend or Perish: Benchmarking Attention in Algorithmic Reasoning
by: Spiegel, Michal, et al.
Published: (2025)
LOOKAT: Lookup-Optimized Key-Attention for Memory-Efficient Transformers
by: Karmore, Aryan
Published: (2026)
QFlash: Bridging Quantization and Memory Efficiency in Vision Transformer Attention
by: Oh, Sehyeon, et al.
Published: (2026)
Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis
by: Fartale, Harshwardhan, et al.
Published: (2025)
Rank-Aware Spectral Bounds on Attention Logits for Stable Low-Precision Training
by: Emadi, Seyed Morteza
Published: (2026)
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
by: Qu, Yun, et al.
Published: (2026)
Diffusion Models Meet Contextual Bandits
by: Aouali, Imad
Published: (2024)
RAST: Reasoning Activation in LLMs via Small-model Transfer
by: Ouyang, Siru, et al.
Published: (2025)
AdaTKG: Adaptive Memory for Temporal Knowledge Graph Reasoning
by: Lee, Seunghan, et al.
Published: (2026)
Scaling Reasoning without Attention
by: Zhao, Xueliang, et al.
Published: (2025)
AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation
by: Deiseroth, Björn, et al.
Published: (2023)
DiffCLIP: Differential Attention Meets CLIP
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2025)
Attention as Binding: A Vector-Symbolic Perspective on Transformer Reasoning
by: Dhayalkar, Sahil Rajesh
Published: (2025)
FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning
by: Luo, Haozheng, et al.
Published: (2026)
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning
by: Choi, Gawon, et al.
Published: (2024)
PeSANet: Physics-encoded Spectral Attention Network for Simulating PDE-Governed Complex Systems
by: Wan, Han, et al.
Published: (2025)
E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory
by: Huang, Lin, et al.
Published: (2026)
Identity Bridge: Enabling Implicit Reasoning via Shared Latent Memory
by: Lin, Pengxiao, et al.
Published: (2025)
Direct Reasoning Optimization: Token-Level Reasoning Reflectivity Meets Rubric Gates for Unverifiable Tasks
by: Xu, Yifei, et al.
Published: (2025)
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
by: You, Haoran, et al.
Published: (2024)
Improving Chain-of-Thought for Logical Reasoning via Attention-Aware Intervention
by: Phuong, Nguyen Minh, et al.
Published: (2026)
Interpretable Hierarchical Concept Reasoning through Attention-Guided Graph Learning
by: Debot, David, et al.
Published: (2025)