| Main Author: | Dentamaro, Vincenzo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.08637 |
Similar Items
Architecture-Agnostic Curriculum Learning for Document Understanding: Empirical Evidence from Text-Only and Multimodal
by: Hamdan, Mohammed, et al.
Published: (2026)
Star Attention: Efficient LLM Inference over Long Sequences
by: Acharya, Shantanu, et al.
Published: (2024)
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid
by: Sun, Weigao, et al.
Published: (2025)
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
by: Lee, Changhun, et al.
Published: (2025)
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
by: Goldstein, Daniel, et al.
Published: (2025)
Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)
Training Tensor Attention Efficiently: From Cubic to Almost Linear Time
by: Cao, Yang, et al.
Published: (2024)
Learning Linear Attention in Polynomial Time
by: Yau, Morris, et al.
Published: (2024)
Native Hybrid Attention for Efficient Sequence Modeling
by: Du, Jusen, et al.
Published: (2025)
MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling
by: MiniCPM Team, et al.
Published: (2026)
RoPE Attention Can Be Trained in Almost Linear Time
by: Cao, Yang, et al.
Published: (2024)
Scaling Reasoning without Attention
by: Zhao, Xueliang, et al.
Published: (2025)
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts
by: Chen, Yingfa, et al.
Published: (2026)
Parallax: Parameterized Local Linear Attention for Language Modeling
by: Zuo, Yifei, et al.
Published: (2026)
Hallucination Detection in LLMs Using Spectral Features of Attention Maps
by: Binkowski, Jakub, et al.
Published: (2025)
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
by: Zhu, Qianchao, et al.
Published: (2024)
HSR-Enhanced Sparse Attention Acceleration
by: Chen, Bo, et al.
Published: (2024)
Untangling Component Imbalance in Hybrid Linear Attention Conversion Methods
by: Benfeghoul, Martin, et al.
Published: (2025)
Scaling Bidirectional Spans and Span Violations in Attention Mechanism
by: Kim, Jongwook, et al.
Published: (2025)
Enhancing Rare Codes via Probability-Biased Directed Graph Attention for Long-Tail ICD Coding
by: Chen, Tianlei, et al.
Published: (2025)
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
by: Liang, Yingyu, et al.
Published: (2024)
Cost-Optimal Grouped-Query Attention for Long-Context Modeling
by: Chen, Yingfa, et al.
Published: (2025)
MoBA: Mixture of Block Attention for Long-Context LLMs
by: Lu, Enzhe, et al.
Published: (2025)
HiCI: Hierarchical Construction-Integration for Long-Context Attention
by: Zeng, Xiangyu, et al.
Published: (2026)
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)
Aligning Human and Machine Attention for Enhanced Supervised Learning
by: Chriqui, Avihay, et al.
Published: (2025)
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
by: Ling Team, et al.
Published: (2025)
How Sparse Attention Approximates Exact Attention? Your Attention is Naturally $n^C$-Sparse
by: Deng, Yichuan, et al.
Published: (2024)
DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention
by: Huang, Yuxiang, et al.
Published: (2026)
MUPAX: Multidimensional Problem Agnostic eXplainable AI
by: Dentamaro, Vincenzo, et al.
Published: (2025)
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
by: You, Haoran, et al.
Published: (2024)
Evaluating Very Long-Term Conversational Memory of LLM Agents
by: Maharana, Adyasha, et al.
Published: (2024)
Attention Needs to Focus: A Unified Perspective on Attention Allocation
by: Fu, Zichuan, et al.
Published: (2026)
Efficiently Dispatching Flash Attention For Partially Filled Attention Masks
by: Sharma, Agniv, et al.
Published: (2024)
Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs
by: Zhu, Kan, et al.
Published: (2025)
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization
by: Hsieh, Cheng-Yu, et al.
Published: (2024)
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
by: Yuan, Jingyang, et al.
Published: (2025)
Eigen Attention: Attention in Low-Rank Space for KV Cache Compression
by: Saxena, Utkarsh, et al.
Published: (2024)
Depth-Recurrent Attention Mixtures: Giving Latent Reasoning the Attention it Deserves
by: Knupp, Jonas, et al.
Published: (2026)
Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
by: Waldendorf, Jonas, et al.
Published: (2026)