| Main Authors: | Siems, Julien, Grazzi, Riccardo, Kalinin, Kirill, Ballani, Hitesh, Rahmani, Babak |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.14814 |
Similar Items
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
by: Siems, Julien, et al.
Published: (2025)
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
by: Grazzi, Riccardo, et al.
Published: (2024)
Implicit Language Models are RNNs: Balancing Parallelization and Expressivity
by: Schöne, Mark, et al.
Published: (2025)
Is Mamba Capable of In-Context Learning?
by: Grazzi, Riccardo, et al.
Published: (2024)
OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
by: Gadhikar, Advait, et al.
Published: (2025)
Why Are Linear RNNs More Parallelizable?
by: Merrill, William, et al.
Published: (2026)
Debugging code world models
by: Rahmani, Babak
Published: (2026)
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
by: Sun, Yu, et al.
Published: (2024)
TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting
by: Moroshan, Vladyslav, et al.
Published: (2025)
Uncovering the Computational Roles of Nonlinearity in Sequence Modeling Using Almost-Linear RNNs
by: Brenner, Manuel, et al.
Published: (2025)
On Efficiently Representing Regular Languages as RNNs
by: Svete, Anej, et al.
Published: (2024)
RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval
by: Wen, Kaiyue, et al.
Published: (2024)
Comba: Improving Bilinear RNNs with Closed-loop Control
by: Hu, Jiaxi, et al.
Published: (2025)
Does Transformer Interpretability Transfer to RNNs?
by: Paulo, Gonçalo, et al.
Published: (2024)
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
by: Yehudai, Gilad, et al.
Published: (2025)
An enhanced Teaching-Learning-Based Optimization (TLBO) with Grey Wolf Optimizer (GWO) for text feature selection and clustering
by: Azarshab, Mahsa, et al.
Published: (2024)
Diable: Efficient Dialogue State Tracking as Operations on Tables
by: Lesci, Pietro, et al.
Published: (2023)
Variable-Length Semantic IDs for Recommender Systems
by: Khrylchenko, Kirill
Published: (2026)
Enhancing Transformer RNNs with Multiple Temporal Perspectives
by: Dumitru, Razvan-Gabriel, et al.
Published: (2024)
Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)
Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
by: Zhang, Yifan, et al.
Published: (2025)
Structure-Guided Entity Resolution: Fine-Tuning LLMs for Robust Name Matching in Complex Linguistic Contexts
by: Chourasia, Shivam, et al.
Published: (2026)
HGRN2: Gated Linear RNNs with State Expansion
by: Qin, Zhen, et al.
Published: (2024)
A Lightweight Method to Disrupt Memorized Sequences in LLM
by: Prashant, Parjanya Prajakta, et al.
Published: (2025)
Rethinking State Tracking in Recurrent Models Through Error Control Dynamics
by: Chung, Jiwan, et al.
Published: (2026)
Revisiting Bi-Linear State Transitions in Recurrent Neural Networks
by: Ebrahimi, M. Reza, et al.
Published: (2025)
Clinical QA 2.0: Multi-Task Learning for Answer Extraction and Categorization
by: Pattnayak, Priyaranjan, et al.
Published: (2025)
DREaM: Drug-Drug Relation Extraction via Transfer Learning Method
by: Fata, Ali, et al.
Published: (2025)
(How) Do Language Models Track State?
by: Li, Belinda Z., et al.
Published: (2025)
Convergence Properties of Stochastic Hypergradients
by: Grazzi, Riccardo, et al.
Published: (2020)
Injecting linguistic knowledge into BERT for Dialogue State Tracking
by: Feng, Xiaohan, et al.
Published: (2023)
$K^4$: Online Log Anomaly Detection Via Unsupervised Typicality Learning
by: Chen, Weicong, et al.
Published: (2025)
MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization
by: Ou, Jiefu, et al.
Published: (2026)
Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
by: Ren, Yunwei, et al.
Published: (2024)
Toward a Flexible Framework for Linear Representation Hypothesis Using Maximum Likelihood Estimation
by: Nguyen, Trung, et al.
Published: (2025)
Teaching LLM to Reason: Reinforcement Learning from Algorithmic Problems without Code
by: Bao, Keqin, et al.
Published: (2025)
Skewed Memorization in Large Language Models: Quantification and Decomposition
by: Li, Hao, et al.
Published: (2025)
Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models
by: Jin, Bihui, et al.
Published: (2025)
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations
by: Wang, Ziyu, et al.
Published: (2024)
BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation
by: Kim, Eunsu, et al.
Published: (2025)