:: Library Catalog

Bibliographic Details
Main Authors:	Siems, Julien; Grazzi, Riccardo; Kalinin, Kirill; Ballani, Hitesh; Rahmani, Babak
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2602.14814

Similar Items

DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
by: Siems, Julien, et al.
Published: (2025)

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
by: Grazzi, Riccardo, et al.
Published: (2024)

Implicit Language Models are RNNs: Balancing Parallelization and Expressivity
by: Schöne, Mark, et al.
Published: (2025)

Is Mamba Capable of In-Context Learning?
by: Grazzi, Riccardo, et al.
Published: (2024)

OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
by: Gadhikar, Advait, et al.
Published: (2025)

Why Are Linear RNNs More Parallelizable?
by: Merrill, William, et al.
Published: (2026)

Debugging code world models
by: Rahmani, Babak
Published: (2026)

Learning to (Learn at Test Time): RNNs with Expressive Hidden States
by: Sun, Yu, et al.
Published: (2024)

TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting
by: Moroshan, Vladyslav, et al.
Published: (2025)

Uncovering the Computational Roles of Nonlinearity in Sequence Modeling Using Almost-Linear RNNs
by: Brenner, Manuel, et al.
Published: (2025)

On Efficiently Representing Regular Languages as RNNs
by: Svete, Anej, et al.
Published: (2024)

RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval
by: Wen, Kaiyue, et al.
Published: (2024)

Comba: Improving Bilinear RNNs with Closed-loop Control
by: Hu, Jiaxi, et al.
Published: (2025)

Does Transformer Interpretability Transfer to RNNs?
by: Paulo, Gonçalo, et al.
Published: (2024)

Compositional Reasoning with Transformers, RNNs, and Chain of Thought
by: Yehudai, Gilad, et al.
Published: (2025)

An enhanced Teaching-Learning-Based Optimization (TLBO) with Grey Wolf Optimizer (GWO) for text feature selection and clustering
by: Azarshab, Mahsa, et al.
Published: (2024)

Diable: Efficient Dialogue State Tracking as Operations on Tables
by: Lesci, Pietro, et al.
Published: (2023)

Variable-Length Semantic IDs for Recommender Systems
by: Khrylchenko, Kirill
Published: (2026)

Enhancing Transformer RNNs with Multiple Temporal Perspectives
by: Dumitru, Razvan-Gabriel, et al.
Published: (2024)

Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
by: Zhang, Yifan, et al.
Published: (2025)

Structure-Guided Entity Resolution: Fine-Tuning LLMs for Robust Name Matching in Complex Linguistic Contexts
by: Chourasia, Shivam, et al.
Published: (2026)

HGRN2: Gated Linear RNNs with State Expansion
by: Qin, Zhen, et al.
Published: (2024)

A Lightweight Method to Disrupt Memorized Sequences in LLM
by: Prashant, Parjanya Prajakta, et al.
Published: (2025)

Rethinking State Tracking in Recurrent Models Through Error Control Dynamics
by: Chung, Jiwan, et al.
Published: (2026)

Revisiting Bi-Linear State Transitions in Recurrent Neural Networks
by: Ebrahimi, M. Reza, et al.
Published: (2025)

Clinical QA 2.0: Multi-Task Learning for Answer Extraction and Categorization
by: Pattnayak, Priyaranjan, et al.
Published: (2025)

DREaM: Drug-Drug Relation Extraction via Transfer Learning Method
by: Fata, Ali, et al.
Published: (2025)

(How) Do Language Models Track State?
by: Li, Belinda Z., et al.
Published: (2025)

Convergence Properties of Stochastic Hypergradients
by: Grazzi, Riccardo, et al.
Published: (2020)

Injecting linguistic knowledge into BERT for Dialogue State Tracking
by: Feng, Xiaohan, et al.
Published: (2023)

$K^4$: Online Log Anomaly Detection Via Unsupervised Typicality Learning
by: Chen, Weicong, et al.
Published: (2025)

MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization
by: Ou, Jiefu, et al.
Published: (2026)

Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
by: Ren, Yunwei, et al.
Published: (2024)

Toward a Flexible Framework for Linear Representation Hypothesis Using Maximum Likelihood Estimation
by: Nguyen, Trung, et al.
Published: (2025)

Teaching LLM to Reason: Reinforcement Learning from Algorithmic Problems without Code
by: Bao, Keqin, et al.
Published: (2025)

Skewed Memorization in Large Language Models: Quantification and Decomposition
by: Li, Hao, et al.
Published: (2025)

Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models
by: Jin, Bihui, et al.
Published: (2025)

HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations
by: Wang, Ziyu, et al.
Published: (2024)

BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation
by: Kim, Eunsu, et al.
Published: (2025)