| Main Authors: | Ebrahimi, M. Reza, Memisevic, Roland |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.21749 |
Similar Items
On the "Induction Bias" in Sequence Models
by: Ebrahimi, M. Reza, et al.
Published: (2026)
Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
by: Khisti, Ashish, et al.
Published: (2024)
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers
by: Ebrahimi, MohammadReza, et al.
Published: (2024)
Delayed Attention Training Improves Length Generalization in Transformer--RNN Hybrids
by: Phan, Buu, et al.
Published: (2025)
Advancing Regular Language Reasoning in Linear Recurrent Neural Networks
by: Fan, Ting-Han, et al.
Published: (2023)
Optimal Decay Spectra for Linear Recurrences
by: Cao, Yang
Published: (2026)
On the Representational Capacity of Recurrent Neural Language Models
by: Nowak, Franz, et al.
Published: (2023)
Hybrid Quantum-Classical Recurrent Neural Networks
by: Xu, Wenduan
Published: (2025)
Automated SNOMED CT Concept Annotation in Clinical Text Using Bi-GRU Neural Networks
by: Noori, Ali, et al.
Published: (2025)
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
by: De, Soham, et al.
Published: (2024)
VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models
by: Hou, Haowen, et al.
Published: (2024)
Depression detection from Social Media Bangla Text Using Recurrent Neural Networks
by: Ahmed, Sultan, et al.
Published: (2024)
Dissecting Linear Recurrent Models: How Different Gating Strategies Drive Selectivity and Generalization
by: Bouhadjar, Younes, et al.
Published: (2026)
Look, Remember and Reason: Grounded reasoning in videos with language models
by: Bhattacharyya, Apratim, et al.
Published: (2023)
A Novel Recurrent Neural Network Framework for Prediction and Treatment of Oncogenic Mutation Progression
by: Parthasarathy, Rishab, et al.
Published: (2025)
Liger: Linearizing Large Language Models to Gated Recurrent Structures
by: Lan, Disen, et al.
Published: (2025)
Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence
by: Xiao, Liu
Published: (2026)
Rethinking State Tracking in Recurrent Models Through Error Control Dynamics
by: Chung, Jiwan, et al.
Published: (2026)
Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)
Exploring Major Transitions in the Evolution of Biological Cognition With Artificial Neural Networks
by: Voudouris, Konstantinos, et al.
Published: (2025)
State Stream Transformer (SST) V2: Parallel Training of Nonlinear Recurrence for Latent Space Reasoning
by: Aviss, Thea
Published: (2026)
On The Expressivity of Recurrent Neural Cascades
by: Knorozova, Nadezda Alexandrovna, et al.
Published: (2023)
From Out-of-Distribution Detection to Hallucination Detection: A Geometric View
by: Liu, Litian, et al.
Published: (2026)
Detection of Opioid Users from Reddit Posts via an Attention-based Bidirectional Recurrent Neural Network
by: Wang, Yuchen, et al.
Published: (2024)
Learning State-Tracking from Code Using Linear RNNs
by: Siems, Julien, et al.
Published: (2026)
Vector Quantized Latent Concepts: A Scalable Alternative to Clustering-Based Concept Discovery
by: Yu, Xuemin, et al.
Published: (2026)
BiHRNN -- Bi-Directional Hierarchical Recurrent Neural Network for Inflation Forecasting
by: Vilenko, Maya
Published: (2025)
Neural Isomorphic Fields: A Transformer-based Algebraic Numerical Embedding
by: Sadeghi, Hamidreza, et al.
Published: (2026)
Attention-Based Recurrent Neural Network For Automatic Behavior Laying Hen Recognition
by: Laleye, Fréjus A. A., et al.
Published: (2024)
Neural Attention Search Linear: Towards Adaptive Token-Level Hybrid Attention Models
by: Deng, Difan, et al.
Published: (2026)
Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection
by: Fujikawa, Shota, et al.
Published: (2026)
BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation
by: Qin, Peijia, et al.
Published: (2024)
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
by: Grazzi, Riccardo, et al.
Published: (2024)
REQUAL-LM: Reliability and Equity through Aggregation in Large Language Models
by: Ebrahimi, Sana, et al.
Published: (2024)
GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling
by: Katsch, Tobias
Published: (2023)
A New Method for Cross-Lingual-based Semantic Role Labeling
by: Ebrahimi, Mohammad, et al.
Published: (2024)
Replacing thinking with tool usage enables reasoning in small language models
by: Rainone, Corrado, et al.
Published: (2025)
In-context Learning and Gradient Descent Revisited
by: Deutch, Gilad, et al.
Published: (2023)
Structured Recurrent Mixers for Massively Parallelized Sequence Generation
by: Badger, Benjamin L.
Published: (2026)
Convolutional Neural Networks for Toxic Comment Classification
by: Georgakopoulos, Spiros V., et al.
Published: (2018)