| Main Authors: | Sieber, Jerome, Orvieto, Antonio, Zeilinger, Melanie N., Alonso, Carmen Amo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.09389 |
Similar Items
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
by: Sieber, Jerome, et al.
Published: (2024)
State Space Models as Foundation Models: A Control Theoretic Overview
by: Alonso, Carmen Amo, et al.
Published: (2024)
Task-Level Insights from Eigenvalues across Sequence Models
by: Rickenbach, Rahel, et al.
Published: (2025)
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
by: Joseph, Federico Arangath, et al.
Published: (2024)
Recurrent neural networks: vanishing and exploding gradients are not the end of the story
by: Zucchet, Nicolas, et al.
Published: (2024)
Can you Finetune your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models
by: Elhassan, Fay, et al.
Published: (2025)
(Almost) Free Modality Stitching of Foundation Models
by: Singh, Jaisidh, et al.
Published: (2025)
GenCtrl -- A Formal Controllability Toolkit for Generative Models
by: Cheng, Emily, et al.
Published: (2026)
How does the optimizer implicitly bias the model merging loss landscape?
by: Zhang, Chenxiang, et al.
Published: (2025)
Sampling-Based Safe Reinforcement Learning
by: Vignola, Luca, et al.
Published: (2026)
Language Models for Controllable DNA Sequence Design
by: Su, Xingyu, et al.
Published: (2025)
Generalized Interpolating Discrete Diffusion
by: von Rütte, Dimitri, et al.
Published: (2025)
GRASP: Deterministic argument ranking in interaction graphs
by: Misra, Diganta, et al.
Published: (2026)
PAINET: A Principled Efficient Transformer for 3D Dynamics Modeling
by: Yang, Kai, et al.
Published: (2025)
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
by: Wang, George, et al.
Published: (2024)
From Coefficients to Directions: Rethinking Model Merging with Directional Alignment
by: Chen, Zhikang, et al.
Published: (2025)
Reinforcement Learning for Sequence Design Leveraging Protein Language Models
by: Subramanian, Jithendaraa, et al.
Published: (2024)
Context-Former: Stitching via Latent Conditioned Sequence Modeling
by: Zhang, Ziqi, et al.
Published: (2024)
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
by: Gan, Wangjie, et al.
Published: (2026)
ProtFlow: Fast Protein Sequence Design via Flow Matching on Compressed Protein Language Model Embeddings
by: Kong, Zitai, et al.
Published: (2025)
Property-Isometric Variational Autoencoders for Sequence Modeling and Design
by: Sadeghi, Elham, et al.
Published: (2025)
Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs
by: Bikić, Antonio, et al.
Published: (2024)
Design Principles for Falsifiable, Replicable and Reproducible Empirical ML Research
by: Vranješ, Daniel, et al.
Published: (2024)
Verification of the Implicit World Model in a Generative Model via Adversarial Sequences
by: Balogh, András, et al.
Published: (2026)
Coefficient Decomposition for Spectral Graph Convolution
by: Huang, Feng, et al.
Published: (2024)
Improving Protein Sequence Design through Designability Preference Optimization
by: Xue, Fanglei, et al.
Published: (2025)
Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling
by: Xu, Jiawei, et al.
Published: (2024)
An Uncertainty Principle for Linear Recurrent Neural Networks
by: François, Alexandre, et al.
Published: (2025)
Neural Networks: According to the Principles of Grassmann Algebra
by: Zarezadeh, Z., et al.
Published: (2025)
Interpretable DNA Sequence Classification via Dynamic Feature Generation in Decision Trees
by: Huynh, Nicolas, et al.
Published: (2026)
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking
by: Cundy, Chris, et al.
Published: (2023)
Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models
by: Cheng, Hailing, et al.
Published: (2026)
The Lifecycle Principle: Stabilizing Dynamic Neural Networks with State Memory
by: Yang, Zichuan
Published: (2025)
Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective
by: Ou, Jingyang, et al.
Published: (2025)
The Principles of Diffusion Models
by: Lai, Chieh-Hsin, et al.
Published: (2025)
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces
by: Ota, Toshihiro
Published: (2024)
Principle-Evolvable Scientific Discovery via Uncertainty Minimization
by: Pu, Yingming, et al.
Published: (2026)
MolChord: Structure-Sequence Alignment for Protein-Guided Drug Design
by: Zhang, Wei, et al.
Published: (2025)
DeepACTIF: Efficient Feature Attribution via Activation Traces in Neural Sequence Models
by: Hosp, Benedikt W.
Published: (2025)
Generalized Learning of Coefficients in Spectral Graph Convolutional Networks
by: Coşkun, Mustafa, et al.
Published: (2024)