:: Library Catalog

Bibliographic Details
Main Authors:	Ran-Milo, Yuval; Alexander, Yotam; Mendel, Shahar; Cohen, Nadav
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.15158

Similar Items

Provable Benefits of Complex Parameterizations for Structured State Space Models
by: Ran-Milo, Yuval, et al.
Published: (2024)

Do Neural Networks Need Gradient Descent to Generalize? A Theoretical Study
by: Alexander, Yotam, et al.
Published: (2025)

A Mechanistic Account of Attention Sinks in GPT-2: One Circuit, Broader Implications for Mitigation
by: Ran-Milo, Yuval, et al.
Published: (2026)

What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement
by: Alexander, Yotam, et al.
Published: (2023)

Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks
by: Ran-Milo, Yuval
Published: (2026)

Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
by: Razin, Noam, et al.
Published: (2024)

Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
by: Cohen, Nadav, et al.
Published: (2024)

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation
by: Kim, Juno, et al.
Published: (2025)

RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs
by: Samineni, Soumya Rani, et al.
Published: (2025)

Inertial Navigation Meets Deep Learning: A Survey of Current Trends and Future Directions
by: Cohen, Nadav, et al.
Published: (2023)

Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
by: Huang, Yu, et al.
Published: (2025)

A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula
by: Sancaktar, Cansu, et al.
Published: (2026)

Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL
by: Liu, Xiangyu, et al.
Published: (2023)

Improving Transformer World Models for Data-Efficient RL
by: Dedieu, Antoine, et al.
Published: (2025)

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
by: Zhou, Jin Peng, et al.
Published: (2025)

Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only
by: Xiao, Wei, et al.
Published: (2025)

You Need Reasoning to Learn Reasoning: The Limitations of Label-Free RL in Weak Base Models
by: Roy, Shuvendu, et al.
Published: (2025)

Steering LLM Reasoning Through Bias-Only Adaptation
by: Sinii, Viacheslav, et al.
Published: (2025)

Boundary on the Table: Efficient Black-Box Decision-Based Attacks for Structured Data
by: Kazoom, Roie, et al.
Published: (2025)

Token-Efficient RL for LLM Reasoning
by: Lee, Alan, et al.
Published: (2025)

RL for Reasoning by Adaptively Revealing Rationales
by: Amani, Mohammad Hossein, et al.
Published: (2025)

Unsupervised Representation Learning - an Invariant Risk Minimization Perspective
by: Norman, Yotam, et al.
Published: (2025)

The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels
by: Slutzky, Yonatan, et al.
Published: (2024)

Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)

Φ-Noise: Training-Free Temporal Video Conditioning via Phase-Based Noise Manipulation
by: Abramovich, Ofir, et al.
Published: (2026)

Mathematical Models of Computation in Superposition
by: Hänni, Kaarel, et al.
Published: (2024)

Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data
by: Ye, Qilin, et al.
Published: (2025)

On the Provable Performance Guarantee of Efficient Reasoning Models
by: Zeng, Hao, et al.
Published: (2025)

Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning
by: Liu, Ziwen, et al.
Published: (2026)

Human Activity Recognition Based on Electrocardiogram Data Only
by: Montazeri, Sina, et al.
Published: (2025)

Provable Training Data Identification for Large Language Models
by: Liu, Zhenlong, et al.
Published: (2025)

Fourier Sliced-Wasserstein Embedding for Multisets and Measures
by: Amir, Tal, et al.
Published: (2025)

On the (Non) Injectivity of Piecewise Linear Janossy Pooling
by: Reshef, Ilai, et al.
Published: (2025)

On the Expressive Power of Sparse Geometric MPNNs
by: Sverdlov, Yonatan, et al.
Published: (2024)

Provably Overwhelming Transformer Models with Designed Inputs
by: Stambler, Lev, et al.
Published: (2025)

Revisiting Glorot Initialization for Long-Range Linear Recurrences
by: Bar, Noga, et al.
Published: (2025)

A Hierarchical Language Model with Predictable Scaling Laws and Provable Benefits of Reasoning
by: Gaitonde, Jason, et al.
Published: (2026)

Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
by: Goldie, Anna, et al.
Published: (2025)

Accelerating RL for LLM Reasoning with Optimal Advantage Regression
by: Brantley, Kianté, et al.
Published: (2025)

VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction
by: Cai, Xin-Qiang, et al.
Published: (2026)