| Main Authors: | Sunkaraneni, Varun, Beneventano, Pierfrancesco, Neumarker, Riccardo, Poggio, Tomaso, Galanti, Tomer |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.14163 |
Similar Items
pAI/MSc: ML Theory Research with Humans on the Loop
by: Abdelmoneum, Mahmoud, et al.
Published: (2026)
Tool Building as a Path to "Superintelligence"
by: Koplow, David, et al.
Published: (2026)
Distribution-Aware Algorithm Design with LLM Agents
by: Koganti, Saharsh, et al.
Published: (2026)
The Generalized Turing Test: A Foundation for Comparing Intelligence
by: Mitropolsky, Daniel, et al.
Published: (2026)
How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
by: Beneventano, Pierfrancesco, et al.
Published: (2024)
Does Weight Decay Enhance Training Stability?
by: Saether, Marius, et al.
Published: (2026)
Too Sharp, Too Sure: When Calibration Follows Curvature
by: Morosini, Alessandro, et al.
Published: (2026)
Hierarchical Reasoning Models: Perspectives and Misconceptions
by: Ge, Renee, et al.
Published: (2025)
On the Power of Decision Trees in Auto-Regressive Language Modeling
by: Gan, Yulu, et al.
Published: (2024)
Momentum Further Constrains Sharpness at the Edge of Stochastic Stability
by: Andreyev, Arseniy, et al.
Published: (2026)
Formation of Representations in Neural Networks
by: Ziyin, Liu, et al.
Published: (2024)
SGD and Weight Decay Secretly Minimize the Rank of Your Neural Network
by: Galanti, Tomer, et al.
Published: (2022)
Same Error, Different Function: The Optimizer as an Implicit Prior in Financial Time Series
by: Cortesi, Federico Vittorio, et al.
Published: (2026)
Do Deep Networks Forget Initialization? A Forgetting-Time View of Practical Inductive Bias
by: Das, Mohua, et al.
Published: (2026)
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
by: Li, Gang, et al.
Published: (2025)
The Fair Language Model Paradox
by: Pinto, Andrea, et al.
Published: (2024)
Retrieval Is Not Enough: Why Organizational AI Needs Epistemic Infrastructure
by: Bottino, Federico, et al.
Published: (2026)
Directional Neural Collapse Explains Few-Shot Transfer in Self-Supervised Learning
by: Luthra, Achleshwar, et al.
Published: (2026)
On the Trajectories of SGD Without Replacement
by: Beneventano, Pierfrancesco
Published: (2023)
Parameter Symmetry Potentially Unifies Deep Learning Theory
by: Ziyin, Liu, et al.
Published: (2025)
Position: A Theory of Deep Learning Must Include Compositional Sparsity
by: Danhofer, David A., et al.
Published: (2025)
Probing Neural Topology of Large Language Models
by: Zheng, Yu, et al.
Published: (2025)
Intelligence Analysis of Language Models
by: Galanti, Liane, et al.
Published: (2024)
AgentFixer: From Failure Detection to Fix Recommendations in LLM Agentic Systems
by: Mulian, Hadar, et al.
Published: (2026)
TRAIL: Trace Reasoning and Agentic Issue Localization
by: Deshpande, Darshan, et al.
Published: (2025)
Edge of Stochastic Stability: Revisiting the Edge of Stability for SGD
by: Andreyev, Arseniy, et al.
Published: (2024)
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks
by: Beneventano, Pierfrancesco, et al.
Published: (2025)
Training the Untrainable: Introducing Inductive Bias via Representational Alignment
by: Subramaniam, Vighnesh, et al.
Published: (2024)
SemanticALLI: Caching Reasoning, Not Just Responses, in Agentic Systems
by: Chillara, Varun, et al.
Published: (2026)
On efficiently computable functions, deep networks and sparse compositionality
by: Poggio, Tomaso
Published: (2025)
The Seeds of Scheming: Weakness of Will in the Building Blocks of Agentic Systems
by: Yang, Robert
Published: (2025)
Agentic Risk-Aware Set-Based Engineering Design
by: Kumar, Varun, et al.
Published: (2026)
Weakly Supervised Text-to-SQL Parsing through Question Decomposition
by: Wolfson, Tomer, et al.
Published: (2021)
BoostTaxo: Zero-Shot Taxonomy Induction via Boosting-Style Agentic Reasoning and Constraint-Aware Calibration
by: Ling, Yancheng, et al.
Published: (2026)
A Survey of Reasoning and Agentic Systems in Time Series with Large Language Models
by: Chang, Ching, et al.
Published: (2025)
Can Reasoning Models Reason about Hardware? An Agentic HLS Perspective
by: Collini, Luca, et al.
Published: (2025)
CaseEdit: Enhancing Localized Commonsense Reasoning via Null-Space Constrained Knowledge Editing in Small Parameter Language Models
by: Reddy, Varun, et al.
Published: (2025)
Game Networks
by: La Mura, Pierfrancesco
Published: (2013)
Agentic Reasoning for Large Language Models
by: Wei, Tianxin, et al.
Published: (2026)
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
by: Timor, Nadav, et al.
Published: (2024)