| Main Authors: | Trikha, Akshay, Chu, Kyle, Gosai, Advait, Szachta, Parker, Weiner, Eric |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.21811 |
Similar Items
The Data Efficiency Frontier of Financial Foundation Models: Scaling Laws from Continued Pretraining
by: Ponnock, Jesse
Published: (2025)
Fusing Rewards and Preferences in Reinforcement Learning
by: Khorasani, Sadegh, et al.
Published: (2025)
Versatile Ordering Network: An Attention-based Neural Network for Ordering Across Scales and Quality Metrics
by: Yu, Zehua, et al.
Published: (2024)
2Mamba2Furious: Linear in Complexity, Competitive in Accuracy
by: Mongaras, Gabriel, et al.
Published: (2026)
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
by: Alabdulmohsin, Ibrahim, et al.
Published: (2023)
Variance Is Not Importance: Structural Analysis of Transformer Compressibility Across Model Scales
by: Salfati, Samuel
Published: (2026)
Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws
by: Petrov, Egor, et al.
Published: (2025)
Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling
by: Gaonkar, Aradhya, et al.
Published: (2026)
Latent Instruction Representation Alignment: defending against jailbreaks, backdoors and undesired knowledge in LLMs
by: Easley, Eric, et al.
Published: (2026)
Self-Expanding Neural Networks
by: Mitchell, Rupert, et al.
Published: (2023)
CoxSE: Exploring the Potential of Self-Explaining Neural Networks with Cox Proportional Hazards Model for Survival Analysis
by: Alabdallah, Abdallah, et al.
Published: (2024)
Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search
by: Hedar, Abdel-Rahman, et al.
Published: (2024)
Discrete Latent Structure in Neural Networks
by: Niculae, Vlad, et al.
Published: (2023)
Understanding Boolean Function Learnability on Deep Neural Networks: PAC Learning Meets Neurosymbolic Models
by: Nicolau, Marcio, et al.
Published: (2020)
Implicit Regularization and Generalization in Overparameterized Neural Networks
by: Johannsen, Zeran
Published: (2026)
Uncertainty Quantification in Multivariable Regression for Material Property Prediction with Bayesian Neural Networks
by: Li, Longze, et al.
Published: (2023)
Pre-trained Models Perform the Best When Token Distributions Follow Zipf's Law
by: He, Yanjin, et al.
Published: (2025)
Strengthening the Internal Adversarial Robustness in Lifted Neural Networks
by: Zach, Christopher
Published: (2025)
Sparse Concept Anchoring for Interpretable and Controllable Neural Representations
by: Fraser, Sandy, et al.
Published: (2025)
The Bayesian Confidence (BACON) Estimator for Deep Neural Networks
by: Kee, Patrick D., et al.
Published: (2024)
Scaling Laws in the Tiny Regime: How Small Models Change Their Mistakes
by: Alnemari, Mohammed, et al.
Published: (2026)
Multiple Token Divergence: Measuring and Steering In-Context Computation Density
by: Herrmann, Vincent, et al.
Published: (2025)
Neural Reasoning Networks: Efficient Interpretable Neural Networks With Automatic Textual Explanations
by: Carrow, Stephen, et al.
Published: (2024)
Learning Useful Representations of Recurrent Neural Network Weight Matrices
by: Herrmann, Vincent, et al.
Published: (2024)
Expressivity of Graph Neural Networks Through the Lens of Adversarial Robustness
by: Campi, Francesco, et al.
Published: (2023)
Graph Neural Network Based Action Ranking for Planning
by: Mangannavar, Rajesh, et al.
Published: (2024)
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
by: Gray, Gavia, et al.
Published: (2024)
AI and Machine Learning Approaches for Predicting Nanoparticles Toxicity The Critical Role of Physiochemical Properties
by: Yousaf, Iqra
Published: (2024)
ReBoot: Encrypted Training of Deep Neural Networks with CKKS Bootstrapping
by: Pirillo, Alberto, et al.
Published: (2025)
Playing Hex and Counter Wargames using Reinforcement Learning and Recurrent Neural Networks
by: Palma, Guilherme, et al.
Published: (2025)
ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference
by: Das, Sourav
Published: (2026)
Exploring Neural Granger Causality with xLSTMs: Unveiling Temporal Dependencies in Complex Data
by: Poonia, Harsh, et al.
Published: (2025)
FluidWorld: Reaction-Diffusion Dynamics as a Predictive Substrate for World Models
by: Polly, Fabien
Published: (2026)
Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization
by: Yao, Tianjun, et al.
Published: (2025)
SPACeR: Self-Play Anchoring with Centralized Reference Models
by: Chang, Wei-Jer, et al.
Published: (2025)
Newtonian and Lagrangian Neural Networks: A Comparison Towards Efficient Inverse Dynamics Identification
by: Trinh, Minh, et al.
Published: (2025)
The Geometry of Thought: How Scale Restructures Reasoning In Large Language Models
by: Anderson, Samuel Cyrenius
Published: (2026)
Scaling Trends for Multi-Hop Contextual Reasoning in Mid-Scale Language Models
by: Steele, Brady, et al.
Published: (2026)
Prompting Neural-Guided Equation Discovery Based on Residuals
by: Brugger, Jannis, et al.
Published: (2025)
Entropy Aware Message Passing in Graph Neural Networks
by: Nazari, Philipp, et al.
Published: (2024)