:: Library Catalog

Bibliographic Details
Main Authors:	Trikha, Akshay; Chu, Kyle; Gosai, Advait; Szachta, Parker; Weiner, Eric
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; I.2.6
Online Access:	https://arxiv.org/abs/2509.21811

Similar Items

The Data Efficiency Frontier of Financial Foundation Models: Scaling Laws from Continued Pretraining
by: Ponnock, Jesse
Published: (2025)

Fusing Rewards and Preferences in Reinforcement Learning
by: Khorasani, Sadegh, et al.
Published: (2025)

Versatile Ordering Network: An Attention-based Neural Network for Ordering Across Scales and Quality Metrics
by: Yu, Zehua, et al.
Published: (2024)

2Mamba2Furious: Linear in Complexity, Competitive in Accuracy
by: Mongaras, Gabriel, et al.
Published: (2026)

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
by: Alabdulmohsin, Ibrahim, et al.
Published: (2023)

Variance Is Not Importance: Structural Analysis of Transformer Compressibility Across Model Scales
by: Salfati, Samuel
Published: (2026)

Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws
by: Petrov, Egor, et al.
Published: (2025)

Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling
by: Gaonkar, Aradhya, et al.
Published: (2026)

Latent Instruction Representation Alignment: defending against jailbreaks, backdoors and undesired knowledge in LLMs
by: Easley, Eric, et al.
Published: (2026)

Self-Expanding Neural Networks
by: Mitchell, Rupert, et al.
Published: (2023)

CoxSE: Exploring the Potential of Self-Explaining Neural Networks with Cox Proportional Hazards Model for Survival Analysis
by: Alabdallah, Abdallah, et al.
Published: (2024)

Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search
by: Hedar, Abdel-Rahman, et al.
Published: (2024)

Discrete Latent Structure in Neural Networks
by: Niculae, Vlad, et al.
Published: (2023)

Understanding Boolean Function Learnability on Deep Neural Networks: PAC Learning Meets Neurosymbolic Models
by: Nicolau, Marcio, et al.
Published: (2020)

Implicit Regularization and Generalization in Overparameterized Neural Networks
by: Johannsen, Zeran
Published: (2026)

Uncertainty Quantification in Multivariable Regression for Material Property Prediction with Bayesian Neural Networks
by: Li, Longze, et al.
Published: (2023)

Pre-trained Models Perform the Best When Token Distributions Follow Zipf's Law
by: He, Yanjin, et al.
Published: (2025)

Strengthening the Internal Adversarial Robustness in Lifted Neural Networks
by: Zach, Christopher
Published: (2025)

Sparse Concept Anchoring for Interpretable and Controllable Neural Representations
by: Fraser, Sandy, et al.
Published: (2025)

The Bayesian Confidence (BACON) Estimator for Deep Neural Networks
by: Kee, Patrick D., et al.
Published: (2024)

Scaling Laws in the Tiny Regime: How Small Models Change Their Mistakes
by: Alnemari, Mohammed, et al.
Published: (2026)

Multiple Token Divergence: Measuring and Steering In-Context Computation Density
by: Herrmann, Vincent, et al.
Published: (2025)

Neural Reasoning Networks: Efficient Interpretable Neural Networks With Automatic Textual Explanations
by: Carrow, Stephen, et al.
Published: (2024)

Learning Useful Representations of Recurrent Neural Network Weight Matrices
by: Herrmann, Vincent, et al.
Published: (2024)

Expressivity of Graph Neural Networks Through the Lens of Adversarial Robustness
by: Campi, Francesco, et al.
Published: (2023)

Graph Neural Network Based Action Ranking for Planning
by: Mangannavar, Rajesh, et al.
Published: (2024)

Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
by: Gray, Gavia, et al.
Published: (2024)

AI and Machine Learning Approaches for Predicting Nanoparticles Toxicity: The Critical Role of Physiochemical Properties
by: Yousaf, Iqra
Published: (2024)

ReBoot: Encrypted Training of Deep Neural Networks with CKKS Bootstrapping
by: Pirillo, Alberto, et al.
Published: (2025)

Playing Hex and Counter Wargames using Reinforcement Learning and Recurrent Neural Networks
by: Palma, Guilherme, et al.
Published: (2025)

ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference
by: Das, Sourav
Published: (2026)

Exploring Neural Granger Causality with xLSTMs: Unveiling Temporal Dependencies in Complex Data
by: Poonia, Harsh, et al.
Published: (2025)

FluidWorld: Reaction-Diffusion Dynamics as a Predictive Substrate for World Models
by: Polly, Fabien
Published: (2026)

Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization
by: Yao, Tianjun, et al.
Published: (2025)

SPACeR: Self-Play Anchoring with Centralized Reference Models
by: Chang, Wei-Jer, et al.
Published: (2025)

Newtonian and Lagrangian Neural Networks: A Comparison Towards Efficient Inverse Dynamics Identification
by: Trinh, Minh, et al.
Published: (2025)

The Geometry of Thought: How Scale Restructures Reasoning In Large Language Models
by: Anderson, Samuel Cyrenius
Published: (2026)

Scaling Trends for Multi-Hop Contextual Reasoning in Mid-Scale Language Models
by: Steele, Brady, et al.
Published: (2026)

Prompting Neural-Guided Equation Discovery Based on Residuals
by: Brugger, Jannis, et al.
Published: (2025)

Entropy Aware Message Passing in Graph Neural Networks
by: Nazari, Philipp, et al.
Published: (2024)