:: Library Catalog

Bibliographic Details
Main Authors:	Olsen, Brian Richard; Fatehmanesh, Sam; Xiao, Frank; Kumarappan, Adarsh; Gajula, Anirudh
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2507.12709

Similar Items

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM
by: Kumarappan, Adarsh, et al.
Published: (2025)

Automating Deception: Scalable Multi-Turn LLM Jailbreaks
by: Kumarappan, Adarsh, et al.
Published: (2025)

Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy
by: Kumarappan, Adarsh, et al.
Published: (2026)

SGD and Weight Decay Secretly Minimize the Rank of Your Neural Network
by: Galanti, Tomer, et al.
Published: (2022)

LeanAgent: Lifelong Learning for Formal Theorem Proving
by: Kumarappan, Adarsh, et al.
Published: (2024)

Sentiment-Aware Recommendation Systems in E-Commerce: A Review from a Natural Language Processing Perspective
by: Gajula, Yogesh
Published: (2025)

Optimal Condition for Initialization Variance in Deep Neural Networks: An SGD Dynamics Perspective
by: Horii, Hiroshi, et al.
Published: (2025)

DevBench: A Realistic, Developer-Informed Benchmark for Code Generation Models
by: Kumarappan, Adarsh, et al.
Published: (2026)

Memorization in Graph Neural Networks
by: Jamadandi, Adarsh, et al.
Published: (2025)

SGD with Partial Hessian for Deep Neural Networks Optimization
by: Sun, Ying, et al.
Published: (2024)

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
by: Sadrtdinov, Ildus, et al.
Published: (2025)

To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
by: Marshall, Noah, et al.
Published: (2024)

A Simplified Analysis of SGD for Linear Regression with Weight Averaging
by: Meterez, Alexandru, et al.
Published: (2025)

DP-SGD Without Clipping: The Lipschitz Neural Network Way
by: Bethune, Louis, et al.
Published: (2023)

Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
by: Wan, Yijun, et al.
Published: (2023)

Convergence of SGD for Training Neural Networks with Sliced Wasserstein Losses
by: Tanguy, Eloi
Published: (2023)

How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
by: Beneventano, Pierfrancesco, et al.
Published: (2024)

Style-based Clustering of Visual Artworks and the Play of Neural Style-Representations
by: Dangeti, Abhishek, et al.
Published: (2024)

Numerical simulation of transient heat conduction with moving heat source using Physics Informed Neural Networks
by: Kalyan, Anirudh, et al.
Published: (2025)

On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD
by: Zhang, Tongcheng, et al.
Published: (2026)

SGD-Based Knowledge Distillation with Bayesian Teachers: Theory and Guidelines
by: Morad, Itai, et al.
Published: (2026)

Weight Spectra Induced Efficient Model Adaptation
by: Si, Chongjie, et al.
Published: (2025)

Diffusion-Based Neural Network Weights Generation
by: Soro, Bedionita, et al.
Published: (2024)

A Generalized Singular Value Theory for Neural Networks
by: Brown, Brian Charles, et al.
Published: (2026)

From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees
by: Xie, Shengping, et al.
Published: (2025)

Enhancing DP-SGD through Non-monotonous Adaptive Scaling Gradient Weight
by: Huang, Tao, et al.
Published: (2024)

Cooperative SGD with Dynamic Mixing Matrices
by: Sarkar, Soumya, et al.
Published: (2025)

Global Convergence of SGD On Two Layer Neural Nets
by: Gopalani, Pulkit, et al.
Published: (2022)

From Weight Perturbation to Feature Attribution for Explaining Fully Connected Neural Networks
by: Lymperopoulos, Thodoris, et al.
Published: (2026)

Weight Initialization and Variance Dynamics in Deep Neural Networks and Large Language Models
by: Han, Yankun
Published: (2025)

Accurate and Scalable Estimation of Epistemic Uncertainty for Graph Neural Networks
by: Trivedi, Puja, et al.
Published: (2024)

Balancing Utility and Privacy: Dynamically Private SGD with Random Projection
by: Jiang, Zhanhong, et al.
Published: (2025)

Less is More: Efficient Weight Farcasting with 1-Layer Neural Network
by: Shou, Xiao, et al.
Published: (2025)

On the Stability of Nonlinear Dynamics in GD and SGD: Beyond Quadratic Potentials
by: Mulayoff, Rotem, et al.
Published: (2026)

Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features
by: Kammonen, Aku, et al.
Published: (2024)

DC-SGD: Differentially Private SGD with Dynamic Clipping through Gradient Norm Distribution Estimation
by: Wei, Chengkun, et al.
Published: (2025)

Single-Head Attention in High Dimensions: A Theory of Generalization, Weights Spectra, and Scaling Laws
by: Boncoraglio, Fabrizio, et al.
Published: (2025)

From Gradient Clipping to Normalization for Heavy Tailed SGD
by: Hübler, Florian, et al.
Published: (2024)

Signal Processing Meets SGD: From Momentum to Filter
by: Yao, Zhipeng, et al.
Published: (2023)

Deep Neural Network for Phonon-Assisted Optical Spectra in Semiconductors
by: Gu, Qiangqiang, et al.
Published: (2025)