:: Library Catalog

Bibliographic Details
Main Authors:	Dimlioglu, Tolga; Topollai, Kristi; Choromanska, Anna
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.27739

Similar Items

Outer-Momentum Restarting in High-Dimensional Two-Phase Optimization
by: Topollai, Kristi, et al.
Published: (2026)

Understanding Quantization of Optimizer States in LLM Pre-training: Dynamics of State Staleness and Effectiveness of State Resets
by: Topollai, Kristi, et al.
Published: (2026)

Task-Level Contrastiveness for Cross-Domain Few-Shot Learning
by: Topollai, Kristi, et al.
Published: (2025)

Adaptive Memory Momentum via a Model-Based Framework for Deep Learning Optimization
by: Topollai, Kristi, et al.
Published: (2025)

Communication-Efficient Distributed Training for Collaborative Flat Optima Recovery in Deep Learning
by: Dimlioglu, Tolga, et al.
Published: (2025)

GRAWA: Gradient-based Weighted Averaging for Distributed Training of Deep Learning Models
by: Dimlioglu, Tolga, et al.
Published: (2024)

Streamlining Industrial Contract Management with Retrieval-Augmented LLMs
by: Topollai, Kristi, et al.
Published: (2025)

OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction
by: Hemadri, Raghu Vamshi, et al.
Published: (2025)

Scaling-Aware Data Selection for End-to-End Autonomous Driving Systems
by: Dimlioglu, Tolga, et al.
Published: (2026)

SGD at the Edge of Stability: The Stochastic Sharpness Gap
by: Liao, Fangshuo, et al.
Published: (2026)

Minibatch and Local SGD: Algorithmic Stability and Linear Speedup in Generalization
by: Lei, Yunwen, et al.
Published: (2023)

PromptSplit: Revealing Prompt-Level Disagreement in Generative Models
by: Lotfian, Mehdi, et al.
Published: (2026)

ACE and Diverse Generalization via Selective Disagreement
by: Daniels, Oliver, et al.
Published: (2025)

Mitigating Spurious Correlations via Disagreement Probability
by: Han, Hyeonggeun, et al.
Published: (2024)

STABLEVAL: Disagreement-Aware and Stable Evaluation of AI Systems
by: Bonagiri, Akash, et al.
Published: (2026)

DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization
by: Sun, Xin, et al.
Published: (2024)

Bootstrap SGD: Algorithmic Stability and Robustness
by: Christmann, Andreas, et al.
Published: (2024)

DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding
by: Zou, Mingxi, et al.
Published: (2026)

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
by: Krishna, Satyapriya, et al.
Published: (2022)

Self-Supervised Representation Learning with Joint Embedding Predictive Architecture for Automotive LiDAR Object Detection
by: Zhu, Haoran, et al.
Published: (2025)

Sharpness-Aware Minimization in Logit Space Efficiently Enhances Direct Preference Optimization
by: Luo, Haocheng, et al.
Published: (2026)

Anon: Extrapolating Adaptivity Beyond SGD and Adam
by: Zhang, Yiheng, et al.
Published: (2026)

Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD?
by: Kim, Jihwan, et al.
Published: (2026)

Accumulative SGD Influence Estimation for Data Attribution
by: Shi, Yunxiao, et al.
Published: (2025)

RQP-SGD: Differential Private Machine Learning through Noisy SGD and Randomized Quantization
by: Feng, Ce, et al.
Published: (2024)

Diagonalisation SGD: Fast & Convergent SGD for Non-Differentiable Models via Reparameterisation and Smoothing
by: Wagner, Dominik, et al.
Published: (2024)

DC-SGD: Differentially Private SGD with Dynamic Clipping through Gradient Norm Distribution Estimation
by: Wei, Chengkun, et al.
Published: (2025)

StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models
by: Yu, Dingzhi, et al.
Published: (2026)

SWAN: SGD with Normalization and Whitening Enables Stateless LLM Training
by: Ma, Chao, et al.
Published: (2024)

EMoE: Training-Free Expert Disagreement for Uncertainty-Aware Text-to-Image Diffusion
by: Berry, Lucas, et al.
Published: (2025)

INO-SGD: Addressing Utility Imbalance under Individualized Differential Privacy
by: Tian, Xiao, et al.
Published: (2026)

On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD
by: Zhang, Tongcheng, et al.
Published: (2026)

Mixed-Sample SGD: an End-to-end Analysis of Supervised Transfer Learning
by: Deng, Yuyang, et al.
Published: (2025)

Learning from Disagreement: Clinician Overrides as Implicit Preference Signals for Clinical AI in Value-Based Care
by: Singh, Prabhjot, et al.
Published: (2026)

Connections between Schedule-Free Optimizers, AdEMAMix, and Accelerated SGD Variants
by: Morwani, Depen, et al.
Published: (2025)

Enhancing DP-SGD through Non-monotonous Adaptive Scaling Gradient Weight
by: Huang, Tao, et al.
Published: (2024)

APOLLO: SGD-like Memory, AdamW-level Performance
by: Zhu, Hanqing, et al.
Published: (2024)

Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories
by: Xu, Yongzhong
Published: (2026)

Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs
by: Mukherjee, Sagnik, et al.
Published: (2026)

Why Adam Can Beat SGD: Second-Moment Normalization Yields Sharper Tails
by: Jin, Ruinan, et al.
Published: (2026)