:: Library Catalog

Bibliographic Details
Main Authors:	Fruytier, Quentin; Mokhtari, Aryan; Sanghavi, Sujay
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2411.06056

Similar Items

Sculpting Latent Spaces With MMD: Disentanglement With Programmable Priors
by: Fruytier, Quentin, et al.
Published: (2025)

In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
by: Collins, Liam, et al.
Published: (2024)

Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization
by: Jiang, Ruichen, et al.
Published: (2024)

Understanding Self-Supervised Learning via Gaussian Mixture Models
by: Bansal, Parikshit, et al.
Published: (2024)

Context-Free Synthetic Data Mitigates Forgetting
by: Bansal, Parikshit, et al.
Published: (2025)

Enabling Approximate Joint Sampling in Diffusion LMs
by: Bansal, Parikshit, et al.
Published: (2025)

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence
by: Jiang, Ruichen, et al.
Published: (2024)

Geometric Median (GM) Matching for Robust Data Pruning
by: Acharya, Anish, et al.
Published: (2024)

Test-Time Speculation
by: Kumar, Avinash, et al.
Published: (2026)

A Mirror Descent Perspective of Smoothed Sign Descent
by: Wang, Shuyang, et al.
Published: (2024)

Transformers Learn Latent Mixture Models In-Context via Mirror Descent
by: D'Angelo, Francesco, et al.
Published: (2026)

HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)

Online Learning-guided Learning Rate Adaptation via Gradient Alignment
by: Jiang, Ruichen, et al.
Published: (2025)

Adaptive Optimization via Momentum on Variance-Normalized Gradients
by: Patitucci, Francisco, et al.
Published: (2026)

Blocking Bandits
by: Basu, Soumya, et al.
Published: (2019)

Improved Complexity for Smooth Nonconvex Optimization: A Two-Level Online Learning Approach with Quasi-Newton Methods
by: Jiang, Ruichen, et al.
Published: (2024)

Geometric Median Matching for Robust k-Subset Selection from Noisy Data
by: Acharya, Anish, et al.
Published: (2025)

Towards Quantifying the Preconditioning Effect of Adam
by: Das, Rudrajit, et al.
Published: (2024)

Understanding the Training Speedup from Sampling with Approximate Losses
by: Das, Rudrajit, et al.
Published: (2024)

Asymptotically-Optimal Gaussian Bandits with Side Observations
by: Atsidakou, Alexia, et al.
Published: (2025)

Finite-Time Logarithmic Bayes Regret Upper Bounds
by: Atsidakou, Alexia, et al.
Published: (2023)

Stochastic Newton Proximal Extragradient Method
by: Jiang, Ruichen, et al.
Published: (2024)

Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization
by: Jiang, Ruichen, et al.
Published: (2024)

Improving Online-to-Nonconvex Conversion for Smooth Optimization via Double Optimism
by: Patitucci, Francisco, et al.
Published: (2025)

Curriculum Learning-Guided Progressive Distillation in Large Language Models
by: Cao, Jincheng, et al.
Published: (2026)

Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization
by: Jiang, Ruichen, et al.
Published: (2026)

Value Mirror Descent for Reinforcement Learning
by: Jia, Zhichao, et al.
Published: (2026)

Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting
by: Sanyal, Sunny, et al.
Published: (2025)

Time Weaver: A Conditional Time Series Generation Model
by: Narasimhan, Sai Shankar, et al.
Published: (2024)

Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
by: Jiang, Ruichen, et al.
Published: (2022)

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity
by: Hou, Charlie, et al.
Published: (2023)

Rank-Induced PL Mirror Descent: A Rank-Faithful Second-Order Algorithm for Sleeping Experts
by: Zhang, Tiantian
Published: (2025)

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
by: Collins, Liam, et al.
Published: (2023)

Positive Unlabeled Contrastive Learning
by: Acharya, Anish, et al.
Published: (2022)

Machine Unlearning under Overparameterization
by: Block, Jacob L., et al.
Published: (2025)

When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
by: Sanyal, Sunny, et al.
Published: (2024)

Entropy Aware Reward Guidance for Diffusion Language Model Alignment
by: Tejaswi, Atula, et al.
Published: (2026)

Rethinking Neural Network Learning Rates: A Stackelberg Perspective
by: Zeng, Sihan, et al.
Published: (2026)

Policy Mirror Descent with Lookahead
by: Protopapas, Kimon, et al.
Published: (2024)

Parameter-free Mirror Descent
by: Jacobsen, Andrew, et al.
Published: (2022)