| Main Authors: | Fruytier, Quentin, Mokhtari, Aryan, Sanghavi, Sujay |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.06056 |
Similar Items
Sculpting Latent Spaces With MMD: Disentanglement With Programmable Priors
by: Fruytier, Quentin, et al.
Published: (2025)
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
by: Collins, Liam, et al.
Published: (2024)
Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization
by: Jiang, Ruichen, et al.
Published: (2024)
Understanding Self-Supervised Learning via Gaussian Mixture Models
by: Bansal, Parikshit, et al.
Published: (2024)
Context-Free Synthetic Data Mitigates Forgetting
by: Bansal, Parikshit, et al.
Published: (2025)
Enabling Approximate Joint Sampling in Diffusion LMs
by: Bansal, Parikshit, et al.
Published: (2025)
Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence
by: Jiang, Ruichen, et al.
Published: (2024)
Geometric Median (GM) Matching for Robust Data Pruning
by: Acharya, Anish, et al.
Published: (2024)
Test-Time Speculation
by: Kumar, Avinash, et al.
Published: (2026)
A Mirror Descent Perspective of Smoothed Sign Descent
by: Wang, Shuyang, et al.
Published: (2024)
Transformers Learn Latent Mixture Models In-Context via Mirror Descent
by: D'Angelo, Francesco, et al.
Published: (2026)
HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)
Online Learning-guided Learning Rate Adaptation via Gradient Alignment
by: Jiang, Ruichen, et al.
Published: (2025)
Adaptive Optimization via Momentum on Variance-Normalized Gradients
by: Patitucci, Francisco, et al.
Published: (2026)
Blocking Bandits
by: Basu, Soumya, et al.
Published: (2019)
Improved Complexity for Smooth Nonconvex Optimization: A Two-Level Online Learning Approach with Quasi-Newton Methods
by: Jiang, Ruichen, et al.
Published: (2024)
Geometric Median Matching for Robust k-Subset Selection from Noisy Data
by: Acharya, Anish, et al.
Published: (2025)
Towards Quantifying the Preconditioning Effect of Adam
by: Das, Rudrajit, et al.
Published: (2024)
Understanding the Training Speedup from Sampling with Approximate Losses
by: Das, Rudrajit, et al.
Published: (2024)
Asymptotically-Optimal Gaussian Bandits with Side Observations
by: Atsidakou, Alexia, et al.
Published: (2025)
Finite-Time Logarithmic Bayes Regret Upper Bounds
by: Atsidakou, Alexia, et al.
Published: (2023)
Stochastic Newton Proximal Extragradient Method
by: Jiang, Ruichen, et al.
Published: (2024)
Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization
by: Jiang, Ruichen, et al.
Published: (2024)
Improving Online-to-Nonconvex Conversion for Smooth Optimization via Double Optimism
by: Patitucci, Francisco, et al.
Published: (2025)
Curriculum Learning-Guided Progressive Distillation in Large Language Models
by: Cao, Jincheng, et al.
Published: (2026)
Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization
by: Jiang, Ruichen, et al.
Published: (2026)
Value Mirror Descent for Reinforcement Learning
by: Jia, Zhichao, et al.
Published: (2026)
Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting
by: Sanyal, Sunny, et al.
Published: (2025)
Time Weaver: A Conditional Time Series Generation Model
by: Narasimhan, Sai Shankar, et al.
Published: (2024)
Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
by: Jiang, Ruichen, et al.
Published: (2022)
Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity
by: Hou, Charlie, et al.
Published: (2023)
Rank-Induced PL Mirror Descent: A Rank-Faithful Second-Order Algorithm for Sleeping Experts
by: Zhang, Tiantian
Published: (2025)
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
by: Collins, Liam, et al.
Published: (2023)
Positive Unlabeled Contrastive Learning
by: Acharya, Anish, et al.
Published: (2022)
Machine Unlearning under Overparameterization
by: Block, Jacob L., et al.
Published: (2025)
When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
by: Sanyal, Sunny, et al.
Published: (2024)
Entropy Aware Reward Guidance for Diffusion Language Model Alignment
by: Tejaswi, Atula, et al.
Published: (2026)
Rethinking Neural Network Learning Rates: A Stackelberg Perspective
by: Zeng, Sihan, et al.
Published: (2026)
Policy Mirror Descent with Lookahead
by: Protopapas, Kimon, et al.
Published: (2024)
Parameter-free Mirror Descent
by: Jacobsen, Andrew, et al.
Published: (2022)