:: Library Catalog

Bibliographic Details
Main Authors:	Hsieh, Yu-Guan; Thornton, James; Ndiaye, Eugene; Klein, Michal; Cuturi, Marco; Ablin, Pierre
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.02998

Similar Items

Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency
by: Kirchhof, Michael, et al.
Published: (2024)

Learning Elastic Costs to Shape Monge Displacements
by: Klein, Michal, et al.
Published: (2023)

The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)

Nectar: Neural Estimation of Cached-Token Attention via Regression
by: Monteiro, João, et al.
Published: (2026)

Multivariate Conformal Prediction using Optimal Transport
by: Klein, Michal, et al.
Published: (2025)

Simple ReFlow: Improved Techniques for Fast Flow Models
by: Kim, Beomsu, et al.
Published: (2024)

Contrasting Multiple Representations with the Multi-Marginal Matching Gap
by: Piran, Zoe, et al.
Published: (2024)

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
by: Saada, Thiziri Nait, et al.
Published: (2025)

Locking Pretrained Weights via Deep Low-Rank Residual Distillation
by: Sakamoto, Keitaro, et al.
Published: (2026)

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration
by: Mlodozeniec, Bruno, et al.
Published: (2025)

Progressive Entropic Optimal Transport Solvers
by: Kassraie, Parnian, et al.
Published: (2024)

DynaMiCS: Fine-tuning LLMs with Performance Constraints using Dynamic Mixtures
by: Gualdoni, Eleonora, et al.
Published: (2026)

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection
by: Bethune, Louis, et al.
Published: (2025)

Scaling Categorical Flow Maps
by: Davis, Oscar, et al.
Published: (2026)

On Fitting Flow Models with Large Sinkhorn Couplings
by: Zhang, Stephen, et al.
Published: (2025)

Amortizing Maximum Inner Product Search with Learned Support Functions
by: Olausson, Theo X., et al.
Published: (2026)

Flow Matching with Semidiscrete Couplings
by: Mousavi-Hosseini, Alireza, et al.
Published: (2025)

Dynamic Gradient Alignment for Online Data Mixing
by: Fan, Simin, et al.
Published: (2024)

GENOT: Entropic (Gromov) Wasserstein Flow Matching with Applications to Single-Cell Genomics
by: Klein, Dominik, et al.
Published: (2023)

Learning Unmasking Policies for Diffusion Language Models
by: Jazbec, Metod, et al.
Published: (2025)

Beyond Uncertainty Sets: Leveraging Optimal Transport to Extend Conformal Predictive Distribution to Multivariate Settings
by: Ndiaye, Eugene
Published: (2025)

Omega: Optimistic EMA Gradients
by: Ramirez, Juan, et al.
Published: (2023)

On a Neural Implementation of Brenier's Polar Factorization
by: Vesseron, Nina, et al.
Published: (2024)

EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL
by: Zhang, Lunjun, et al.
Published: (2026)

The AdEMAMix Optimizer: Better, Faster, Older
by: Pagliardini, Matteo, et al.
Published: (2024)

How Smooth Is Attention?
by: Castin, Valérie, et al.
Published: (2023)

Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
by: Vesseron, Nina, et al.
Published: (2025)

From Conformal Predictions to Confidence Regions
by: Guille-Escuret, Charles, et al.
Published: (2024)

Finite Sample Confidence Regions for Linear Regression Parameters Using Arbitrary Predictors
by: Guille-Escuret, Charles, et al.
Published: (2024)

Exact and Approximate Conformal Inference for Multi-Output Regression
by: Johnstone, Chancellor, et al.
Published: (2022)

Enhancing Hypergradients Estimation: A Study of Preconditioning and Reparameterization
by: Ye, Zhenzhang, et al.
Published: (2024)

MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations
by: Heurtebise, Ambroise, et al.
Published: (2025)

A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport
by: Lin, Tianyi, et al.
Published: (2023)

The Coupling Within: Flow Matching via Distilled Normalizing Flows
by: Berthelot, David, et al.
Published: (2026)

GS-EMA: Integrating Gradient Surgery Exponential Moving Average with Boundary-Aware Contrastive Learning for Enhanced Domain Generalization in Aneurysm Segmentation
by: Lin, Fengming, et al.
Published: (2024)

Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)

Need a Small Specialized Language Model? Plan Early!
by: Grangier, David, et al.
Published: (2024)

Scaling Laws for Mixture Pretraining Under Data Constraints
by: Sedova, Anastasiia, et al.
Published: (2026)

A framework for bilevel optimization that enables stochastic and global variance reduction algorithms
by: Dagréou, Mathieu, et al.
Published: (2022)

A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization
by: Dagréou, Mathieu, et al.
Published: (2023)