Staff View: Distributional Preference Alignment of LLMs via Optimal Transport :: Library Catalog

Bibliographic Details
Main Authors:	Melnyk, Igor; Mroueh, Youssef; Belgodere, Brian; Rigotti, Mattia; Nitsure, Apoorva; Yurochkin, Mikhail; Greenewald, Kristjan; Navratil, Jiri; Ross, Jerret
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2406.05882

_version_	1866917689169018880
author	Melnyk, Igor; Mroueh, Youssef; Belgodere, Brian; Rigotti, Mattia; Nitsure, Apoorva; Yurochkin, Mikhail; Greenewald, Kristjan; Navratil, Jiri; Ross, Jerret
author_facet	Melnyk, Igor; Mroueh, Youssef; Belgodere, Brian; Rigotti, Mattia; Nitsure, Apoorva; Yurochkin, Mikhail; Greenewald, Kristjan; Navratil, Jiri; Ross, Jerret
contents	Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically dominant in the first order on the distribution of negative samples. We introduce a convex relaxation of this first-order stochastic dominance and cast it as an optimal transport problem with a smooth and convex cost. Thanks to the one-dimensional nature of the resulting optimal transport problem and the convexity of the cost, it has a closed-form solution via sorting on empirical measures. We fine-tune LLMs with this AOT objective, which enables alignment by penalizing the violation of the stochastic dominance of the reward distribution of the positive samples on the reward distribution of the negative samples. We analyze the sample complexity of AOT by considering the dual of the OT problem and show that it converges at the parametric rate. Empirically, we show on a diverse set of alignment datasets and LLMs that AOT leads to state-of-the-art models in the 7B family of models when evaluated with Open LLM Benchmarks and AlpacaEval.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_05882
institution	arXiv
publishDate	2024
record_format	arxiv
title	Distributional Preference Alignment of LLMs via Optimal Transport
topic	Machine Learning
url	https://arxiv.org/abs/2406.05882
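
Note on the contents field: the abstract states that the one-dimensional optimal transport problem has a closed-form solution via sorting on empirical measures, and that the objective penalizes violations of first-order stochastic dominance. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the equal-sample-size restriction, the `beta` scale, and the choice of a softplus (logistic) penalty as the smooth convex cost are all assumptions made for illustration.

```python
import numpy as np


def dominates_first_order(pos_rewards, neg_rewards):
    """Check empirical first-order stochastic dominance of pos over neg.

    With equally sized samples, dominance holds when every sorted positive
    reward is at least the sorted negative reward at the same rank.
    """
    pos = np.sort(np.asarray(pos_rewards, dtype=float))
    neg = np.sort(np.asarray(neg_rewards, dtype=float))
    if pos.shape != neg.shape:
        raise ValueError("this sketch assumes equally sized samples")
    return bool(np.all(pos >= neg))


def sorted_violation_cost(pos_rewards, neg_rewards, beta=1.0):
    """Hypothetical smooth penalty on dominance violations.

    Sorting both samples matches them rank by rank, which is the optimal
    coupling for a convex cost in one dimension. The penalty used here,
    softplus(beta * (neg - pos)), is an assumed stand-in for the smooth
    convex cost mentioned in the abstract.
    """
    pos = np.sort(np.asarray(pos_rewards, dtype=float))
    neg = np.sort(np.asarray(neg_rewards, dtype=float))
    if pos.shape != neg.shape:
        raise ValueError("this sketch assumes equally sized samples")
    gaps = neg - pos  # positive gap = violation at that rank
    # logaddexp(0, x) computes log(1 + exp(x)) in a numerically stable way
    return float(np.mean(np.logaddexp(0.0, beta * gaps)))
```

Because both samples are sorted independently, the cost does not depend on how positive and negative samples were originally paired, which is what makes the unpaired setting described in the abstract workable. In a training loop the rewards would come from the model being fine-tuned and the cost would be minimized with a differentiable framework; NumPy is used here only to keep the sketch self-contained.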

Similar Items