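Staff View: Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing :: Library Catalog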

Bibliographic Details
Main Authors:	Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2512.03062

_version_	1866918229561049088
author	Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román
author_facet	Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román
contents	Large Language Models (LLMs) are very demanding in terms of computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach to LLM compression, but it presents several practical hurdles, e.g. selecting appropriate layer-wise ranks and eliminating the parameter redundancy of the low-rank factors. In this work, we present two physics-inspired improvements to SVD LLM compression: (1) FermiGrad, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) PivGa, an additional lossless compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_03062
institution	arXiv
publishDate	2025
record_format	arxiv
title	Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing
topic	Machine Learning
url	https://arxiv.org/abs/2512.03062
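
The two ideas named in the contents field can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the index-based form of the mask, and the temperature parameter are assumptions made for illustration, and the record does not say how FermiGrad or PivGa are actually realized. The sketch shows (a) how a Fermi function 1 / (exp((i - mu) / T) + 1) turns a hard cut-off at rank mu into a smooth keep-weight over singular-value indices, and (b) the standard observation that a rank-r factorization W = A B, with A of shape m x r and B of shape r x n, is unchanged under A -> A G, B -> G^{-1} B for any invertible r x r matrix G, so that r^2 of its r(m + n) stored parameters are redundant.

```python
import math


def fermi_mask(num_singular_values, mu, temperature):
    """Soft keep-weights in [0, 1] for singular-value indices 0..n-1.

    Hypothetical helper: index i gets weight 1 / (exp((i - mu) / T) + 1),
    which is close to 1 for i well below mu, close to 0 for i well above mu,
    and exactly 0.5 at i == mu. As T -> 0 the mask approaches a hard cut-off.
    """
    weights = []
    for i in range(num_singular_values):
        x = (i - mu) / temperature
        if x > 700.0:
            # math.exp would overflow for very large arguments; the weight
            # is indistinguishable from zero there.
            weights.append(0.0)
        else:
            weights.append(1.0 / (math.exp(x) + 1.0))
    return weights


def soft_rank(weights):
    """Continuous stand-in for the rank: the sum of the keep-weights."""
    return sum(weights)


def factor_param_count(m, n, r):
    """Parameters stored by a plain rank-r factorization A (m x r), B (r x n)."""
    return r * (m + n)


def gauge_fixed_param_count(m, n, r):
    """Parameter count after removing the r*r gauge degrees of freedom.

    Assumption: the gauge is fixed so that an r x r block of one factor
    becomes a known matrix (e.g. the identity) and need not be stored.
    """
    return r * (m + n) - r * r


# Usage: a nearly hard mask that keeps the first four of eight singular values,
# and the parameter counts for a square 4096 x 4096 layer at rank 1024.
mask = fermi_mask(8, mu=3.5, temperature=0.01)
plain = factor_param_count(4096, 4096, 1024)
fixed = gauge_fixed_param_count(4096, 4096, 1024)
```

Because the mask is differentiable in mu, a quantity such as soft_rank can be adjusted by gradient descent, which is the general sense in which a continuous relaxation replaces a discrete truncation. The parameter counts only quantify the redundancy and say nothing about how a particular gauge is chosen.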