| Main Authors: | Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2512.03062 |
| _version_ | 1866918229561049088 |
|---|---|
| author | Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román |
| author_facet | Rausch, Roman; Jansen, David; Singh, Sukhbinder; Orús, Román |
| contents | Large Language Models (LLMs) are very demanding in terms of computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach to LLM compression, but it presents several practical hurdles, e.g. selecting appropriate layer-wise ranks and removing the parameter redundancy of the decomposition. In this work, we present two physics-inspired improvements to SVD-based LLM compression: (1) **FermiGrad**, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) **PivGa**, an additional *lossless* compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2512_03062 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2512.03062 |
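
The abstract names two ideas that a small example may make more concrete: replacing a hard rank cutoff with a smooth Fermi-function mask, and the gauge freedom of a low-rank factorization. The sketch below is an illustration only and is not taken from the paper. The function names, the parameters `mu` (soft cutoff position) and `temperature` (smoothness), and the choice of gauge (turning the top block of the left factor into an identity) are all assumptions made for this example.

```python
import math

def fermi(i, mu, temperature):
    """Fermi function: near 1 for i well below mu, near 0 for i well above mu."""
    x = (i - mu) / temperature
    if x > 700.0:  # math.exp would overflow; the function value is 0 to double precision
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)

def soft_truncate(singular_values, mu, temperature):
    """Scale the i-th singular value by a smooth mask instead of a hard cutoff."""
    return [s * fermi(i, mu, temperature) for i, s in enumerate(singular_values)]

def effective_rank(n, mu, temperature):
    """Sum of the mask values: a continuous stand-in for the retained rank."""
    return sum(fermi(i, mu, temperature) for i in range(n))

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Inverse of a 2x2 matrix (assumed invertible for this illustration)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Gauge freedom: for any invertible G, (A G)(G^-1 B) equals A B.
A = [[2.0, 1.0], [1.0, 1.0], [3.0, 2.0]]   # 3x2 left factor (made-up numbers)
B = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]     # 2x3 right factor (made-up numbers)
top = A[:2]                                # top 2x2 block of A, playing the role of G^-1
A_fixed = matmul(A, inv2(top))             # top block becomes the 2x2 identity
B_fixed = matmul(top, B)                   # absorbs G^-1 so the product is unchanged
```

In this toy case the product `A_fixed @ B_fixed` is the same matrix as `A @ B`, while the identity block of `A_fixed` no longer needs to be stored, which is one way a gauge choice can remove r² redundant parameters from a rank-r factorization. Lowering `temperature` makes the mask approach a hard truncation at `mu`, whereas a larger value keeps it smooth enough to be tuned by gradient descent.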