Bibliographic Details
Main Author: Khilar, Snigdha Chandan
Format: Preprint
Published: 2026
Subjects: Machine Learning; Differential Geometry
Online Access: https://arxiv.org/abs/2605.30836
_version_ 1866917547254743040
author Khilar, Snigdha Chandan
author_facet Khilar, Snigdha Chandan
contents Recent SVD-based compression methods for large language models, such as SVD-LLM and Basis Sharing, can be unified under a single optimization problem. Although mathematical proofs and tests on Pythia models show that this unified approach improves weight reconstruction error by up to 46%, it fails in practical tasks: downstream metrics such as perplexity and accuracy degrade severely compared to standard per-layer SVD-LLM. The authors explain this failure mechanistically. Although the bundle method mathematically couples adjacent layers, the transformer residual stream decouples them during forward passes, so per-layer optimality matters more than joint cross-layer optimization. The paper concludes that weight-space reconstruction is a flawed objective for cross-layer compression and that future methods must focus on per-layer activation reconstruction instead.
format Preprint
id arxiv_https___arxiv_org_abs_2605_30836
institution arXiv
publishDate 2026
record_format arxiv
title Cross-Layer Subspace Coupling for LLM Compression: A Unifying Framework and Its Empirical Limits
topic Machine Learning
Differential Geometry
url https://arxiv.org/abs/2605.30836