Staff View: Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other :: Library Catalog

Bibliographic Details
Main Authors:	Gao, Yifei; Ou, Jie; Wang, Lei; Xiao, Yuting; Xiang, Zhiyuan; Dai, Ruiting; Cheng, Jun
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; F.2.3
Online Access:	https://arxiv.org/abs/2406.16299

_version_	1866913402577747968
author	Gao, Yifei; Ou, Jie; Wang, Lei; Xiao, Yuting; Xiang, Zhiyuan; Dai, Ruiting; Cheng, Jun
author_facet	Gao, Yifei; Ou, Jie; Wang, Lei; Xiao, Yuting; Xiang, Zhiyuan; Dai, Ruiting; Cheng, Jun
contents	Emerging large language models (LLMs) set themselves apart from traditional language models through their strong performance and powerful deductive capacity. However, their computational and storage costs are very high, which has made quantization an active research topic. To address the accuracy degradation caused by quantization, two lines of work in post-training quantization stand out: one uses other weights to compensate for existing quantization error, while the other transfers the quantization difficulty to other parts of the model. Combining the merits of both, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract the singular values of the weights and makes them learnable, helping the weights compensate for each other conditioned on the activations. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, whether weight-only, weight-activation, or extremely low-bit. By unleashing the potential of LSI, efficient finetuning of quantized models is no longer prohibitive.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_16299
institution	arXiv
publishDate	2024
record_format	arxiv
title	Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
topic	Computation and Language; Artificial Intelligence; F.2.3
url	https://arxiv.org/abs/2406.16299
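
Illustrative note on the abstract: the idea of taking the singular values of a weight matrix, adding a learnable increment to them, and tuning that increment so the quantized weights reproduce the original layer output on some activations can be sketched roughly as follows. This is not the authors' implementation. The round-to-nearest quantizer, the straight-through gradient approximation, the normalized gradient step, and all names (`quantize`, `reconstruct`, `output_error`, `fit_increment`) are assumptions chosen only to make the sketch self-contained.

```python
import numpy as np


def quantize(w, bits=4):
    """Uniform round-to-nearest quantization per output row (an assumed quantizer)."""
    levels = 2 ** bits - 1
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum(w_max - w_min, 1e-8) / levels
    return np.round((w - w_min) / scale) * scale + w_min


def reconstruct(u, s, vt, delta):
    """Rebuild the weight matrix from its SVD with incremented singular values."""
    return (u * (s + delta)) @ vt


def output_error(x, w_ref, w_q):
    """Squared error between the layer outputs x @ w_ref.T and x @ w_q.T."""
    return float(np.sum((x @ (w_q - w_ref).T) ** 2))


def fit_increment(w, x, bits=4, steps=50, lr=1e-2):
    """Tune a singular-value increment; returns the best increment seen and its error."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    delta = np.zeros_like(s)
    best_delta = delta.copy()
    best_err = output_error(x, w, quantize(w, bits))
    for _ in range(steps):
        w_q = quantize(reconstruct(u, s, vt, delta), bits)
        err = output_error(x, w, w_q)
        if err < best_err:
            best_delta, best_err = delta.copy(), err
        # Straight-through assumption: treat rounding as identity in the backward pass.
        grad_w = 2.0 * (x @ (w_q - w).T).T @ x
        # d(loss)/d(delta_i) = u_i^T grad_w v_i
        grad_delta = np.einsum("oi,oj,ij->i", u, grad_w, vt)
        delta = delta - lr * grad_delta / (np.linalg.norm(grad_delta) + 1e-12)
    return best_delta, best_err


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 16))   # toy weight matrix (out_features, in_features)
    x = rng.standard_normal((32, 16))  # toy calibration activations
    baseline = output_error(x, w, quantize(w))
    _, tuned = fit_increment(w, x)
    print("baseline error:", baseline, "error with increment:", tuned)
```

Because the routine keeps the best increment it has seen, the returned error is never worse than plain round-to-nearest on the same activations. How much it improves depends on the data and the step size.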

Similar Items