Bibliographic Details
Main authors: Cong, Bai, Daheim, Nico, Shen, Yuesong, Cremers, Daniel, Yokota, Rio, Khan, Mohammad Emtiyaz, Möllenhoff, Thomas
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence, Computation and Language
Online access: https://arxiv.org/abs/2411.04421
author Cong, Bai
Daheim, Nico
Shen, Yuesong
Cremers, Daniel
Yokota, Rio
Khan, Mohammad Emtiyaz
Möllenhoff, Thomas
contents We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in cost. We replace AdamW with the Improved Variational Online Newton (IVON) algorithm to fine-tune large language models. For Llama-2 with 7 billion parameters, IVON improves accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than that of other Bayesian alternatives, yet the cost is lower and the implementation is easier. Our work provides additional evidence for the effectiveness of IVON for large language models. The code is available at https://github.com/team-approx-bayes/ivon-lora.
format Preprint
id arxiv_https___arxiv_org_abs_2411_04421
institution arXiv
publishDate 2024
record_format arxiv
title Variational Low-Rank Adaptation Using IVON
topic Machine Learning
Artificial Intelligence
Computation and Language
url https://arxiv.org/abs/2411.04421
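note The abstract reports gains in expected calibration error (ECE) but the record does not define the metric. The sketch below shows one common way to compute it, assuming equal-width confidence bins. The function name, the default of 10 bins, and the half-open binning convention are illustrative assumptions, not taken from the paper or its repository.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default; the record does not state a bin count
) -> float:
    """Weighted average gap between accuracy and mean confidence per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # sum of confidences falling in each bin
    hits = [0] * n_bins        # number of correct predictions in each bin
    count = [0] * n_bins       # number of predictions in each bin

    for c, ok in zip(confidences, correct):
        # Equal-width bins [k/n_bins, (k+1)/n_bins); confidence 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hits[idx] += int(ok)
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hits[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece
```

A lower value means the predicted confidences track the observed accuracy more closely; a model that is always fully confident and always right scores 0.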