| Main Authors: | Cong, Bai; Daheim, Nico; Shen, Yuesong; Cremers, Daniel; Yokota, Rio; Khan, Mohammad Emtiyaz; Möllenhoff, Thomas |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Machine Learning; Artificial Intelligence; Computation and Language |
| Online Access: | https://arxiv.org/abs/2411.04421 |
| _version_ | 1866917832570175488 |
|---|---|
| author | Cong, Bai; Daheim, Nico; Shen, Yuesong; Cremers, Daniel; Yokota, Rio; Khan, Mohammad Emtiyaz; Möllenhoff, Thomas |
| author_facet | Cong, Bai; Daheim, Nico; Shen, Yuesong; Cremers, Daniel; Yokota, Rio; Khan, Mohammad Emtiyaz; Möllenhoff, Thomas |
| contents | We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in the cost. We replace AdamW by the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves the accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than the other Bayesian alternatives, yet the cost is lower and the implementation is easier. Our work provides additional evidence for the effectiveness of IVON for large language models. The code is available at https://github.com/team-approx-bayes/ivon-lora. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2411_04421 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Variational Low-Rank Adaptation Using IVON |
| topic | Machine Learning; Artificial Intelligence; Computation and Language |
| url | https://arxiv.org/abs/2411.04421 |
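
The abstract in this record reports an improvement in expected calibration error (ECE) but does not define the metric. As a rough illustration only, the sketch below computes ECE under the common equal-width binning definition. The function name `expected_calibration_error`, the `n_bins` default, and the binning scheme are assumptions made here, not details taken from the record or confirmed against the paper's code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE (assumed definition, not taken from the paper).

    confidences: predicted probability of the chosen answer, each in [0, 1].
    correct: booleans, True where the chosen answer was right.
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into n_bins equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    # Weighted average of |accuracy - mean confidence| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

For example, four predictions made at confidence 0.9 of which three are correct all land in one bin, so the result is the gap between 0.9 confidence and 0.75 accuracy. A lower value means the model's stated confidence tracks its actual accuracy more closely.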