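| Main Authors: | Chakravarthy, Anirudh S; Zheng, Shuai Kyle; Huang, Xin; Hemachandra, Sachithra; Zhang, Xiao; Chai, Yuning; Chen, Zhao |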
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2412.01930 |
| Field | Value |
|---|---|
| author | Chakravarthy, Anirudh S; Zheng, Shuai Kyle; Huang, Xin; Hemachandra, Sachithra; Zhang, Xiao; Chai, Yuning; Chen, Zhao |
| contents | The fine-tuning of pre-trained models has become ubiquitous in generative AI, computer vision, and robotics. Although much attention has been paid to improving the efficiency of model fine-tuning, there has been less scholarship around fine-tuning specifically for improved model performance. To remedy this gap, we present PROFIT, one of the first optimizers designed to incrementally fine-tune converged models on new tasks and/or datasets. Unlike traditional optimizers such as SGD or Adam, which make minimal assumptions due to random initializations, PROFIT takes the properties of a converged model into account explicitly to regularize the optimization process. Employing a temporal gradient-orthogonalization process, PROFIT outperforms fine-tuning methods in various tasks, from image classification to multimodal language model training to large-scale motion prediction. Moreover, PROFIT is encapsulated as a modular optimizer, which makes it easy to integrate directly into any training pipeline with minimal engineering effort. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2412_01930 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | PROFIT: A Specialized Optimizer for Deep Fine Tuning |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2412.01930 |
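
The abstract names a temporal gradient-orthogonalization process, but this record does not describe how it works. As a rough illustration of the generic building block only, the sketch below removes from one gradient vector its component along a reference direction (for example, a gradient from an earlier step). The function name `orthogonalize`, the choice of reference vector, and the toy values are assumptions made for illustration; this is not PROFIT's actual update rule.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(grad: List[float], ref: List[float], eps: float = 1e-12) -> List[float]:
    """Return grad with its component along ref removed (vector projection).

    Hypothetical helper: a generic projection step, not the paper's algorithm.
    """
    denom = dot(ref, ref)
    if denom < eps:
        # A zero reference direction gives nothing to project out.
        return list(grad)
    scale = dot(grad, ref) / denom
    return [g - scale * r for g, r in zip(grad, ref)]


# Toy values (assumed): a gradient from an earlier step and the current one.
earlier_grad = [1.0, 0.0]
current_grad = [3.0, 4.0]

projected = orthogonalize(current_grad, earlier_grad)
print(projected)                     # [0.0, 4.0]
print(dot(projected, earlier_grad))  # 0.0
```

In a real training pipeline this kind of projection would usually be applied per parameter tensor inside an optimizer step; the linked preprint is the place to check how PROFIT chooses its reference direction and when it applies the projection.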