Bibliographic Details
Main Authors: Chakravarthy, Anirudh S; Zheng, Shuai Kyle; Huang, Xin; Hemachandra, Sachithra; Zhang, Xiao; Chai, Yuning; Chen, Zhao
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2412.01930
author Chakravarthy, Anirudh S
Zheng, Shuai Kyle
Huang, Xin
Hemachandra, Sachithra
Zhang, Xiao
Chai, Yuning
Chen, Zhao
contents The fine-tuning of pre-trained models has become ubiquitous in generative AI, computer vision, and robotics. Although much attention has been paid to improving the efficiency of model fine-tuning, there has been less scholarship around fine-tuning specifically for improved model performance. To remedy this gap, we present PROFIT, one of the first optimizers designed to incrementally fine-tune converged models on new tasks and/or datasets. Unlike traditional optimizers such as SGD or Adam, which make minimal assumptions due to random initializations, PROFIT takes the properties of a converged model into account explicitly to regularize the optimization process. Employing a temporal gradient-orthogonalization process, PROFIT outperforms fine-tuning methods in various tasks, from image classification to multimodal language model training to large-scale motion prediction. Moreover, PROFIT is encapsulated as a modular optimizer, which makes it easy to integrate directly into any training pipeline with minimal engineering effort.
format Preprint
id arxiv_https___arxiv_org_abs_2412_01930
institution arXiv
publishDate 2024
record_format arxiv
title PROFIT: A Specialized Optimizer for Deep Fine Tuning
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2412.01930