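Staff View: When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective :: Library Catalog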

Bibliographic Details
Main Authors:	Tsao, Hsi-Ai; Hsiung, Lei; Chen, Pin-Yu; Ho, Tsung-Yi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2409.01821

_version_	1866909304108351488
author	Tsao, Hsi-Ai; Hsiung, Lei; Chen, Pin-Yu; Ho, Tsung-Yi
author_facet	Tsao, Hsi-Ai; Hsiung, Lei; Chen, Pin-Yu; Ho, Tsung-Yi
contents	The effectiveness of adapting pre-trained models to new tasks varies across datasets. Visual prompting, a state-of-the-art parameter-efficient transfer learning method, can significantly improve performance on out-of-distribution tasks. On the other hand, linear probing, a standard transfer learning method, is sometimes the best approach. We propose a log-likelihood ratio (LLR) approach to analyze the comparative benefits of visual prompting and linear probing. By employing the LLR score alongside resource-efficient visual prompt approximations, our cost-effective measure attains up to a 100-fold reduction in run time compared to full training, while achieving prediction accuracy of up to 91%. The source code is available at https://github.com/IBM/VP-LLR.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_01821
institution	arXiv
publishDate	2024
record_format	arxiv
title	When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2409.01821

Similar Items