Staff View: 1%>100%: High-Efficiency Visual Adapter with Complex Linear Projection Optimization :: Library Catalog

Bibliographic Details
Main Authors:	Yin, Dongshuo; Yang, Xue; Fan, Deng-Ping; Hu, Shi-Min
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.10513

_version_	1866910018740158464
author	Yin, Dongshuo; Yang, Xue; Fan, Deng-Ping; Hu, Shi-Min
author_facet	Yin, Dongshuo; Yang, Xue; Fan, Deng-Ping; Hu, Shi-Min
contents	Deploying vision foundation models typically relies on efficient adaptation strategies, because conventional full fine-tuning is prohibitively costly and inefficient. Although delta-tuning has proven effective in improving the performance and efficiency of LLMs during adaptation, its advantages do not transfer directly to the fine-tuning pipeline of vision foundation models. To push the boundaries of adaptation efficiency for vision tasks, we propose an adapter with Complex Linear Projection Optimization (CoLin). For architecture, we design a novel low-rank complex adapter that adds new parameters amounting to only about 1% of the backbone. For efficiency, we theoretically prove that low-rank composite matrices suffer from severe convergence issues during training, and we address this challenge with a tailored loss. Extensive experiments on object detection, segmentation, image classification, and rotated object detection (a remote sensing scenario) demonstrate that, for the first time, CoLin outperforms both full fine-tuning and classical delta-tuning approaches with merely 1% of the parameters, providing a novel and efficient solution for deploying vision foundation models. We release the code at https://github.com/DongshuoYin/CoLin.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_10513
institution	arXiv
publishDate	2026
record_format	arxiv
title	1%>100%: High-Efficiency Visual Adapter with Complex Linear Projection Optimization
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2602.10513
