MARC21: :: Library Catalog

Salvato in:

Dettagli Bibliografici
Autori principali:	Zhang, Zhenyu, Chen, Guangyao, Zou, Yixiong, Li, Yuhua, Li, Ruixuan
Natura:	Preprint
Pubblicazione:	2026
Soggetti:	Artificial Intelligence
Accesso online:	https://arxiv.org/abs/2603.05235
Tags:	Aggiungi Tag Nessun Tag, puoi essere il primo ad aggiungerne!!

_version_	1866910042465239040
author	Zhang, Zhenyu Chen, Guangyao Zou, Yixiong Li, Yuhua Li, Ruixuan
author_facet	Zhang, Zhenyu Chen, Guangyao Zou, Yixiong Li, Yuhua Li, Ruixuan
contents	Source-Free Cross-Domain Few-Shot Learning (SF-CDFSL) focuses on fine-tuning with limited training data from target domains (e.g., medical or satellite images), where CLIP has recently shown promising results due to its generalizability to downstream tasks. Current works indicate CLIP's text encoder is more suitable for cross-domain tasks, however, we find that \textbf{removing certain middle layers of the text encoder can effectively improve performance in SF-CDFSL}, which we call the Lost Layers. In this paper, we delve into this phenomenon for a deeper understanding. We discover that instead of being harmful for the SF-CDFSL task, the information in these layers is actually beneficial, but visual gaps prevent this useful information from being fully utilized, making these layers seem redundant. Based on this understanding, unlike current works that simply remove these layers, we propose a method to teachs the model to \textbf{re-utilize} information in these lost layers at both the layer and encoder levels, guiding the re-learning of the visual branch under domain shifts. Our approach effectively addresses the issue of underutilized information in the text encoder. Extensive experiments across various settings, backbones (CLIP, SigLip, PE-Core), and tasks (4 CDFSL datasets and 10 Meta-dataset datasets) demonstrate the effectiveness of our method. Code is available at https://github.com/zhenyuZ-HUST/CVPR26-VtT.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_05235
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Reclaiming Lost Text Layers for Source-Free Cross-Domain Few-Shot Learning Zhang, Zhenyu Chen, Guangyao Zou, Yixiong Li, Yuhua Li, Ruixuan Artificial Intelligence Source-Free Cross-Domain Few-Shot Learning (SF-CDFSL) focuses on fine-tuning with limited training data from target domains (e.g., medical or satellite images), where CLIP has recently shown promising results due to its generalizability to downstream tasks. Current works indicate CLIP's text encoder is more suitable for cross-domain tasks, however, we find that \textbf{removing certain middle layers of the text encoder can effectively improve performance in SF-CDFSL}, which we call the Lost Layers. In this paper, we delve into this phenomenon for a deeper understanding. We discover that instead of being harmful for the SF-CDFSL task, the information in these layers is actually beneficial, but visual gaps prevent this useful information from being fully utilized, making these layers seem redundant. Based on this understanding, unlike current works that simply remove these layers, we propose a method to teachs the model to \textbf{re-utilize} information in these lost layers at both the layer and encoder levels, guiding the re-learning of the visual branch under domain shifts. Our approach effectively addresses the issue of underutilized information in the text encoder. Extensive experiments across various settings, backbones (CLIP, SigLip, PE-Core), and tasks (4 CDFSL datasets and 10 Meta-dataset datasets) demonstrate the effectiveness of our method. Code is available at https://github.com/zhenyuZ-HUST/CVPR26-VtT.
title	Reclaiming Lost Text Layers for Source-Free Cross-Domain Few-Shot Learning
topic	Artificial Intelligence
url	https://arxiv.org/abs/2603.05235

Documenti analoghi