Bibliographic Details
Main authors: Rebillard, Clea, Hurtado, Julio, Krutsylo, Andrii, Passaro, Lucia, Lomonaco, Vincenzo
Format: Preprint
Published: 2024
Subjects:
Online access: https://arxiv.org/abs/2407.08279
_version_ 1866916867549954048
author Rebillard, Clea
Hurtado, Julio
Krutsylo, Andrii
Passaro, Lucia
Lomonaco, Vincenzo
contents Learning continually from a stream of non-i.i.d. data is an open challenge in deep learning, even more so when working in resource-constrained environments such as embedded devices. Visual models that are continually updated through supervised learning are often prone to overfitting, catastrophic forgetting, and biased representations. On the other hand, large language models contain knowledge about multiple concepts and their relations, which can foster a more robust, informed, and coherent learning process. This work proposes Continual Visual Mapping (CVM), an approach that continually grounds vision representations in a knowledge space extracted from a fixed language model. Specifically, CVM continually trains a small and efficient visual model to map its representations into a conceptual space established by a fixed large language model. Because the visual model is small, CVM can be used when directly adapting large pre-trained visual models is infeasible due to computational or data constraints. CVM outperforms state-of-the-art continual learning methods on five benchmarks and offers a promising avenue for improving generalization in continual learning, even on computationally constrained devices.
format Preprint
id arxiv_https___arxiv_org_abs_2407_08279
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Continually Learn to Map Visual Concepts to Large Language Models in Resource-constrained Environments
Artificial Intelligence
title Continually Learn to Map Visual Concepts to Large Language Models in Resource-constrained Environments
topic Artificial Intelligence
url https://arxiv.org/abs/2407.08279
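
The mapping idea summarized in the abstract (a small trainable visual model whose outputs are aligned to a frozen concept space) can be illustrated with a toy sketch. This is not the authors' implementation: the concept vectors below are hand-written stand-ins for language-model embeddings, and the linear mapper, the squared-error objective, the 4-dimensional "visual features", and all names (`CONCEPTS`, `LinearMapper`, `make_stream`, `train_on_stream`) are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical frozen concept space: toy 3-d vectors standing in for
# class-name embeddings from a fixed language model (not real LLM outputs).
CONCEPTS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class LinearMapper:
    """Small trainable map from visual features into the frozen concept space."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]

    def train_step(self, x, label, lr=0.1):
        # One SGD step on the squared error between the mapped feature and
        # the (never updated) concept vector of the label.
        target = CONCEPTS[label]
        out = self.forward(x)
        for j, row in enumerate(self.w):
            err = out[j] - target[j]
            for i in range(len(row)):
                row[i] -= lr * err * x[i]

    def predict(self, x):
        # Classify by the nearest concept under cosine similarity.
        z = self.forward(x)
        return max(CONCEPTS, key=lambda name: cosine(z, CONCEPTS[name]))


def make_stream():
    # Toy non-i.i.d. stream: one class per task, presented sequentially.
    return [
        [([1.0, 0.0, 0.0, 0.2], "cat")],
        [([0.0, 1.0, 0.0, 0.2], "dog")],
        [([0.0, 0.0, 1.0, 0.2], "car")],
    ]


def train_on_stream(stream, steps_per_task=50):
    mapper = LinearMapper(in_dim=4, out_dim=3)
    for task in stream:
        for _ in range(steps_per_task):
            for features, label in task:
                mapper.train_step(features, label)
    return mapper


if __name__ == "__main__":
    mapper = train_on_stream(make_stream())
    for task in make_stream():
        for features, label in task:
            print(label, "->", mapper.predict(features))
```

Only the mapper's weights change during training; the concept vectors stay fixed, which mirrors the role the abstract assigns to the frozen language model. The sketch includes no mechanism against forgetting and makes no claim about the benchmarks or results reported in the preprint.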