Bibliographic Details
Main Authors: Kang, Liwei; Lee, Wee Sun
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2404.02754
_version_ 1866910397361029120
author Kang, Liwei
Lee, Wee Sun
contents Continual learning, an important aspect of artificial intelligence and machine learning research, focuses on developing models that learn and adapt to new tasks while retaining previously acquired knowledge. Existing continual learning algorithms are usually evaluated on a small number of tasks with uniform sizes, a setting that may not accurately represent real-world learning scenarios. In this paper, we investigate the performance of continual learning algorithms on a large number of tasks drawn from a task distribution that is long-tail in terms of task sizes. We design one synthetic dataset and two real-world continual learning datasets to evaluate the performance of existing algorithms in such a setting. Moreover, we study an overlooked factor in continual learning, the optimizer states (e.g., the first and second moments in the Adam optimizer), and investigate how they can be used to improve continual learning performance. We propose a method that reuses the optimizer states in Adam by maintaining a weighted average of the second moments from previous tasks. We demonstrate that our method, which is compatible with most existing continual learning algorithms, effectively reduces forgetting at only a small additional computational or memory cost, and provides further improvements on existing continual learning algorithms, particularly on long-tail task sequences.
format Preprint
id arxiv_https___arxiv_org_abs_2404_02754
institution arXiv
publishDate 2024
record_format arxiv
title Continual Learning of Numerous Tasks from Long-tail Distributions
topic Machine Learning
url https://arxiv.org/abs/2404.02754
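The abstract in this record mentions maintaining a weighted average of Adam's second moments from previous tasks. The following is a minimal sketch of that bookkeeping only, not the authors' implementation. The class name `SecondMomentAverager`, the use of a per-task `weight` (for example, the number of training steps in a task), and the flat-list representation of the second moments are all assumptions made for illustration; the record does not say how the weights are chosen or how the averaged moments are fed back into Adam.

```python
class SecondMomentAverager:
    """Running weighted average of per-parameter second moments.

    ASSUMPTION: this class and its interface are hypothetical. The record
    only states that a weighted average of second moments from previous
    tasks is maintained; the weighting scheme here is illustrative.
    """

    def __init__(self, n_params: int) -> None:
        self.avg_v = [0.0] * n_params  # averaged second moments so far
        self.total_weight = 0.0        # sum of weights of finished tasks

    def update(self, v: list[float], weight: float) -> None:
        """Fold one finished task's second moments into the average.

        v      -- Adam second-moment estimates at the end of the task
        weight -- hypothetical task weight (e.g. its number of steps)
        """
        if weight <= 0:
            raise ValueError("weight must be positive")
        if len(v) != len(self.avg_v):
            raise ValueError("second-moment vector has the wrong length")
        new_total = self.total_weight + weight
        # Incremental form of sum(w_k * v_k) / sum(w_k).
        self.avg_v = [
            (self.total_weight * a + weight * x) / new_total
            for a, x in zip(self.avg_v, v)
        ]
        self.total_weight = new_total


if __name__ == "__main__":
    averager = SecondMomentAverager(n_params=2)
    averager.update([1.0, 2.0], weight=1.0)  # small task
    averager.update([3.0, 6.0], weight=3.0)  # larger task dominates
    print(averager.avg_v)
```

Storing only the running average and the total weight keeps the extra memory at one value per parameter, which is in line with the small additional cost the abstract claims, although the actual overhead of the published method may differ.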