Bibliographic Details
Main Authors: Nagata, Kotaro, Ono, Hiromu, Hotta, Kazuhiro
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2409.11329
author Nagata, Kotaro
Ono, Hiromu
Hotta, Kazuhiro
contents In continual learning, catastrophic forgetting is a serious problem: a model forgets previously acquired knowledge when it learns new tasks. Various methods have been proposed to address it. Replay methods, which replay data from previous tasks during later training, have shown good accuracy. However, because the memory buffer is limited, replay methods generalize poorly. In this paper, we address this problem by acquiring transferable knowledge through self-distillation, using the highly generalizable output of a shallow layer as the teacher. Furthermore, when dealing with a large number of classes or challenging data, there is a risk that learning does not converge and never reaches the point of overfitting. We therefore aim for more efficient and thorough learning by prioritizing the storage of easily misclassified samples through a new memory update method. Experiments on the CIFAR10, CIFAR100, and MiniImageNet datasets confirm that the proposed method outperforms conventional methods.
format Preprint
id arxiv_https___arxiv_org_abs_2409_11329
institution arXiv
publishDate 2024
record_format arxiv
title Reducing Catastrophic Forgetting in Online Class Incremental Learning Using Self-Distillation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2409.11329