Bibliographic Details
Main Authors: Zhu, Menghao Waiyan William, Kuruoğlu, Ercan Engin
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2405.16498
author Zhu, Menghao Waiyan William
Kuruoğlu, Ercan Engin
contents We formulate sequential maximum a posteriori inference as a recursion of loss functions and reduce the problem of continual learning to approximating the previous loss function. We then propose two coreset-free methods: autodiff quadratic consolidation, which uses an accurate and full quadratic approximation, and neural consolidation, which uses a neural network approximation. These methods are not scalable with respect to the neural network size, and we study them for classification tasks in combination with a fixed pre-trained feature extractor. We also introduce simple but challenging classical task sequences based on Iris and Wine datasets. We find that neural consolidation performs well in the classical task sequences, where the input dimension is small, while autodiff quadratic consolidation performs consistently well in image task sequences with a fixed pre-trained feature extractor, achieving comparable performance to joint maximum a posteriori training in many cases.
format Preprint
id arxiv_https___arxiv_org_abs_2405_16498
institution arXiv
publishDate 2024
record_format arxiv
title On Sequential Maximum a Posteriori Inference for Continual Learning
topic Machine Learning
url https://arxiv.org/abs/2405.16498