Staff View: Gradient-free Continual Learning :: Library Catalog

Bibliographic Details
Main Author:	Rypeść, Grzegorz
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2504.01219

_version_	1866912305418076160
author	Rypeść, Grzegorz
author_facet	Rypeść, Grzegorz
contents	Continual learning (CL) presents a fundamental challenge in training neural networks on sequential tasks without experiencing catastrophic forgetting. Traditionally, the dominant approach in CL has been gradient-based optimization, where updates to the network parameters are performed using stochastic gradient descent (SGD) or its variants. However, a major limitation arises when previous data is no longer accessible, as is often assumed in CL settings. In such cases, there is no gradient information available for past data, leading to uncontrolled parameter changes and consequently severe forgetting of previously learned tasks. By shifting focus from data availability to gradient availability, this work opens up new avenues for addressing forgetting in CL. We explore the hypothesis that gradient-free optimization methods can provide a robust alternative to conventional gradient-based continual learning approaches. We discuss the theoretical underpinnings of such methods, analyze their potential advantages and limitations, and present empirical evidence supporting their effectiveness. By reconsidering the fundamental cause of forgetting, this work aims to contribute a fresh perspective to the field of continual learning and inspire novel research directions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_01219
institution	arXiv
publishDate	2025
record_format	arxiv
title	Gradient-free Continual Learning
topic	Machine Learning
url	https://arxiv.org/abs/2504.01219

Similar Items