Staff View: Attribution-Based Neuron Utility for Plasticity Restoration in Deep Networks :: Library Catalog

Bibliographic Details
Main Authors:	Elisii, Patrick; Beauchemin, Lucas; Jamshed, Dawer
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.06834

_version_	1866911659013963776
author	Elisii, Patrick; Beauchemin, Lucas; Jamshed, Dawer
author_facet	Elisii, Patrick; Beauchemin, Lucas; Jamshed, Dawer
contents	Continual learning research attempts to conserve two fundamental capabilities: new knowledge acquisition and the preservation of previously acquired knowledge. While knowledge in this case can be measured through performance over an implicit or explicit task space, model plasticity generally concerns adaptability as data distributions evolve. Though much of the literature has focused on catastrophic forgetting, deep networks can also suffer from loss of plasticity, becoming progressively harder to update under continued training. Recent research has identified multiple mechanisms underlying this phenomenon, including neuron saturation, parameter norm growth, and loss of useful curvature directions. Adaptive reset-based interventions, which selectively reinitialize low-utility network parameters, have emerged as practical solutions to restore trainability. Existing utility measures used to guide resets, such as activation magnitude, contribution utility, or gradient-based activity, rely on proxy signals that can become misaligned with the intervention they are meant to guide. In this paper, we introduce gradient times difference from reference (GXD), a theoretically motivated utility measure based on reference-based gradient attribution that estimates the first-order functional cost of replacing a unit. Our results show that utility measures aligned with the functional cost of the reset can make interventions more reliable in settings where existing reset criteria degrade. GXD reframes adaptive resetting as an intervention cost estimation problem, providing a practical path toward more robust continual learning systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_06834
institution	arXiv
publishDate	2026
record_format	arxiv
title	Attribution-Based Neuron Utility for Plasticity Restoration in Deep Networks
topic	Machine Learning
url	https://arxiv.org/abs/2605.06834
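
The contents field describes GXD as "gradient times difference from reference", a first-order estimate of the functional cost of replacing a unit. This record does not give the paper's exact definition, so the following is only a minimal sketch of that general idea under stated assumptions: a linear readout with squared-error loss, a per-unit reference activation standing in for a freshly reset unit, and batch averaging of absolute values. The names `gxd_utility`, `lowest_utility_units`, and `reference` are hypothetical and not taken from the paper.

```python
def gxd_utility(hidden, out_weights, targets, reference):
    """Per-unit |dL/dh_j * (h_j - r_j)|, averaged over a batch.

    Assumptions (not from the paper): output y = sum_j w_j * h_j,
    loss L = 0.5 * (y - t)^2, and mean absolute value as the aggregate.
    """
    n_units = len(out_weights)
    totals = [0.0] * n_units
    for h, t in zip(hidden, targets):
        y = sum(w * a for w, a in zip(out_weights, h))  # linear readout
        dL_dy = y - t                                   # derivative of 0.5*(y-t)^2
        for j in range(n_units):
            grad_j = dL_dy * out_weights[j]             # dL/dh_j by the chain rule
            # First-order Taylor estimate of the loss change if h_j -> r_j.
            totals[j] += abs(grad_j * (h[j] - reference[j]))
    return [s / len(hidden) for s in totals]


def lowest_utility_units(utilities, k):
    """Indices of the k units with the smallest utility (reset candidates)."""
    return sorted(range(len(utilities)), key=lambda j: utilities[j])[:k]


# Toy batch: 2 samples, 3 hidden units; reference activation assumed to be 0.
hidden = [[1.0, 0.0, 2.0], [0.5, 0.0, 1.0]]
out_weights = [1.0, 3.0, -0.5]
targets = [1.0, 0.0]
reference = [0.0, 0.0, 0.0]

utilities = gxd_utility(hidden, out_weights, targets, reference)
print(utilities)                           # [0.5, 0.0, 0.5]
print(lowest_utility_units(utilities, 1))  # [1]
```

In this toy case unit 1 has the largest outgoing weight but already sits at its reference activation, so the estimated cost of replacing it is zero. A criterion based on weight or gradient magnitude alone would rank it differently, which is the kind of misalignment the abstract attributes to proxy signals.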
