Staff View: Stochastic Subnetwork Annealing: A Regularization Technique for Fine Tuning Pruned Subnetworks :: Library Catalog

Bibliographic Details
Main Authors:	Whitaker, Tim; Whitley, Darrell
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2401.08830

_version_	1866929212654354432
author	Whitaker, Tim; Whitley, Darrell
author_facet	Whitaker, Tim; Whitley, Darrell
contents	Pruning methods have recently grown in popularity as an effective way to reduce the size and computational complexity of deep neural networks. Large numbers of parameters can be removed from trained models with little discernible loss in accuracy after a small number of continued training epochs. However, pruning too many parameters at once often causes an initial steep drop in accuracy, which can undermine convergence quality. Iterative pruning approaches mitigate this by gradually removing a small number of parameters over multiple epochs. However, this can still lead to subnetworks that overfit local regions of the loss landscape. We introduce a novel and effective approach to tuning subnetworks through a regularization technique we call Stochastic Subnetwork Annealing. Instead of removing parameters in a discrete manner, we represent subnetworks with stochastic masks where each parameter has a probabilistic chance of being included or excluded on any given forward pass. We anneal these probabilities over time such that subnetwork structure slowly evolves as mask values become more deterministic, allowing for a smoother and more robust optimization of subnetworks at high levels of sparsity.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_08830
institution	arXiv
publishDate	2024
record_format	arxiv
title	Stochastic Subnetwork Annealing: A Regularization Technique for Fine Tuning Pruned Subnetworks
topic	Machine Learning
url	https://arxiv.org/abs/2401.08830
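
Illustrative note: the contents field above describes stochastic masks whose inclusion probabilities are annealed until the mask becomes deterministic. The following is a minimal sketch of that idea, not the authors' implementation. The linear schedule, the function names (anneal_probability, sample_mask, masked_weights), and the choice to always keep parameters of the target subnetwork are assumptions made for illustration; the record does not specify any of them.

```python
import random


def anneal_probability(step, total_steps, p_start=1.0, p_end=0.0):
    """Linearly interpolate from p_start to p_end (assumed schedule)."""
    if total_steps <= 1:
        return p_end
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return p_start + (p_end - p_start) * frac


def sample_mask(target_mask, prune_keep_prob, rng):
    """Sample a binary mask for one forward pass.

    Parameters in the target subnetwork (1) are always kept; parameters
    slated for removal (0) survive with probability prune_keep_prob.
    """
    return [1 if keep or rng.random() < prune_keep_prob else 0
            for keep in target_mask]


def masked_weights(weights, mask):
    """Zero out the weights excluded by the sampled mask."""
    return [w * m for w, m in zip(weights, mask)]


if __name__ == "__main__":
    rng = random.Random(0)           # seeded for repeatability
    weights = [0.5, -1.2, 0.3, 0.8]  # hypothetical parameter values
    target_mask = [1, 0, 1, 0]       # hypothetical final subnetwork
    total_steps = 5
    for step in range(total_steps):
        p = anneal_probability(step, total_steps)
        mask = sample_mask(target_mask, p, rng)
        print(step, round(p, 2), masked_weights(weights, mask))
```

At the first step every parameter is kept, and at the last step the sampled mask equals the target mask, so the subnetwork structure hardens gradually rather than in one discrete pruning event.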

Similar Items