Staff View: QLABGrad: a Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning :: Library Catalog

Bibliographic Details
Main Authors:	Fu, Minghan; Wu, Fang-Xiang
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2302.00252

_version_	1866914709791309824
author	Fu, Minghan; Wu, Fang-Xiang
author_facet	Fu, Minghan; Wu, Fang-Xiang
contents	The learning rate is a critical hyperparameter for deep learning tasks because it determines how far the model parameters are updated at each step of training. However, the choice of learning rate typically depends on empirical judgment, which may not yield satisfactory results without intensive trial-and-error experiments. In this study, we propose a novel learning rate adaptation scheme called QLABGrad. Without any user-specified hyperparameter, QLABGrad automatically determines the learning rate by optimizing the Quadratic Loss Approximation-Based (QLAB) function for a given gradient descent direction, which requires only one extra forward propagation. We theoretically prove the convergence of QLABGrad under a smooth Lipschitz condition on the loss function. Experimental results on multiple architectures, including MLP, CNN, and ResNet, on the MNIST, CIFAR10, and ImageNet datasets demonstrate that QLABGrad outperforms various competing schemes for deep learning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2302_00252
institution	arXiv
publishDate	2023
record_format	arxiv
title	QLABGrad: a Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning
topic	Machine Learning; Optimization and Control
url	https://arxiv.org/abs/2302.00252
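
Illustrative note on the abstract: the record does not give the QLAB formula, so the following is only a minimal sketch of the general idea of choosing a step size from a quadratic approximation of the loss along the negative gradient, using one extra loss evaluation. It is a generic quadratic-interpolation step, not the authors' algorithm. The names quadratic_step_size and descend, and the trial_step parameter, are hypothetical and introduced here for illustration only.

```python
def quadratic_step_size(loss0, loss_trial, grad_sq_norm, trial_step):
    """Step size minimizing q(a) = loss0 - a * grad_sq_norm + c * a**2.

    The quadratic matches the loss and its slope at a = 0, and the loss
    observed at a = trial_step (the one extra forward evaluation).
    """
    curvature = (loss_trial - loss0 + trial_step * grad_sq_norm) / trial_step**2
    if curvature <= 0.0:
        # The fitted quadratic has no minimum along this direction;
        # fall back to the trial step (an assumption of this sketch).
        return trial_step
    return grad_sq_norm / (2.0 * curvature)


def descend(loss_fn, grad_fn, params, trial_step=0.1, steps=50):
    """Gradient descent whose step size is re-estimated at every iteration."""
    for _ in range(steps):
        grad = grad_fn(params)
        grad_sq_norm = sum(g * g for g in grad)
        if grad_sq_norm == 0.0:
            break  # stationary point, nothing to update
        loss0 = loss_fn(params)
        trial = [p - trial_step * g for p, g in zip(params, grad)]
        loss_trial = loss_fn(trial)  # the single extra forward evaluation
        lr = quadratic_step_size(loss0, loss_trial, grad_sq_norm, trial_step)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


if __name__ == "__main__":
    # Toy quadratic loss, chosen only to exercise the sketch.
    loss = lambda p: p[0] ** 2 + 10.0 * p[1] ** 2
    grad = lambda p: [2.0 * p[0], 20.0 * p[1]]
    print(descend(loss, grad, [1.0, 1.0]))
```

For a loss that is exactly quadratic along the search direction, the fitted model is exact and this step coincides with an exact line search; for general losses it is only an approximation.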

Similar Items