Bibliographic Details
Main Authors: Chayti, El Mahdi, Jaggi, Martin
Format: Preprint
Published: 2024
Subjects: Machine Learning, Optimization and Control
Online Access: https://arxiv.org/abs/2409.03682
_version_ 1866909306811580416
author Chayti, El Mahdi
Jaggi, Martin
author_facet Chayti, El Mahdi
Jaggi, Martin
contents Learning new tasks by drawing on prior experience gathered from other (related) tasks is a core property of any intelligent system. Gradient-based meta-learning, especially MAML and its variants, has emerged as a viable solution to accomplish this goal. One problem with MAML is the computational and memory burden of computing the meta-gradients. We propose a new first-order variant of MAML that we prove converges to a stationary point of the MAML objective, unlike other first-order variants. We also show that the MAML objective does not satisfy the smoothness assumption made in previous works; we show instead that its smoothness constant grows with the norm of the meta-gradient, which theoretically suggests using normalized or clipped-gradient methods rather than the plain gradient method used in previous works. We validate our theory on a synthetic experiment.
format Preprint
id arxiv_https___arxiv_org_abs_2409_03682
institution arXiv
publishDate 2024
record_format arxiv
title A New First-Order Meta-Learning Algorithm with Convergence Guarantees
topic Machine Learning
Optimization and Control
url https://arxiv.org/abs/2409.03682
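
The abstract's remark that a smoothness constant growing with the gradient norm favors normalized or clipped-gradient methods can be illustrated with a small sketch. This is not the algorithm proposed in the preprint: the function names, the quartic toy objective, the step size, and the clipping threshold are all assumptions chosen for illustration. The toy objective is picked only because its curvature grows as its gradient grows.

```python
import math


def clipped_gradient_descent(grad, x0, step_size=0.1, clip=1.0, num_steps=200):
    """Gradient descent where each gradient is rescaled to have norm at most `clip`.

    Hypothetical helper for illustration; not taken from the preprint.
    """
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        # Leave small gradients untouched; shrink large ones onto the ball of radius `clip`.
        scale = 1.0 if norm <= clip else clip / norm
        x = [xi - step_size * scale * gi for xi, gi in zip(x, g)]
    return x


def quartic_grad(x):
    """Gradient of the toy objective f(x) = sum(x_i**4) / 4 (an assumed example)."""
    return [xi ** 3 for xi in x]


# Start far from the minimizer at the origin, where the gradient is large.
x_final = clipped_gradient_descent(quartic_grad, [10.0, -10.0])
```

With these assumed settings, a single unclipped step from 10.0 would jump to 10.0 - 0.1 * 1000.0 = -90.0 and overshoot the minimizer, whereas the clipped update moves each coordinate by a bounded amount per step and approaches the origin steadily.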