Salvato in:
| Autori principali: | , |
|---|---|
| Natura: | Preprint |
| Pubblicazione: |
2024
|
| Soggetti: | |
| Accesso online: | https://arxiv.org/abs/2409.03682 |
| Tags: |
Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
|
| _version_ | 1866909306811580416 |
|---|---|
| author | Chayti, El Mahdi Jaggi, Martin |
| author_facet | Chayti, El Mahdi Jaggi, Martin |
| contents | Learning new tasks by drawing on prior experience gathered from other (related) tasks is a core property of any intelligent system. Gradient-based meta-learning, especially MAML and its variants, has emerged as a viable solution to accomplish this goal. One problem MAML encounters is its computational and memory burdens needed to compute the meta-gradients. We propose a new first-order variant of MAML that we prove converges to a stationary point of the MAML objective, unlike other first-order variants. We also show that the MAML objective does not satisfy the smoothness assumption assumed in previous works; we show instead that its smoothness constant grows with the norm of the meta-gradient, which theoretically suggests the use of normalized or clipped-gradient methods compared to the plain gradient method used in previous works. We validate our theory on a synthetic experiment. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2409_03682 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | A New First-Order Meta-Learning Algorithm with Convergence Guarantees Chayti, El Mahdi Jaggi, Martin Machine Learning Optimization and Control Learning new tasks by drawing on prior experience gathered from other (related) tasks is a core property of any intelligent system. Gradient-based meta-learning, especially MAML and its variants, has emerged as a viable solution to accomplish this goal. One problem MAML encounters is its computational and memory burdens needed to compute the meta-gradients. We propose a new first-order variant of MAML that we prove converges to a stationary point of the MAML objective, unlike other first-order variants. We also show that the MAML objective does not satisfy the smoothness assumption assumed in previous works; we show instead that its smoothness constant grows with the norm of the meta-gradient, which theoretically suggests the use of normalized or clipped-gradient methods compared to the plain gradient method used in previous works. We validate our theory on a synthetic experiment. |
| title | A New First-Order Meta-Learning Algorithm with Convergence Guarantees |
| topic | Machine Learning Optimization and Control |
| url | https://arxiv.org/abs/2409.03682 |