Saved in:
| Main Authors: | , , , |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.12473 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866916527123464192 |
|---|---|
| author | Jin, Long Nong, Han Chen, Liangming Su, Zhenming |
| author_facet | Jin, Long Nong, Han Chen, Liangming Su, Zhenming |
| contents | The insufficient generalization of adaptive moment estimation (Adam) has hindered its broader application. Recent studies have shown that flat minima in loss landscapes are highly associated with improved generalization. Inspired by the filtering effect of integration operations on high-frequency signals, we propose multiple integral Adam (MIAdam), a novel optimizer that integrates a multiple integral term into Adam. This multiple integral term effectively filters out sharp minima encountered during optimization, guiding the optimizer towards flatter regions and thereby enhancing generalization capability. We provide a theoretical explanation for the improvement in generalization through the diffusion theory framework and analyze the impact of the multiple integral term on the optimizer's convergence. Experimental results demonstrate that MIAdam not only enhances generalization and robustness against label noise but also maintains the rapid convergence characteristic of Adam, outperforming Adam and its variants in state-of-the-art benchmarks. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2412_12473 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | A Method for Enhancing Generalization of Adam by Multiple Integrations Jin, Long Nong, Han Chen, Liangming Su, Zhenming Machine Learning The insufficient generalization of adaptive moment estimation (Adam) has hindered its broader application. Recent studies have shown that flat minima in loss landscapes are highly associated with improved generalization. Inspired by the filtering effect of integration operations on high-frequency signals, we propose multiple integral Adam (MIAdam), a novel optimizer that integrates a multiple integral term into Adam. This multiple integral term effectively filters out sharp minima encountered during optimization, guiding the optimizer towards flatter regions and thereby enhancing generalization capability. We provide a theoretical explanation for the improvement in generalization through the diffusion theory framework and analyze the impact of the multiple integral term on the optimizer's convergence. Experimental results demonstrate that MIAdam not only enhances generalization and robustness against label noise but also maintains the rapid convergence characteristic of Adam, outperforming Adam and its variants in state-of-the-art benchmarks. |
| title | A Method for Enhancing Generalization of Adam by Multiple Integrations |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2412.12473 |