Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Wang, Yangdi, Zhang, Zhi-Hai, Xu, Su Xiu, Guo, Wenming
Format:	Preprint
Published:	2024
Subjects:	Machine Learning Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.05082
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909282841133056
author	Wang, Yangdi Zhang, Zhi-Hai Xu, Su Xiu Guo, Wenming
author_facet	Wang, Yangdi Zhang, Zhi-Hai Xu, Su Xiu Guo, Wenming
contents	Overfitting commonly occurs when applying deep neural networks (DNNs) on small-scale datasets, where DNNs do not generalize well from existing data to unseen data. The main reason resulting in overfitting is that small-scale datasets cannot reflect the situations of the real world. Label smoothing (LS) is an effective regularization method to prevent overfitting, avoiding it by mixing one-hot labels with uniform label vectors. However, LS only focuses on labels while ignoring the distribution of existing data. In this paper, we introduce the distributionally robust optimization (DRO) to LS, achieving shift the existing data distribution flexibly to unseen domains when training DNNs. Specifically, we prove that the regularization of LS can be extended to a regularization term for the DNNs parameters when integrating DRO. The regularization term can be utilized to shift existing data to unseen domains and generate new data. Furthermore, we propose an approximate gradient-iteration label smoothing algorithm (GI-LS) to achieve the findings and train DNNs. We prove that the shift for the existing data does not influence the convergence of GI-LS. Since GI-LS incorporates a series of hyperparameters, we further consider using Bayesian optimization (BO) to find the relatively optimal combinations of these hyperparameters. Taking small-scale anomaly classification tasks as a case, we evaluate GI-LS, and the results clearly demonstrate its superior performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_05082
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Generalizing Few Data to Unseen Domains Flexibly Based on Label Smoothing Integrated with Distributionally Robust Optimization Wang, Yangdi Zhang, Zhi-Hai Xu, Su Xiu Guo, Wenming Machine Learning Artificial Intelligence Overfitting commonly occurs when applying deep neural networks (DNNs) on small-scale datasets, where DNNs do not generalize well from existing data to unseen data. The main reason resulting in overfitting is that small-scale datasets cannot reflect the situations of the real world. Label smoothing (LS) is an effective regularization method to prevent overfitting, avoiding it by mixing one-hot labels with uniform label vectors. However, LS only focuses on labels while ignoring the distribution of existing data. In this paper, we introduce the distributionally robust optimization (DRO) to LS, achieving shift the existing data distribution flexibly to unseen domains when training DNNs. Specifically, we prove that the regularization of LS can be extended to a regularization term for the DNNs parameters when integrating DRO. The regularization term can be utilized to shift existing data to unseen domains and generate new data. Furthermore, we propose an approximate gradient-iteration label smoothing algorithm (GI-LS) to achieve the findings and train DNNs. We prove that the shift for the existing data does not influence the convergence of GI-LS. Since GI-LS incorporates a series of hyperparameters, we further consider using Bayesian optimization (BO) to find the relatively optimal combinations of these hyperparameters. Taking small-scale anomaly classification tasks as a case, we evaluate GI-LS, and the results clearly demonstrate its superior performance.
title	Generalizing Few Data to Unseen Domains Flexibly Based on Label Smoothing Integrated with Distributionally Robust Optimization
topic	Machine Learning Artificial Intelligence
url	https://arxiv.org/abs/2408.05082

Similar Items