Bibliographic Details
Main Authors: Naeini, Mohammadreza Tavasoli; Bereyhi, Ali; Noshad, Morteza; Liang, Ben; Hero III, Alfred O.
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2501.07754
author Naeini, Mohammadreza Tavasoli
Bereyhi, Ali
Noshad, Morteza
Liang, Ben
Hero III, Alfred O.
contents This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model. Using this practical interpretation, we introduce the Bayes optimal learning threshold (BOLT) loss, whose minimization drives a classification model toward the Bayes error rate. We validate the proposed loss on image and text classification tasks using the MNIST, Fashion-MNIST, CIFAR-10, and IMDb datasets. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy, particularly on challenging datasets. This highlights the potential of BOLT in improving generalization.
format Preprint
id arxiv_https___arxiv_org_abs_2501_07754
institution arXiv
publishDate 2025
record_format arxiv
title Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy
topic Machine Learning
Computer Vision and Pattern Recognition
Information Theory
Image and Video Processing
Signal Processing
I.2.6; I.5.4
url https://arxiv.org/abs/2501.07754