Staff View: Differentiable Zero-One Loss via Hypersimplex Projections :: Library Catalog

Bibliographic Details
Main Authors:	Gomez, Camilo; Wang, Pengyang; Tang, Liansheng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.23336

_version_	1866917297254301696
author	Gomez, Camilo; Wang, Pengyang; Tang, Liansheng
author_facet	Gomez, Camilo; Wang, Pengyang; Tang, Liansheng
contents	Recent advances in machine learning have emphasized the integration of structured optimization components into end-to-end differentiable models, enabling richer inductive biases and tighter alignment with task-specific objectives. In this work, we introduce a novel differentiable approximation to the zero-one loss, long considered the gold standard for classification performance yet incompatible with gradient-based optimization due to its non-differentiability. Our method constructs a smooth, order-preserving projection onto the (n, k)-dimensional hypersimplex through a constrained optimization framework, leading to a new operator we term Soft-Binary-Argmax. After deriving its mathematical properties, we show how its Jacobian can be efficiently computed and integrated into binary and multiclass learning systems. Empirically, our approach achieves significant improvements in generalization under large-batch training by imposing geometric consistency constraints on the output logits, thereby narrowing the performance gap traditionally observed in that regime.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_23336
institution	arXiv
publishDate	2026
record_format	arxiv
title	Differentiable Zero-One Loss via Hypersimplex Projections
topic	Machine Learning
url	https://arxiv.org/abs/2602.23336

Similar Items