Staff View: Optimization of the quantization of dense neural networks from an exact QUBO formulation :: Library Catalog

Bibliographic Details
Main Authors:	Subiñas, Sergio Muñiz; González, Manuel L.; Gómez, Jorge Ruiz; Ali, Alejandro Mata; Martín, Jorge Martínez; Hernando, Miguel Franco; García-Vico, Ángel Miguel
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.16075

_version_	1866914100816117760
author	Subiñas, Sergio Muñiz; González, Manuel L.; Gómez, Jorge Ruiz; Ali, Alejandro Mata; Martín, Jorge Martínez; Hernando, Miguel Franco; García-Vico, Ángel Miguel
author_facet	Subiñas, Sergio Muñiz; González, Manuel L.; Gómez, Jorge Ruiz; Ali, Alejandro Mata; Martín, Jorge Martínez; Hernando, Miguel Franco; García-Vico, Ángel Miguel
contents	This work introduces a post-training quantization (PTQ) method for dense neural networks based on a novel ADAROUND-based QUBO formulation. Taking the Frobenius distance between the theoretical output and the dequantized output (before the activation function) as the objective yields an explicit QUBO whose binary variables represent the rounding choice for each weight and bias. By exploiting the structure of the QUBO coefficient matrix, the global problem can be decomposed exactly into $n$ independent subproblems of size $f+1$, which can be solved efficiently with heuristics such as simulated annealing. The approach is evaluated on MNIST, Fashion-MNIST, EMNIST, and CIFAR-10 across integer precisions from int8 to int1 and compared with traditional round-to-nearest quantization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_16075
institution	arXiv
publishDate	2025
record_format	arxiv
title	Optimization of the quantization of dense neural networks from an exact QUBO formulation
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2510.16075
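
The rounding formulation summarized in the contents field can be illustrated with a small sketch. It is not the authors' implementation. It assumes that $n$ is the number of output neurons and $f$ the number of input features (so each subproblem has $f$ weights plus one bias), that a single uniform scale is used, and that clipping to the integer range can be ignored. The function names (neuron_qubo, simulated_annealing, quantize_layer) and the annealing schedule are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def neuron_qubo(X, w, b, scale):
    """Build the rounding QUBO for one output neuron (assumed setup).

    X: (N, f) calibration inputs, w: (f,) weights, b: scalar bias.
    Binary v[i] = 1 rounds parameter i up, v[i] = 0 rounds it down.
    Squared pre-activation error = v @ Q @ v + const for binary v.
    """
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    theta = np.append(w, b)
    floor_q = np.floor(theta / scale)
    d = theta - scale * floor_q                    # residual in [0, scale)
    G = Xa.T @ Xa                                  # Gram matrix, (f+1, f+1)
    Q = scale**2 * G
    # Linear terms go on the diagonal because v[i]**2 == v[i].
    Q[np.diag_indices_from(Q)] -= 2.0 * scale * (G @ d)
    const = float(d @ G @ d)
    return Q, floor_q, const


def energy(Q, v):
    return float(v @ Q @ v)


def round_to_nearest(w, b, scale):
    """Baseline: round up exactly when the fractional part is >= 0.5."""
    t = np.append(w, b) / scale
    return (t - np.floor(t) >= 0.5).astype(float)


def brute_force(Q):
    """Exact minimizer by enumeration; only feasible for tiny f."""
    n = Q.shape[0]
    cands = (np.array(bits, dtype=float)
             for bits in itertools.product([0, 1], repeat=n))
    best = min(cands, key=lambda v: energy(Q, v))
    return best, energy(Q, best)


def simulated_annealing(Q, v0, steps=2000, t0=1.0, t1=1e-3, seed=0):
    """Single-bit-flip annealing with a geometric temperature schedule.

    Starts from v0 and returns the best state seen, so the result is
    never worse than v0.
    """
    rng = np.random.default_rng(seed)
    v = v0.copy()
    e = energy(Q, v)
    best_v, best_e = v.copy(), e
    for k in range(steps):
        t = t0 * (t1 / t0) ** (k / max(steps - 1, 1))
        i = rng.integers(len(v))
        v[i] = 1.0 - v[i]                          # propose a flip
        e_new = energy(Q, v)
        if e_new <= e or rng.random() < np.exp((e - e_new) / t):
            e = e_new
            if e < best_e:
                best_v, best_e = v.copy(), e
        else:
            v[i] = 1.0 - v[i]                      # reject: undo the flip
    return best_v, best_e


def quantize_layer(X, W, b, scale):
    """Solve one independent subproblem per neuron; W is (n, f), b is (n,)."""
    W_hat, b_hat = np.empty_like(W), np.empty_like(b)
    for j in range(W.shape[0]):
        Q, floor_q, _ = neuron_qubo(X, W[j], b[j], scale)
        v, _ = simulated_annealing(Q, round_to_nearest(W[j], b[j], scale))
        theta_hat = scale * (floor_q + v)          # dequantized parameters
        W_hat[j], b_hat[j] = theta_hat[:-1], theta_hat[-1]
    return W_hat, b_hat
```

Under these assumptions, the squared Frobenius error of a layer is the sum of the per-neuron squared errors, which is why each neuron can be handled separately in quantize_layer. Seeding the annealer with the round-to-nearest assignment is a design choice of this sketch: it guarantees the returned rounding is at least as good as the baseline on the calibration inputs.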