Staff View: Improving Deep Learning Library Testing with Machine Learning :: Library Catalog

Bibliographic Details
Main Authors:	Molina, Facundo; Naziri, M M Abid; Qin, Feiran; Gorla, Alessandra; d'Amorim, Marcelo
Format:	Preprint
Published:	2026
Subjects:	Software Engineering
Online Access:	https://arxiv.org/abs/2602.03755

_version_	1866912873584787456
author	Molina, Facundo; Naziri, M M Abid; Qin, Feiran; Gorla, Alessandra; d'Amorim, Marcelo
author_facet	Molina, Facundo; Naziri, M M Abid; Qin, Feiran; Gorla, Alessandra; d'Amorim, Marcelo
contents	Deep Learning (DL) libraries like TensorFlow and PyTorch simplify machine learning (ML) model development but are prone to bugs due to their complex design. Bug-finding techniques exist, but without precise API specifications, they produce many false alarms. Existing methods to mine API specifications lack accuracy. We explore using ML classifiers to determine input validity. We hypothesize that tensor shapes are a precise abstraction to encode concrete inputs and capture relationships of the data. Shape abstraction sharply reduces problem dimensionality, which is important to facilitate ML training. Labeled data are obtained by observing runtime outcomes on a sample of inputs, and classifiers are trained on sets of labeled inputs to capture API constraints. Our evaluation, conducted over 183 APIs from TensorFlow and PyTorch, shows that the classifiers generalize well on unseen data with over 91% accuracy. Integrating these classifiers into the pipeline of ACETest, a state-of-the-art (SoTA) bug-finding technique, improves its pass rate from ~29% to ~61%. Our findings suggest that ML-enhanced input classification is an important aid to scale DL library testing.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_03755
institution	arXiv
publishDate	2026
record_format	arxiv
title	Improving Deep Learning Library Testing with Machine Learning
topic	Software Engineering
url	https://arxiv.org/abs/2602.03755
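
The workflow summarized in the contents field (sample inputs, label each by its runtime outcome, encode it by tensor shape, train a classifier) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `toy_api` is a hypothetical stand-in for a DL library API, the `encode` feature layout (raw dimensions plus pairwise equality flags) is one possible shape abstraction, and the decision stump is a deliberately simple substitute for whatever classifiers the authors actually use.

```python
import random

def toy_api(a_shape, b_shape):
    """Hypothetical stand-in for a DL library API: a 2-D matrix product
    that rejects operands whose inner dimensions differ."""
    if a_shape[1] != b_shape[0]:
        raise ValueError("inner dimensions do not match")
    return (a_shape[0], b_shape[1])

def encode(a_shape, b_shape):
    """Shape abstraction (assumed encoding): the raw dimensions plus
    pairwise equality flags, so relations between dimensions are visible."""
    dims = list(a_shape) + list(b_shape)
    eq = [int(dims[i] == dims[j])
          for i in range(len(dims)) for j in range(i + 1, len(dims))]
    return dims + eq

def label(a_shape, b_shape):
    """Label an input by its runtime outcome: 1 if the call succeeds, 0 if it raises."""
    try:
        toy_api(a_shape, b_shape)
        return 1
    except ValueError:
        return 0

def sample(n, rng):
    """Draw n random shape pairs and return (features, label) tuples."""
    data = []
    for _ in range(n):
        a = (rng.randint(1, 4), rng.randint(1, 4))
        b = (rng.randint(1, 4), rng.randint(1, 4))
        data.append((encode(a, b), label(a, b)))
    return data

def train_stump(data):
    """Fit a one-feature threshold classifier (decision stump) by
    exhaustive search for the best training accuracy."""
    best = None
    n_features = len(data[0][0])
    for f in range(n_features):
        for thr in sorted({x[f] for x, _ in data}):
            for polarity in (0, 1):
                correct = sum(
                    (int(x[f] >= thr) ^ polarity) == y for x, y in data)
                if best is None or correct > best[0]:
                    best = (correct, f, thr, polarity)
    _, f, thr, polarity = best
    return lambda x: int(x[f] >= thr) ^ polarity

rng = random.Random(0)
train, test = sample(400, rng), sample(200, rng)
predict = train_stump(train)
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setting the validity constraint is a single equality between two dimensions, so one relational feature is enough to separate valid from invalid inputs. Real APIs would presumably need richer features and stronger classifiers; the sketch only shows the shape of the labeling and training loop.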

Similar Items