Staff View: Targeted Deep Learning System Boundary Testing :: Library Catalog

Bibliographic Details
Main Authors:	Weißl, Oliver; Abdellatif, Amr; Chen, Xingcheng; Merabishvili, Giorgi; Riccio, Vincenzo; Kacianka, Severin; Stocco, Andrea
Format:	Preprint
Published:	2024
Subjects:	Software Engineering; Machine Learning
Online Access:	https://arxiv.org/abs/2408.06258

_version_	1866912837803180032
author	Weißl, Oliver; Abdellatif, Amr; Chen, Xingcheng; Merabishvili, Giorgi; Riccio, Vincenzo; Kacianka, Severin; Stocco, Andrea
author_facet	Weißl, Oliver; Abdellatif, Amr; Chen, Xingcheng; Merabishvili, Giorgi; Riccio, Vincenzo; Kacianka, Severin; Stocco, Andrea
contents	Evaluating the behavioral boundaries of deep learning (DL) systems is crucial for understanding their reliability across diverse, unseen inputs. Existing solutions fall short as they rely on untargeted random, model- or latent-based perturbations, due to difficulties in generating controlled input variations. In this work, we introduce Mimicry, a novel black-box test generator for fine-grained, targeted exploration of DL system boundaries. Mimicry performs boundary testing by leveraging the probabilistic nature of DL outputs to identify promising directions for exploration. It uses style-based GANs to disentangle input representations into content and style components, enabling controlled feature mixing to approximate the decision boundary. We evaluated Mimicry's effectiveness in generating boundary inputs for five widely used DL image classification systems of increasing complexity, comparing it to two baseline approaches. Our results show that Mimicry consistently identifies inputs closer to the decision boundary. It generates semantically meaningful boundary test cases that reveal new functional (mis)behaviors, while the baselines produce mainly corrupted or invalid inputs. Thanks to its enhanced control over latent space manipulations, Mimicry remains effective as dataset complexity increases, maintaining competitive diversity and higher validity rates, confirmed by human assessors.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_06258
institution	arXiv
publishDate	2024
record_format	arxiv
title	Targeted Deep Learning System Boundary Testing
topic	Software Engineering; Machine Learning
url	https://arxiv.org/abs/2408.06258
