Bibliographic Details
Main Authors: Zhao, Peng; Shan, Jia-Wei; Zhang, Yu-Jie; Zhou, Zhi-Hua
Format: Preprint
Published: 2020
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2002.01605
author Zhao, Peng
Shan, Jia-Wei
Zhang, Yu-Jie
Zhou, Zhi-Hua
contents In conventional supervised learning, a training dataset is given with ground-truth labels from a known label set, and the learned model classifies unseen instances into the known labels. This paper studies a new problem setting in which the training data contain unknown classes that are misperceived as other labels, so their existence is not apparent from the given supervision. We attribute these unknown unknowns to the fact that the training dataset is badly advised by an incompletely perceived label space, which in turn results from insufficient feature information. To this end, we propose exploratory machine learning, which examines and investigates the training data by actively augmenting the feature space to discover potentially hidden classes. Our method consists of three ingredients: a rejection model, feature exploration, and a model cascade. We provide theoretical analysis to justify its superiority and validate its effectiveness on both synthetic and real datasets.
format Preprint
id arxiv_https___arxiv_org_abs_2002_01605
institution arXiv
publishDate 2020
record_format arxiv
title Exploratory Machine Learning with Unknown Unknowns
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2002.01605