Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Sun, Qiang, Zhang, Archer Gong, Liu, Chenyu, Tan, Kean Ming
Format:	Preprint
Published:	2019
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/1906.09581
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909343177244672
author	Sun, Qiang Zhang, Archer Gong Liu, Chenyu Tan, Kean Ming
author_facet	Sun, Qiang Zhang, Archer Gong Liu, Chenyu Tan, Kean Ming
contents	Convex clustering is a convex relaxation of the $k$-means and hierarchical clustering. It involves solving a convex optimization problem with the objective function being a squared error loss plus a fusion penalty that encourages the estimated centroids for observations in the same cluster to be identical. However, when data are contaminated, convex clustering with a squared error loss fails even when there is only one arbitrary outlier. To address this challenge, we propose a resistant convex clustering method. Theoretically, we show that the new estimator is resistant to arbitrary outliers: it does not break down until more than half of the observations are arbitrary outliers. Perhaps surprisingly, the fusion penalty can help enhance resistance by fusing the estimators to the cluster centers of uncontaminated samples, but not the other way around. Numerical studies demonstrate the competitive performance of the proposed method.
format	Preprint
id	arxiv_https___arxiv_org_abs_1906_09581
institution	arXiv
publishDate	2019
record_format	arxiv
spellingShingle	Resistant convex clustering: How does the fusion penalty enhance resistantance? Sun, Qiang Zhang, Archer Gong Liu, Chenyu Tan, Kean Ming Methodology Convex clustering is a convex relaxation of the $k$-means and hierarchical clustering. It involves solving a convex optimization problem with the objective function being a squared error loss plus a fusion penalty that encourages the estimated centroids for observations in the same cluster to be identical. However, when data are contaminated, convex clustering with a squared error loss fails even when there is only one arbitrary outlier. To address this challenge, we propose a resistant convex clustering method. Theoretically, we show that the new estimator is resistant to arbitrary outliers: it does not break down until more than half of the observations are arbitrary outliers. Perhaps surprisingly, the fusion penalty can help enhance resistance by fusing the estimators to the cluster centers of uncontaminated samples, but not the other way around. Numerical studies demonstrate the competitive performance of the proposed method.
title	Resistant convex clustering: How does the fusion penalty enhance resistantance?
topic	Methodology
url	https://arxiv.org/abs/1906.09581

Similar Items