Saved in:
Bibliographic Details
Main Authors: Hong, Jiazhen, Qian, Wei, Chen, Yudong, Zhang, Yuqian
Format: Preprint
Published: 2022
Subjects:
Online Access:https://arxiv.org/abs/2201.04822
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866914109171171328
author Hong, Jiazhen
Qian, Wei
Chen, Yudong
Zhang, Yuqian
author_facet Hong, Jiazhen
Qian, Wei
Chen, Yudong
Zhang, Yuqian
contents \kmeans clustering is a fundamental problem in many scientific and engineering domains. The optimization problem associated with \kmeans clustering is nonconvex, for which standard algorithms are only guaranteed to find a local optimum. Leveraging the hidden structure of local solutions, we propose a general algorithmic framework for escaping undesirable local solutions and recovering the global solution or the ground truth clustering. This framework consists of iteratively alternating between two steps: (i) detect mis-specified clusters in a local solution, and (ii) improve the local solution by non-local operations. We discuss specific implementation of these steps, and elucidate how the proposed framework unifies many existing variants of \kmeans algorithms through a geometric perspective. We also present two natural variants of the proposed framework, where the initial number of clusters may be over- or under-specified. We provide theoretical justifications and extensive experiments to demonstrate the efficacy of the proposed approach.
format Preprint
id arxiv_https___arxiv_org_abs_2201_04822
institution arXiv
publishDate 2022
record_format arxiv
spellingShingle A Geometric Approach to $k$-means
Hong, Jiazhen
Qian, Wei
Chen, Yudong
Zhang, Yuqian
Machine Learning
\kmeans clustering is a fundamental problem in many scientific and engineering domains. The optimization problem associated with \kmeans clustering is nonconvex, for which standard algorithms are only guaranteed to find a local optimum. Leveraging the hidden structure of local solutions, we propose a general algorithmic framework for escaping undesirable local solutions and recovering the global solution or the ground truth clustering. This framework consists of iteratively alternating between two steps: (i) detect mis-specified clusters in a local solution, and (ii) improve the local solution by non-local operations. We discuss specific implementation of these steps, and elucidate how the proposed framework unifies many existing variants of \kmeans algorithms through a geometric perspective. We also present two natural variants of the proposed framework, where the initial number of clusters may be over- or under-specified. We provide theoretical justifications and extensive experiments to demonstrate the efficacy of the proposed approach.
title A Geometric Approach to $k$-means
topic Machine Learning
url https://arxiv.org/abs/2201.04822