Staff View: How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning :: Library Catalog

Bibliographic Details
Main Authors:	Ting, Kai Ming; Xu, Wei-Jie; Zhang, Hang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.05749

_version_	1866918324379582464
author	Ting, Kai Ming; Xu, Wei-Jie; Zhang, Hang
author_facet	Ting, Kai Ming; Xu, Wei-Jie; Zhang, Hang
contents	Deep clustering (DC) is often claimed to have a key advantage over $k$-means clustering. Yet, this advantage is often demonstrated using image datasets only, and it is unclear whether it addresses the fundamental limitations of $k$-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and performs clustering based on a $k$-means-like procedure, while the optimization is conducted in an end-to-end manner. This paper investigates whether the deep-learned representation has enabled DEC to overcome the known fundamental limitations of $k$-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes and densities. Our investigations on DEC have wider implications for deep clustering methods in general. Notably, none of these methods exploit the underlying data distribution. We uncover that a non-deep-learning approach achieves the intended aim of deep clustering by making use of distributional information of clusters in a dataset to effectively address these fundamental limitations.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_05749
institution	arXiv
publishDate	2026
record_format	arxiv
title	How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning
topic	Machine Learning
url	https://arxiv.org/abs/2602.05749
