Bibliographic Details
Main Authors: Ting, Kai Ming, Xu, Wei-Jie, Zhang, Hang
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2602.05749
author Ting, Kai Ming
Xu, Wei-Jie
Zhang, Hang
contents Deep clustering (DC) is often claimed to have a key advantage over $k$-means clustering. Yet this advantage is usually demonstrated on image datasets only, and it is unclear whether it addresses the fundamental limitations of $k$-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and clusters with a $k$-means-like procedure, with the whole pipeline optimized end to end. This paper investigates whether the deep-learned representation enables DEC to overcome the known fundamental limitations of $k$-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes, and varied densities. Our investigation of DEC has wider implications for deep clustering methods in general. Notably, none of these methods exploits the underlying data distribution. We find that a non-deep-learning approach achieves the intended aim of deep clustering by using the distributional information of clusters in a dataset to address these fundamental limitations effectively.
format Preprint
id arxiv_https___arxiv_org_abs_2602_05749
institution arXiv
publishDate 2026
record_format arxiv
title How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning
topic Machine Learning
url https://arxiv.org/abs/2602.05749
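note The $k$-means limitation named in the abstract (clusters of arbitrary shape) can be illustrated with a minimal sketch. This is not the paper's method or its experimental setup: the two-ring dataset, the ring radii, the starting centres, and the helper names `make_rings`, `kmeans`, and `best_match_accuracy` are all assumptions chosen for illustration. Because two-cluster $k$-means always splits the plane with a straight line, it cannot separate one ring from another that surrounds it.

```python
import math


def make_rings(n_per_ring=100, radii=(1.0, 5.0)):
    """Return points on concentric rings and their true ring labels.

    The radii and the evenly spaced angles are illustrative assumptions.
    """
    points, labels = [], []
    for ring_id, r in enumerate(radii):
        for i in range(n_per_ring):
            theta = 2.0 * math.pi * i / n_per_ring
            points.append((r * math.cos(theta), r * math.sin(theta)))
            labels.append(ring_id)
    return points, labels


def kmeans(points, centers, n_iter=100):
    """Plain Lloyd's algorithm in 2-D, started from the given centres."""
    centers = list(centers)
    assign = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        new_assign = [
            min(
                range(len(centers)),
                key=lambda j: (p[0] - centers[j][0]) ** 2
                + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the algorithm has converged
        assign = new_assign
        # Update step: each centre moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return assign, centers


def best_match_accuracy(labels, assign):
    """Fraction of points labelled correctly under the better of the two
    possible cluster-to-ring matchings (valid for two clusters only)."""
    agree = sum(1 for t, a in zip(labels, assign) if t == a)
    return max(agree, len(labels) - agree) / len(labels)


points, labels = make_rings()
# Start with one centre on each ring, the most favourable start one could hope for.
assign, centers = kmeans(points, centers=[points[0], points[100]])
print(best_match_accuracy(labels, assign))  # stays well below 1.0
```

Even with one starting centre on each ring, the converged partition cuts across the outer ring, so one cluster mixes points from both rings. Whether a learned representation or a distribution-based method avoids this is the question the record's abstract addresses; the sketch only shows the baseline failure.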