| Main Authors: | Ting, Kai Ming; Xu, Wei-Jie; Zhang, Hang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2602.05749 |
| _version_ | 1866918324379582464 |
|---|---|
| author | Ting, Kai Ming; Xu, Wei-Jie; Zhang, Hang |
| contents | Deep clustering (DC) is often said to have a key advantage over $k$-means clustering. Yet this advantage is often demonstrated on image datasets only, and it is unclear whether it addresses the fundamental limitations of $k$-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and clusters that representation with a $k$-means-like procedure, with the optimization conducted end to end. This paper investigates whether the deep-learned representation has enabled DEC to overcome the known fundamental limitations of $k$-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes and densities. Our investigation of DEC has wider implications for deep clustering methods in general. Notably, none of these methods exploits the underlying data distribution. We find that a non-deep-learning approach achieves the intended aim of deep clustering by using the distributional information of clusters in a dataset to address these fundamental limitations effectively. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_05749 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2602.05749 |
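
The abstract refers to the inability of $k$-means to discover clusters of arbitrary shapes. As a rough illustration only, the sketch below runs plain Lloyd's $k$-means on two concentric rings. It is not DEC and not the authors' non-deep-learning method. The function names (`make_ring`, `kmeans`), the ring radii, the point counts, and the initial centroids are all assumptions chosen for this example.

```python
import math


def make_ring(radius, n):
    """Return n points evenly spaced on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd's algorithm on 2-D points; returns one label per point."""
    centroids = list(init_centroids)  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


# Two ring-shaped clusters that share a centre (illustrative values).
inner = make_ring(1.0, 40)
outer = make_ring(5.0, 40)
points = inner + outer

# Start with one centroid on each ring, which is the most favourable guess.
labels = kmeans(points, [inner[0], outer[0]])

outer_labels = set(labels[len(inner):])
print("distinct labels assigned to the outer ring:", len(outer_labels))
```

With two centroids, $k$-means separates the plane along a straight line, so the outer ring ends up divided between both clusters rather than being recovered as a single ring-shaped cluster, even with one initial centroid placed on each ring.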