Bibliographic Details
Main Authors:	Wu, Zihan, Huang, Zhaoke, Yan, Hong
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing; Machine Learning; H.2.8
Online Access:	https://arxiv.org/abs/2410.18113

Table of Contents:

Co-clustering clusters rows and columns simultaneously, revealing finer-grained groups than one-sided clustering. However, existing co-clustering methods scale poorly and cannot handle large-scale data. This paper presents a novel, scalable co-clustering method designed to uncover intricate patterns in high-dimensional, large-scale datasets. Specifically, we first propose a large matrix partitioning algorithm that splits a large matrix into smaller submatrices, enabling parallel co-clustering. This algorithm employs a probabilistic model to optimize the configuration of submatrices, balancing computational efficiency against depth of analysis. Additionally, we propose a hierarchical co-cluster merging algorithm that efficiently identifies and merges co-clusters from these submatrices, enhancing the robustness and reliability of the process. Extensive evaluations validate the effectiveness and efficiency of our method. Experimental results show a significant reduction in computation time: approximately 83% for dense matrices and up to 30% for sparse matrices.
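
The partition-then-parallelize idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the fixed-size grid partition, the function names (`partition_indices`, `partition_matrix`, `toy_cocluster`, `parallel_cocluster`), and the "above block mean" rule used as a stand-in co-clusterer are all assumptions made for illustration. The paper's probabilistic choice of submatrix configuration and its hierarchical merging step are not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition_indices(n, block):
    """Split range(n) into consecutive chunks of at most `block` indices."""
    return [list(range(s, min(s + block, n))) for s in range(0, n, block)]


def partition_matrix(matrix, row_block, col_block):
    """Return (row_ids, col_ids, submatrix) for every block of a fixed grid.

    Assumption: a plain grid partition; the paper chooses the configuration
    with a probabilistic model that is not reproduced here.
    """
    row_chunks = partition_indices(len(matrix), row_block)
    col_chunks = partition_indices(len(matrix[0]), col_block)
    blocks = []
    for rows, cols in product(row_chunks, col_chunks):
        sub = [[matrix[r][c] for c in cols] for r in rows]
        blocks.append((rows, cols, sub))
    return blocks


def toy_cocluster(block):
    """Hypothetical stand-in co-clusterer for one block.

    Keeps the rows and columns whose mean exceeds the block mean and
    reports them using the original (global) row and column indices.
    """
    rows, cols, sub = block
    mean = sum(map(sum, sub)) / (len(rows) * len(cols))
    keep_rows = [r for r, vals in zip(rows, sub) if sum(vals) / len(vals) > mean]
    keep_cols = [
        c for j, c in enumerate(cols)
        if sum(row[j] for row in sub) / len(sub) > mean
    ]
    return keep_rows, keep_cols


def parallel_cocluster(matrix, row_block, col_block, workers=4):
    """Co-cluster every block independently, one task per block."""
    blocks = partition_matrix(matrix, row_block, col_block)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(toy_cocluster, blocks))
```

Because each block is processed independently, the per-block work can be distributed across workers; a merging step would then have to reconcile co-clusters that span block boundaries, which is the role the abstract assigns to the hierarchical merging algorithm.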

Similar Items