Bibliographic Details
Main authors: Peng, Dehua, Gui, Zhipeng, Wei, Wenzhang, Li, Fa, Gui, Jie, Wu, Huayi, Gong, Jianya
Format: Preprint
Published: 2024
Subjects: Machine Learning; I.5.3
Online access: https://arxiv.org/abs/2401.01100
author Peng, Dehua
Gui, Zhipeng
Wei, Wenzhang
Li, Fa
Gui, Jie
Wu, Huayi
Gong, Jianya
contents As a pivotal branch of machine learning, manifold learning uncovers the intrinsic low-dimensional structure within complex nonlinear manifolds in high-dimensional space for visualization, classification, clustering, and gaining key insights. Although existing techniques have achieved remarkable successes, they suffer from extensive distortions of cluster structure, which hinder the understanding of underlying patterns. Scalability issues also limit their applicability to large-scale data. We therefore propose a sampling-based Scalable manifold learning technique that enables Uniform and Discriminative Embedding, namely SUDE, for large-scale and high-dimensional data. It starts by seeking a set of landmarks to construct the low-dimensional skeleton of the entire data, and then incorporates the non-landmarks into the learned space based on constrained locally linear embedding (CLLE). We empirically validated the effectiveness of SUDE on synthetic datasets and real-world benchmarks, and applied it to analyze single-cell data and detect anomalies in electrocardiogram (ECG) signals. SUDE exhibits a distinct advantage in scalability with respect to data size and embedding dimension, and shows promising performance in cluster separation, integrity, and global structure preservation. The experiments also demonstrate notable robustness in embedding quality as the sampling rate decreases.
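The two-stage idea in the abstract (embed a sampled set of landmarks first, then place the remaining points relative to them) can be illustrated with a minimal sketch. This is not the SUDE implementation. Several pieces are assumptions made only for illustration: landmarks are drawn by uniform random sampling, the landmark "skeleton" is computed with plain PCA as a stand-in, and non-landmarks are placed with ordinary regularized LLE reconstruction weights rather than the paper's CLLE. The function names `lle_weights` and `landmark_embed` are hypothetical.

```python
import numpy as np


def lle_weights(x, neighbors, reg=1e-3):
    """Weights that reconstruct x from its neighbors and sum to 1 (standard LLE step)."""
    diff = neighbors - x                      # (k, d) offsets from x to each neighbor
    gram = diff @ diff.T                      # (k, k) local Gram matrix
    # Regularize so the system stays solvable when k > d or neighbors coincide.
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()


def landmark_embed(X, n_landmarks, n_components=2, k=5, seed=0):
    """Embed landmarks first, then place every other point from its nearest landmarks."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)

    # Stage 1 (assumption): uniform random landmarks, embedded with PCA via SVD.
    idx = rng.choice(n, size=n_landmarks, replace=False)
    landmarks = X[idx]
    centered = landmarks - landmarks.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    y_land = centered @ vt[:n_components].T

    # Stage 2: each non-landmark is a weighted combination of its k nearest
    # landmarks, with weights learned in the original high-dimensional space.
    Y = np.empty((n, n_components))
    Y[idx] = y_land
    for i in np.setdiff1d(np.arange(n), idx):
        dist = np.linalg.norm(landmarks - X[i], axis=1)
        nn = np.argsort(dist)[:k]
        w = lle_weights(X[i], landmarks[nn])
        Y[i] = w @ y_land[nn]
    return Y, idx


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(200, 10))
    embedding, landmark_idx = landmark_embed(data, n_landmarks=50)
    print(embedding.shape, len(landmark_idx))
```

In this sketch the per-point cost of the second stage depends on the number of landmarks rather than on the full data size, which is the general reason landmark-based schemes scale well; the actual landmark selection and constraint handling in SUDE are described in the preprint itself.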
format Preprint
id arxiv_https___arxiv_org_abs_2401_01100
institution arXiv
publishDate 2024
record_format arxiv
title Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data
topic Machine Learning
I.5.3
url https://arxiv.org/abs/2401.01100