Bibliographic Details
Main Authors: Islam, Mohammad Tariqul, Fleischer, Jason W.
Format: Preprint
Published: 2025
Subjects: Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2503.09101
author Islam, Mohammad Tariqul
Fleischer, Jason W.
contents Uniform manifold approximation and projection (UMAP) is among the most popular neighbor embedding methods. The method samples pairs of point indices according to similarities in the high-dimensional space, and applies attractive and repulsive forces to their coordinates in the low-dimensional embedding. In this paper, we analyze the forces to reveal their effects on cluster formations and visualization, and compare UMAP to its contemporaries. Repulsion emphasizes differences, controlling cluster boundaries and inter-cluster distance. Attraction is more subtle, as attractive tension between points can manifest simultaneously as attraction and repulsion in the lower-dimensional mapping. This explains the need for learning rate annealing and motivates the different treatments between attractive and repulsive terms. Moreover, by modifying attraction, we improve the consistency of cluster formation under random initialization. Overall, our analysis provides a mechanistic understanding of UMAP and related embedding methods.
format Preprint
id arxiv_https___arxiv_org_abs_2503_09101
institution arXiv
publishDate 2025
record_format arxiv
title The Shape of Attraction in UMAP: Exploring the Embedding Forces in Dimensionality Reduction
topic Machine Learning
Artificial Intelligence
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2503.09101