Enregistré dans:
Détails bibliographiques
Auteurs principaux: García-Heredia, Sergio, Fernández, Ángela, Alaíz, Carlos M.
Format: Preprint
Publié: 2025
Sujets:
Accès en ligne:https://arxiv.org/abs/2505.06087
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866912367966683136
author García-Heredia, Sergio
Fernández, Ángela
Alaíz, Carlos M.
author_facet García-Heredia, Sergio
Fernández, Ángela
Alaíz, Carlos M.
contents One of the fundamental problems within the field of machine learning is dimensionality reduction. Dimensionality reduction methods make it possible to combat the so-called curse of dimensionality, visualize high-dimensional data and, in general, improve the efficiency of storing and processing large data sets. One of the best-known nonlinear dimensionality reduction methods is Diffusion Maps. However, despite their virtues, both Diffusion Maps and many other manifold learning methods based on the spectral decomposition of kernel matrices have drawbacks such as the inability to apply them to data outside the initial set, their computational complexity, and high memory costs for large data sets. In this work, we propose to alleviate these problems by resorting to deep learning. Specifically, a new formulation of Diffusion Maps embedding is offered as a solution to a certain unconstrained minimization problem and, based on it, a cost function to train a neural network which computes Diffusion Maps embedding -- both inside and outside the training sample -- without the need to perform any spectral decomposition. The capabilities of this approach are compared on different data sets, both real and synthetic, with those of Diffusion Maps and the Nystrom method.
format Preprint
id arxiv_https___arxiv_org_abs_2505_06087
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Deep Diffusion Maps
García-Heredia, Sergio
Fernández, Ángela
Alaíz, Carlos M.
Machine Learning
One of the fundamental problems within the field of machine learning is dimensionality reduction. Dimensionality reduction methods make it possible to combat the so-called curse of dimensionality, visualize high-dimensional data and, in general, improve the efficiency of storing and processing large data sets. One of the best-known nonlinear dimensionality reduction methods is Diffusion Maps. However, despite their virtues, both Diffusion Maps and many other manifold learning methods based on the spectral decomposition of kernel matrices have drawbacks such as the inability to apply them to data outside the initial set, their computational complexity, and high memory costs for large data sets. In this work, we propose to alleviate these problems by resorting to deep learning. Specifically, a new formulation of Diffusion Maps embedding is offered as a solution to a certain unconstrained minimization problem and, based on it, a cost function to train a neural network which computes Diffusion Maps embedding -- both inside and outside the training sample -- without the need to perform any spectral decomposition. The capabilities of this approach are compared on different data sets, both real and synthetic, with those of Diffusion Maps and the Nystrom method.
title Deep Diffusion Maps
topic Machine Learning
url https://arxiv.org/abs/2505.06087