Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Ni, Shuang; Aumon, Adrien; Wolf, Guy; Moon, Kevin R.; Rhodes, Jake S.
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2406.04421

_version_	1866929376383205376
author	Ni, Shuang; Aumon, Adrien; Wolf, Guy; Moon, Kevin R.; Rhodes, Jake S.
author_facet	Ni, Shuang; Aumon, Adrien; Wolf, Guy; Moon, Kevin R.; Rhodes, Jake S.
contents	The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_04421
institution	arXiv
publishDate	2024
record_format	arxiv
title	Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension
topic	Machine Learning
url	https://arxiv.org/abs/2406.04421
