Bibliographic Details
Main Authors: Shaham, Uri, Stanton, Kelly, Li, Henry, Nadler, Boaz, Basri, Ronen, Kluger, Yuval
Format: Preprint
Published: 2018
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/1801.01587
author Shaham, Uri
Stanton, Kelly
Li, Henry
Nadler, Boaz
Basri, Ronen
Kluger, Yuval
contents Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, allow us to keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet .
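The abstract names three ingredients without giving formulas: pairwise Gaussian affinities, an objective tied to the graph Laplacian, and an output layer that keeps the network output orthogonal. The NumPy sketch below illustrates one plausible reading of each on a single mini-batch. It is not taken from the authors' repository. The function names, the `sigma` parameter, the unnormalized Laplacian, the 1/m^2 scaling, and the Cholesky-based orthonormalization are all assumptions made for illustration.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Pairwise Gaussian affinities W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    Hypothetical helper; `sigma` is an assumed bandwidth parameter.
    """
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-affinity
    return W


def orthonormalize(Y_tilde):
    """Map raw outputs Y_tilde (m x k) to Y with Y.T @ Y / m = I.

    Assumed construction: factor the scaled Gram matrix as L @ L.T
    (Cholesky) and return Y_tilde @ inv(L).T. Requires Y_tilde to have
    full column rank.
    """
    m = Y_tilde.shape[0]
    L = np.linalg.cholesky(Y_tilde.T @ Y_tilde / m)
    return np.linalg.solve(L, Y_tilde.T).T


def spectral_loss(Y, W):
    """(1/m^2) * sum_ij W_ij * ||y_i - y_j||^2 for symmetric W.

    Uses the identity sum_ij W_ij ||y_i - y_j||^2 = 2 * trace(Y.T (D - W) Y),
    where D - W is the unnormalized graph Laplacian.
    """
    m = Y.shape[0]
    D = np.diag(W.sum(axis=1))
    return 2.0 * np.trace(Y.T @ (D - W) @ Y) / m ** 2
```

In a training loop, `Y_tilde` would be the network's raw output on a mini-batch `X`, and `spectral_loss(orthonormalize(Y_tilde), gaussian_affinity(X))` would be the quantity to minimize. Without the orthogonality constraint the loss is trivially minimized by mapping every point to the same output, which is why some such constraint is needed.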
format Preprint
id arxiv_https___arxiv_org_abs_1801_01587
institution arXiv
publishDate 2018
record_format arxiv
title SpectralNet: Spectral Clustering using Deep Neural Networks
topic Machine Learning
url https://arxiv.org/abs/1801.01587