Bibliographic Details
Main Authors: Sun, Philip, Simcha, David, Dopson, Dave, Guo, Ruiqi, Kumar, Sanjiv
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2404.00774
author Sun, Philip
Simcha, David
Dopson, Dave
Guo, Ruiqi
Kumar, Sanjiv
contents This paper introduces SOAR: Spilling with Orthogonality-Amplified Residuals, a novel data indexing technique for approximate nearest neighbor (ANN) search. SOAR extends previous approaches to ANN search, such as spill trees, that use multiple redundant representations while partitioning the data to reduce the probability of missing a nearest neighbor during search. Rather than training and computing these redundant representations independently, however, SOAR uses an orthogonality-amplified residual loss, which optimizes each representation to compensate for cases where other representations perform poorly. This drastically improves the overall index quality, resulting in state-of-the-art ANN benchmark performance while maintaining fast indexing times and low memory consumption.
format Preprint
id arxiv_https___arxiv_org_abs_2404_00774
institution arXiv
publishDate 2024
record_format arxiv
title SOAR: Improved Indexing for Approximate Nearest Neighbor Search
topic Machine Learning
url https://arxiv.org/abs/2404.00774
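
The abstract in this record describes assigning each datapoint to more than one partition, with the extra assignment chosen to compensate for cases where the first performs poorly. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the secondary assignment minimizes the squared norm of the new residual plus a penalty, weighted by a hypothetical parameter `lam`, on the component of the new residual that is parallel to the primary residual. The record itself does not state this loss form; the function name `soar_assign` and all parameters are illustrative.

```python
import numpy as np


def soar_assign(x, centroids, lam=1.0):
    """Return (primary, secondary) partition indices for one vector x.

    Assumed loss for the secondary assignment (illustrative only):
        ||r'||^2 + lam * (<r', r> / ||r||)^2
    where r is the primary residual and r' is a candidate residual.
    """
    x = np.asarray(x, dtype=float)
    centroids = np.asarray(centroids, dtype=float)

    # Residual of x with respect to every centroid, shape (k, d).
    residuals = x - centroids
    sq_norms = np.einsum("ij,ij->i", residuals, residuals)

    # Primary assignment: plain nearest centroid.
    primary = int(np.argmin(sq_norms))
    r = residuals[primary]
    r_norm = np.linalg.norm(r)

    # Squared length of each candidate residual's component along r.
    if r_norm == 0.0:
        parallel_sq = np.zeros(len(centroids))
    else:
        parallel_sq = (residuals @ (r / r_norm)) ** 2

    # Penalize candidates whose residual points the same way as r,
    # and exclude the primary centroid from the secondary choice.
    loss = sq_norms + lam * parallel_sq
    loss[primary] = np.inf
    secondary = int(np.argmin(loss))
    return primary, secondary


x = [0.0, 0.0]
centroids = [[1.0, 0.0], [2.0, 0.0], [0.0, 2.5]]
print(soar_assign(x, centroids, lam=0.0))  # second-nearest centroid
print(soar_assign(x, centroids, lam=1.0))  # prefers an orthogonal residual
```

With `lam=0.0` the sketch reduces to picking the second-nearest centroid, which lies in the same direction as the first. With a positive `lam` it can instead pick a farther centroid whose residual is orthogonal to the primary one, so the two assignments are less likely to fail on the same queries.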