Salvato in:
Dettagli Bibliografici
Autori principali: Yamashita, Tomohiro, Amagata, Daichi, Matsui, Yusuke
Natura: Preprint
Pubblicazione: 2025
Soggetti:
Accesso online:https://arxiv.org/abs/2512.06200
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
_version_ 1866917129278717952
author Yamashita, Tomohiro
Amagata, Daichi
Matsui, Yusuke
author_facet Yamashita, Tomohiro
Amagata, Daichi
Matsui, Yusuke
contents Approximate Nearest Neighbor Search (ANNS) has recently gained significant attention due to its many applications, such as Retrieval-Augmented Generation. Such applications require ANNS algorithms that support dynamic data, so the ANNS problem on dynamic data has attracted considerable interest. However, a comprehensive evaluation methodology for data deletion in ANNS has yet to be established. This study proposes an experimental framework and comprehensive evaluation metrics to assess the efficiency of data deletion for ANNS indexes under practical use cases. Specifically, we categorize data deletion methods in graph-based ANNS into three approaches and formalize them mathematically. The performance is assessed in terms of accuracy, query speed, and other relevant metrics. Finally, we apply the proposed evaluation framework to Hierarchical Navigable Small World, one of the state-of-the-art ANNS methods, to analyze the effects of data deletion, and propose Deletion Control, a method which dynamically selects the appropriate deletion method under a required search accuracy.
format Preprint
id arxiv_https___arxiv_org_abs_2512_06200
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle How Should We Evaluate Data Deletion in Graph-Based ANN Indexes?
Yamashita, Tomohiro
Amagata, Daichi
Matsui, Yusuke
Machine Learning
H.3.3; E.1
Approximate Nearest Neighbor Search (ANNS) has recently gained significant attention due to its many applications, such as Retrieval-Augmented Generation. Such applications require ANNS algorithms that support dynamic data, so the ANNS problem on dynamic data has attracted considerable interest. However, a comprehensive evaluation methodology for data deletion in ANNS has yet to be established. This study proposes an experimental framework and comprehensive evaluation metrics to assess the efficiency of data deletion for ANNS indexes under practical use cases. Specifically, we categorize data deletion methods in graph-based ANNS into three approaches and formalize them mathematically. The performance is assessed in terms of accuracy, query speed, and other relevant metrics. Finally, we apply the proposed evaluation framework to Hierarchical Navigable Small World, one of the state-of-the-art ANNS methods, to analyze the effects of data deletion, and propose Deletion Control, a method which dynamically selects the appropriate deletion method under a required search accuracy.
title How Should We Evaluate Data Deletion in Graph-Based ANN Indexes?
topic Machine Learning
H.3.3; E.1
url https://arxiv.org/abs/2512.06200