Bibliographic Details
Main Authors: Ji, Yushuai, Wang, Sheng, Chen, Zhiyu, Sun, Yuan, Peng, Zhiyong
Format: Preprint
Published: 2025
Subjects: Databases
Online Access: https://arxiv.org/abs/2511.20049
_version_ 1866918247502184448
author Ji, Yushuai
Wang, Sheng
Chen, Zhiyu
Sun, Yuan
Peng, Zhiyong
contents Diverse types of edge data, such as 2D geo-locations and 3D point clouds, are collected by sensors like lidar and GPS receivers on edge devices. On-device searches, such as k-nearest neighbor (kNN) search and radius search, are commonly used to enable fast analytics and learning technologies, such as k-means dataset simplification using kNN. To maintain high search efficiency, a representative approach is to utilize a balanced multi-way KD-tree (BMKD-tree). However, the index has shown limited gains, mainly due to substantial construction overhead, inflexibility to real-time insertion, and inconsistent query performance. In this paper, we propose UnIS to address the above limitations. We first accelerate the construction process of the BMKD-tree by utilizing the dataset distribution to predict the splitting hyperplanes. To make the continuously generated data searchable, we propose a selective sub-tree rebuilding scheme to accelerate rebalancing during insertion by reducing the number of data points involved. We then propose an auto-selection model to improve query performance by automatically selecting the optimal search strategy among multiple strategies for an arbitrary query task. Experimental results show that UnIS achieves average speedups of 17.96x in index construction, 1.60x in insertion, 7.15x in kNN search, and 1.09x in radius search compared to the BMKD-tree. We further verify its effectiveness in accelerating dataset simplification on edge devices, achieving a speedup of 217x over Lloyd's algorithm.
format Preprint
id arxiv_https___arxiv_org_abs_2511_20049
institution arXiv
publishDate 2025
record_format arxiv
title Updatable Balanced Index for Fast On-device Search with Auto-selection Model
topic Databases
url https://arxiv.org/abs/2511.20049
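note The contents field refers to kNN search over a balanced KD-tree. As a rough illustration only, the sketch below builds a plain binary KD-tree by median splits and answers kNN queries with branch pruning. It is not the UnIS method or the multi-way BMKD-tree from the preprint: the predicted splitting hyperplanes, selective sub-tree rebuilding, and auto-selection model are not reproduced, and the names Node, build, sq_dist, and knn are hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # point stored at this node
    axis: int                       # dimension used for the split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced binary KD-tree by splitting at the median.

    Assumption: exact median via sorting. The preprint instead predicts
    the splitting hyperplanes from the data distribution.
    """
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(root: Optional[Node], query: Point, k: int) -> List[Point]:
    """Return up to k nearest points to query, closest first."""
    heap: List[Tuple[float, Point]] = []  # max-heap via negated distances

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = sq_dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.point))
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Visit the far side only if the splitting plane is closer than
        # the current k-th best distance (or fewer than k points found).
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [p for _, p in sorted(heap, key=lambda t: -t[0])]
```

A call such as knn(build(points), (0.5, 0.5, 0.5), 5) returns the five stored points closest to the query. Sorting at every level keeps the sketch short at the cost of construction time, which is the kind of overhead the record's contents field says the preprint aims to reduce.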