Bibliographic Details
Main Author: Molnar, Grant
Format: Preprint
Published: 2026
Subjects: Metric Geometry; Data Structures and Algorithms; Machine Learning; Combinatorics
Online Access: https://arxiv.org/abs/2604.20633
_version_ 1866913055303008256
author Molnar, Grant
author_facet Molnar, Grant
contents We define a multi-scale metric $d_\rho$ on strings by aggregating angle distances between all $n$-gram count vectors with exponential weights $\rho^n$. We benchmark $d_\rho$ in DBSCAN clustering against edit and $n$-gram baselines, give a linear-time suffix-tree algorithm for evaluation, prove metric and stability properties (including robustness under tandem-repeat stutters), and characterize isometries.
format Preprint
id arxiv_https___arxiv_org_abs_2604_20633
institution arXiv
publishDate 2026
record_format arxiv
title A weighted angle distance on strings
topic Metric Geometry
Data Structures and Algorithms
Machine Learning
Combinatorics
68R15 (Primary) 54E35, 68W32, 37B10, 62H30 (Secondary)
url https://arxiv.org/abs/2604.20633