Salvato in:
Dettagli Bibliografici
Autore principale: Molnar, Grant
Natura: Preprint
Pubblicazione: 2026
Soggetti:
Accesso online:https://arxiv.org/abs/2604.20633
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
Sommario:
  • We define a multi-scale metric $d_ρ$ on strings by aggregating angle distances between all $n$-gram count vectors with exponential weights $ρ^n$. We benchmark $d_ρ$ in DBSCAN clustering against edit and $n$-gram baselines, give a linear-time suffix-tree algorithm for evaluation, prove metric and stability properties (including robustness under tandem-repeat stutters), and characterize isometries.