| Main Author: | Molnar, Grant |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Metric Geometry; Data Structures and Algorithms; Machine Learning; Combinatorics |
| Online Access: | https://arxiv.org/abs/2604.20633 |
| _version_ | 1866913055303008256 |
|---|---|
| author | Molnar, Grant |
| author_facet | Molnar, Grant |
| contents | We define a multi-scale metric $d_ρ$ on strings by aggregating angle distances between all $n$-gram count vectors with exponential weights $ρ^n$. We benchmark $d_ρ$ in DBSCAN clustering against edit and $n$-gram baselines, give a linear-time suffix-tree algorithm for evaluation, prove metric and stability properties (including robustness under tandem-repeat stutters), and characterize isometries. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_20633 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | A weighted angle distance on strings |
| topic | Metric Geometry; Data Structures and Algorithms; Machine Learning; Combinatorics; 68R15 (Primary); 54E35, 68W32, 37B10, 62H30 (Secondary) |
| url | https://arxiv.org/abs/2604.20633 |
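
The abstract in this record describes $d_ρ$ only in words: angle distances between $n$-gram count vectors, summed over all $n$ with weights $ρ^n$. The following is a minimal sketch of one plausible reading, not the paper's definition or its linear-time suffix-tree algorithm. The function names, the default `rho=0.5`, the choice to start at $n = 1$, the truncation at the longer string's length, and the convention for empty count vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(s: str, n: int) -> Counter:
    """Count every length-n substring of s (empty if n > len(s))."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def angle(u: Counter, v: Counter) -> float:
    """Angle in radians between two sparse count vectors.

    Assumed convention (not from the source): two empty vectors are at
    angle 0, and an empty vector is at angle pi/2 from a non-empty one.
    """
    if not u and not v:
        return 0.0
    if not u or not v:
        return math.pi / 2
    dot = sum(c * v[g] for g, c in u.items())  # Counter yields 0 for missing keys
    norm_sq = sum(c * c for c in u.values()) * sum(c * c for c in v.values())
    cos = max(-1.0, min(1.0, dot / math.sqrt(norm_sq)))
    return math.acos(cos)


def weighted_angle_distance(s: str, t: str, rho: float = 0.5) -> float:
    """Sum rho**n * angle(n-gram counts of s, n-gram counts of t) over n >= 1.

    Naive quadratic-time evaluation. For n above both string lengths the
    count vectors are empty and contribute 0 under the convention above,
    so the sum stops at the longer string's length.
    """
    longest = max(len(s), len(t))
    return sum(
        rho ** n * angle(ngram_counts(s, n), ngram_counts(t, n))
        for n in range(1, longest + 1)
    )
```

Under these assumptions, `weighted_angle_distance("aa", "bb")` is $(ρ + ρ^2)\,π/2$, because the count vectors are orthogonal at both $n = 1$ and $n = 2$. The value returned for any pair depends on the conventions chosen above and may differ from the paper's $d_ρ$.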