| Main Author: | Pepin, Bob |
|---|---|
| Format: | Preprint |
| Published: | 2020 |
| Subjects: | Discrete Mathematics |
| Online Access: | https://arxiv.org/abs/2011.04072 |
| _version_ | 1866917866501046272 |
|---|---|
| author | Pepin, Bob |
| author_facet | Pepin, Bob |
| contents | This short note introduces the harmonic indel distance (HID), a new distance between strings where the cost of an insertion or deletion is inversely proportional to the string length. We present a closed-form formula and show that the HID is a proper distance metric. Then we perform an experimental comparison of HID to normalized and unnormalized versions of the indel distance on benchmark tasks for biomedical sequence data. We finally show planar embeddings of the benchmark datasets to provide some insights into the geometry of the metric spaces associated with the different distance metrics. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2011_04072 |
| institution | arXiv |
| publishDate | 2020 |
| record_format | arxiv |
| title | The Harmonic Indel Distance |
| topic | Discrete Mathematics |
| url | https://arxiv.org/abs/2011.04072 |