Saved in:
| Main Author: | |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.12342 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866912277198798848 |
|---|---|
| author | Chen, Zitan |
| author_facet | Chen, Zitan |
| contents | The number of zeros and the number of ones in a binary string are referred to as the composition of the string, and the prefix-suffix compositions of a string are a multiset formed by the compositions of the prefixes and suffixes of all possible lengths of the string. In this work, we present binary codes of length n in which every codeword can be efficiently reconstructed from its erroneous prefix-suffix compositions with at most t composition errors. All our constructions have decoding complexity polynomial in n and the best of our constructions has constant rate and can correct $t = Θ(n)$ errors. As a comparison, no prior constructions can afford to efficiently correct $t = Θ(n)$ arbitrary composition errors.
Additionally, we propose a method of encoding h arbitrary strings of the same length so that they can be reconstructed from the multiset union of their error-free prefix-suffix compositions, at the expense of h-fold coding overhead. In contrast, existing methods can only recover h distinct strings, albeit with code rate asymptotically equal to 1/h. Building on the top of the proposed method, we also present a coding scheme that enables efficient recovery of h strings from their erroneous prefix-suffix compositions with $t = Θ(n)$ errors. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2503_12342 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Coding methods for string reconstruction from erroneous prefix-suffix compositions Chen, Zitan Information Theory The number of zeros and the number of ones in a binary string are referred to as the composition of the string, and the prefix-suffix compositions of a string are a multiset formed by the compositions of the prefixes and suffixes of all possible lengths of the string. In this work, we present binary codes of length n in which every codeword can be efficiently reconstructed from its erroneous prefix-suffix compositions with at most t composition errors. All our constructions have decoding complexity polynomial in n and the best of our constructions has constant rate and can correct $t = Θ(n)$ errors. As a comparison, no prior constructions can afford to efficiently correct $t = Θ(n)$ arbitrary composition errors. Additionally, we propose a method of encoding h arbitrary strings of the same length so that they can be reconstructed from the multiset union of their error-free prefix-suffix compositions, at the expense of h-fold coding overhead. In contrast, existing methods can only recover h distinct strings, albeit with code rate asymptotically equal to 1/h. Building on the top of the proposed method, we also present a coding scheme that enables efficient recovery of h strings from their erroneous prefix-suffix compositions with $t = Θ(n)$ errors. |
| title | Coding methods for string reconstruction from erroneous prefix-suffix compositions |
| topic | Information Theory |
| url | https://arxiv.org/abs/2503.12342 |