Saved in:
Bibliographic Details
Main Author: Chen, Zitan
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2503.12342
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912277198798848
author Chen, Zitan
author_facet Chen, Zitan
contents The number of zeros and the number of ones in a binary string are referred to as the composition of the string, and the prefix-suffix compositions of a string are a multiset formed by the compositions of the prefixes and suffixes of all possible lengths of the string. In this work, we present binary codes of length n in which every codeword can be efficiently reconstructed from its erroneous prefix-suffix compositions with at most t composition errors. All our constructions have decoding complexity polynomial in n and the best of our constructions has constant rate and can correct $t = Θ(n)$ errors. As a comparison, no prior constructions can afford to efficiently correct $t = Θ(n)$ arbitrary composition errors. Additionally, we propose a method of encoding h arbitrary strings of the same length so that they can be reconstructed from the multiset union of their error-free prefix-suffix compositions, at the expense of h-fold coding overhead. In contrast, existing methods can only recover h distinct strings, albeit with code rate asymptotically equal to 1/h. Building on the top of the proposed method, we also present a coding scheme that enables efficient recovery of h strings from their erroneous prefix-suffix compositions with $t = Θ(n)$ errors.
format Preprint
id arxiv_https___arxiv_org_abs_2503_12342
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Coding methods for string reconstruction from erroneous prefix-suffix compositions
Chen, Zitan
Information Theory
The number of zeros and the number of ones in a binary string are referred to as the composition of the string, and the prefix-suffix compositions of a string are a multiset formed by the compositions of the prefixes and suffixes of all possible lengths of the string. In this work, we present binary codes of length n in which every codeword can be efficiently reconstructed from its erroneous prefix-suffix compositions with at most t composition errors. All our constructions have decoding complexity polynomial in n and the best of our constructions has constant rate and can correct $t = Θ(n)$ errors. As a comparison, no prior constructions can afford to efficiently correct $t = Θ(n)$ arbitrary composition errors. Additionally, we propose a method of encoding h arbitrary strings of the same length so that they can be reconstructed from the multiset union of their error-free prefix-suffix compositions, at the expense of h-fold coding overhead. In contrast, existing methods can only recover h distinct strings, albeit with code rate asymptotically equal to 1/h. Building on the top of the proposed method, we also present a coding scheme that enables efficient recovery of h strings from their erroneous prefix-suffix compositions with $t = Θ(n)$ errors.
title Coding methods for string reconstruction from erroneous prefix-suffix compositions
topic Information Theory
url https://arxiv.org/abs/2503.12342