Bibliographic Details
Main Authors: Lu, Ziyang; Kiah, Han Mao; Zhang, Yiwei; Grass, Robert N.; Yaakobi, Eitan
Format: Preprint
Published: 2024
Subjects: Information Theory
Online Access: https://arxiv.org/abs/2405.02080
author Lu, Ziyang
Kiah, Han Mao
Zhang, Yiwei
Grass, Robert N.
Yaakobi, Eitan
contents Motivated by DNA-based data storage systems, we investigate the errors that occur when synthesizing DNA strands in parallel, where the machine appends one nucleotide at a time to each strand according to a template supersequence. If the machine fails in some cycle, then the strands meant to be extended in that cycle are not extended, and we refer to this as a synthesis defect. In this paper, we present two families of codes correcting synthesis defects: t-known-synthesis-defect correcting codes and t-synthesis-defect correcting codes. For the first family, it is assumed that the defective cycles are known, and each codeword is a quaternary sequence. We provide constructions for this family of codes for t = 1, 2, with redundancy log 4 and log n + 18 log 3, respectively. For the second family, each codeword is a set of M ordered sequences, and we give constructions for t = 1, 2 to illustrate a strategy for constructing this family of codes. Finally, we derive a lower bound on the redundancy of single-known-synthesis-defect correcting codes, which shows that our construction is almost optimal.
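The synthesis process described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model as given in the paper: the abstract does not specify the template supersequence, so the periodic template ACGTACGT... is assumed here, and a defective cycle is assumed to simply drop the nucleotide from every strand scheduled for that cycle while the machine continues. The function name `synthesize` and its parameters are hypothetical.

```python
ALPHABET = "ACGT"  # assumed periodic template: cycle c offers ALPHABET[c % 4]


def synthesize(strands, defective_cycles=frozenset()):
    """Simulate parallel synthesis of `strands` against the assumed template.

    In cycle c (0-indexed), every strand whose next required nucleotide equals
    ALPHABET[c % 4] is scheduled to be extended. If c is in `defective_cycles`,
    the scheduled strands are not extended (assumption: that nucleotide is
    lost and synthesis continues with the next position).
    """
    for s in strands:
        if any(ch not in ALPHABET for ch in s):
            raise ValueError(f"strand {s!r} contains a non-ACGT symbol")

    positions = [0] * len(strands)      # next index to synthesize per strand
    outputs = [[] for _ in strands]     # nucleotides actually appended
    cycle = 0
    while any(p < len(s) for p, s in zip(positions, strands)):
        base = ALPHABET[cycle % 4]
        for i, s in enumerate(strands):
            p = positions[i]
            if p < len(s) and s[p] == base:
                if cycle not in defective_cycles:
                    outputs[i].append(base)
                positions[i] += 1       # the schedule advances either way
        cycle += 1
    return ["".join(o) for o in outputs]


strands = ["ACG", "GAT", "CC"]
print(synthesize(strands))        # no defects: strands are reproduced
print(synthesize(strands, {1}))   # cycle 1 (a 'C' cycle) fails
```

Under these assumptions, a defect in cycle 1 turns "ACG" into "AG" and "CC" into "C", while "GAT" is unaffected because it was not scheduled in that cycle. A single defective cycle can therefore cause a deletion in several strands at once, which is the error pattern the codes in the abstract are meant to correct.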
format Preprint
id arxiv_https___arxiv_org_abs_2405_02080
institution arXiv
publishDate 2024
record_format arxiv
title Coding for Synthesis Defects
topic Information Theory
url https://arxiv.org/abs/2405.02080