Bibliographic Details
Main Authors: Cao, Ruiying, Chen, Xin
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2501.06801
_version_ 1866916697719439360
author Cao, Ruiying
Chen, Xin
author_facet Cao, Ruiying
Chen, Xin
contents DNA storage is being considered as a new archival storage method because of its durability and high information density, but it still faces challenges such as high cost and low throughput. Minimizing DNA coverage depth reduces the sequencing sample size needed to decode the digital data, which lowers both cost and system latency. Previous studies have mainly focused on minimizing coverage depth in channels with a uniform distribution under theoretical assumptions. In contrast, our work uses real DNA storage experimental data to extend this problem to channels with a log-normal distribution, a distribution identified through our analysis of PCR and sequencing data. In this framework, we investigate both noiseless and noisy channels. We first demonstrate in detail a positive correlation between the MDS code rate and the expected minimum sequencing coverage depth. Moreover, we observe that, when the sample size is optimized for complete decoding, the probability of successfully decoding all information in a single sequencing run first decreases and then increases as the code rate rises. We then extend the lower bounds on the DNA coverage depth from uniform to log-normal noisy channels. The findings of this study provide valuable insights for the efficient execution of DNA storage experiments.
format Preprint
id arxiv_https___arxiv_org_abs_2501_06801
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Optimizing Sequencing Coverage Depth in DNA Storage: Insights From DNA Storage Data
Cao, Ruiying
Chen, Xin
Information Theory
title Optimizing Sequencing Coverage Depth in DNA Storage: Insights From DNA Storage Data
topic Information Theory
url https://arxiv.org/abs/2501.06801