Error Probability Bounds for Coded-Index DNA Storage Systems

Research output: Contribution to journal › Article › peer-review

Abstract

The DNA storage channel is considered, in which a codeword comprises M unordered DNA molecules. At reading time, N molecules are sampled with replacement, and then each molecule is sequenced. A coded-index concatenated-coding scheme is considered, in which the m-th molecule of the codeword is restricted to a subset of all possible molecules (an inner code), which is unique for each m. The decoder has low complexity, and is based on first decoding each molecule separately (the inner code), and then decoding the sequence of molecules (an outer code). Only mild assumptions are made on the sequencing channel, in the form of the existence of an inner code and decoder with vanishing error. The error probability of a random code as well as an expurgated code is analyzed and shown to decay exponentially with N. This establishes the importance of increasing the coverage depth N/M in order to obtain low error probability.
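The sampling stage described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's analysis: it assumes the N reads are drawn uniformly at random (the abstract states only "with replacement"), it ignores sequencing errors and the inner and outer codes entirely, and all function names are hypothetical. It shows only how the fraction of molecules that are never read shrinks as the coverage depth N/M grows.

```python
import random


def sample_reads(M, N, rng):
    """Draw N molecule indices with replacement from M molecules.

    Assumption: draws are uniform over the M molecules.
    """
    return [rng.randrange(M) for _ in range(N)]


def unseen_fraction(M, N, rng):
    """Empirical fraction of the M molecules that appear in none of the N reads."""
    seen = set(sample_reads(M, N, rng))
    return (M - len(seen)) / M


def expected_unseen_fraction(M, N):
    """Expected unseen fraction under uniform sampling.

    A given molecule is missed by all N independent draws
    with probability (1 - 1/M)**N, roughly exp(-N/M) for large M.
    """
    return (1 - 1 / M) ** N


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    M = 1000
    for depth in (1, 2, 5, 10):
        N = depth * M
        print(depth, unseen_fraction(M, N, rng), expected_unseen_fraction(M, N))
```

Under these assumptions, a molecule that is never sampled acts as an erasure that only the outer code can repair, which gives some intuition for why a larger N/M helps. The exponential decay in N stated in the abstract is a result about the full coded system and is not reproduced by this sketch.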

Original language: English
Pages (from-to): 7005-7022
Number of pages: 18
Journal: IEEE Transactions on Information Theory
Volume: 68
Issue number: 11
State: Published - 1 Nov 2022

Keywords

  • Capacity planning
  • Codes
  • Concatenated coding
  • DNA
  • DNA storage
  • Decoding
  • Encoding
  • Error probability
  • Sequential analysis
  • data storage
  • error exponent
  • permutation channel
  • reliability function
  • state-dependent channel

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
