Abstract
The DNA storage channel is considered, in which a codeword is composed of M unordered DNA molecules. At reading time, N molecules are sampled with replacement, and then each molecule is sequenced. A coded-index concatenated-coding scheme is considered, in which the mth molecule of the codeword is restricted to a subset of all possible molecules (an inner code), which is unique for each m. The decoder has low complexity, and is based on first decoding each molecule separately (the inner code), and then decoding the sequence of molecules (an outer code). Only mild assumptions are made on the sequencing channel, in the form of the existence of an inner code and decoder with vanishing error. The error probability of a random code as well as an expurgated code is analyzed and shown to decay exponentially with N. This establishes the importance of increasing the coverage depth N/M in order to obtain low error probability.
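As a rough illustration of why the coverage depth N/M matters, the sampling stage alone can be sketched as below. This is not the paper's error analysis: it assumes the N reads are drawn uniformly at random from the M molecules (the abstract states only that sampling is with replacement) and ignores sequencing errors entirely. The function names are hypothetical. Under that assumption, a fixed molecule is never sampled with probability (1 - 1/M)^N, which is approximately exp(-N/M).

```python
import math
import random


def prob_molecule_unsampled(M: int, N: int) -> float:
    """Probability that one fixed molecule is never drawn in N draws.

    Assumes uniform sampling with replacement from M molecules
    (an assumption made for this sketch).
    """
    return (1.0 - 1.0 / M) ** N


def simulate_unsampled_fraction(M: int, N: int, seed: int = 0) -> float:
    """Empirical fraction of the M molecules that never appear among N draws."""
    rng = random.Random(seed)
    # Indices of the molecules that were sampled at least once.
    seen = {rng.randrange(M) for _ in range(N)}
    return 1.0 - len(seen) / M


if __name__ == "__main__":
    M = 1000
    for depth in (1, 2, 5):
        N = depth * M
        exact = prob_molecule_unsampled(M, N)
        approx = math.exp(-N / M)
        empirical = simulate_unsampled_fraction(M, N)
        print(f"N/M={depth}: exact={exact:.5f} "
              f"exp(-N/M)={approx:.5f} simulated={empirical:.5f}")
```

In this simplified picture, the chance of missing any given molecule shrinks exponentially as N/M grows, which is consistent in spirit with the abstract's conclusion that the error probability decays exponentially with N.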
Original language | English |
---|---|
Pages (from-to) | 7005-7022 |
Number of pages | 18 |
Journal | IEEE Transactions on Information Theory |
Volume | 68 |
Issue number | 11 |
State | Published - 1 Nov 2022 |
Keywords
- Capacity planning
- Codes
- Concatenated coding
- DNA
- DNA storage
- Decoding
- Encoding
- Error probability
- Sequential analysis
- data storage
- error exponent
- permutation channel
- reliability function
- state-dependent channel
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences