Error-Correcting Codes for Combinatorial Composite DNA

Omer Sabary, Inbal Preuss, Ryan Gabrys, Zohar Yakhini, Leon Anavy, Eitan Yaakobi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

DNA is emerging as a possible medium for archival digital data storage. Recently, the combinatorial composite DNA synthesis method was suggested as a way to further increase the potential capacity of DNA-based data storage systems. This approach extends the DNA alphabet by harnessing short DNA fragment reagents, known as shortmers. The shortmers are the building blocks of the alphabet symbols, and each symbol consists of a fixed number of shortmers. Thus, when information is read, one of the shortmers that make up a symbol may be missing, in which case the symbol cannot be determined. In this paper, we model this as an asymmetric error and propose code constructions that can correct such errors in this setup. We also provide a lower bound on the redundancy of such error-correcting codes and give an explicit encoder and decoder for our construction. Our suggested error model is also supported by an analysis of data from actual experiments that produced DNA according to the combinatorial scheme. Lastly, we provide a statistical evaluation of the probability of observing such error events as a function of read depth.
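As a rough illustration of the error event described in the abstract, the sketch below treats each symbol as a set of k distinct shortmers chosen from n available shortmers and estimates how often at least one of them goes unobserved at a given read depth. The function names, the parameters `n_shortmers`, `k`, and `depth`, and the assumption that every read reveals one of the symbol's k shortmers uniformly and independently are all illustrative choices made here; they are not taken from the paper, whose actual error model, code construction, and statistical evaluation may differ.

```python
from itertools import combinations
from math import comb


def symbol_alphabet(n_shortmers, k):
    """Enumerate all symbols, each a set of k distinct shortmers (hypothetical model)."""
    return [frozenset(c) for c in combinations(range(n_shortmers), k)]


def prob_missing_shortmer(k, depth):
    """Probability that at least one of a symbol's k shortmers is never observed.

    Assumption (not from the paper): each of the `depth` reads shows one of the
    k shortmers uniformly at random, independently of the other reads.
    Uses inclusion-exclusion over the set of unseen shortmers.
    """
    p_all_seen = sum(
        (-1) ** j * comb(k, j) * (1 - j / k) ** depth for j in range(k + 1)
    )
    return 1.0 - p_all_seen


def consistent_symbols(observed, alphabet):
    """Symbols that contain every observed shortmer.

    This reflects the asymmetric nature of the error: shortmers can go missing
    from the reads, but no foreign shortmer is assumed to appear.
    """
    return [s for s in alphabet if observed <= s]


if __name__ == "__main__":
    alphabet = symbol_alphabet(n_shortmers=5, k=2)
    print(len(alphabet), "symbols")
    for depth in (1, 2, 5, 10):
        print(depth, round(prob_missing_shortmer(2, depth), 4))
    print(consistent_symbols(frozenset({0}), alphabet))
```

Under these assumptions, a symbol read with one shortmer missing is ambiguous because several symbols contain the observed subset, and the chance of that happening shrinks as the read depth grows.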

Original language: English
Title of host publication: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Proceedings
Pages: 109-114
Number of pages: 6
ISBN (Electronic): 9798350382846
State: Published - 2024
Event: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Athens, Greece
Duration: 7 Jul 2024 to 12 Jul 2024

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings

Conference

Conference: 2024 IEEE International Symposium on Information Theory, ISIT 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 to 12/07/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modelling and Simulation
  • Applied Mathematics
