A Combinatorial Perspective on Random Access Efficiency for DNA Storage

Anina Gruica, Daniella Bar-Lev, Alberto Ravagnani, Eitan Yaakobi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We investigate the fundamental limits of the recently proposed random access coverage depth problem for DNA data storage. Under this paradigm, it is assumed that the user information consists of k information strands, which are encoded into n strands via some generator matrix G. In the sequencing process, the strands are read uniformly at random, since each strand is available in a large number of copies. In this context, the random access coverage depth problem refers to the expected number of reads (i.e., sequenced strands) until it is possible to decode a specific information strand, which is requested by the user. The goal is to minimize the maximum expectation over all possible requested information strands, and this value is denoted by T_max(G). This paper introduces new techniques to investigate the random access coverage depth problem, which capture its combinatorial nature. We establish two general formulas to find T_max(G) for arbitrary matrices. We introduce the concept of recovery balanced codes and combine all these results and notions to compute T_max(G) for MDS, simplex, and Hamming codes. We also study the performance of modified systematic MDS matrices, and our results show that the best values of T_max(G) are achieved with a specific mix of encoded strands and replication of the information strands.
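To make the quantity in the abstract concrete, the following is a minimal brute-force sketch, not the paper's formulas. It assumes the field is GF(2) (the abstract does not fix a field), that each read returns one of the n encoded strands uniformly at random with replacement, and that information strand i is decodable once the unit vector e_i lies in the span of the columns of G read so far. The function names `in_span`, `expected_reads`, and `t_max` are hypothetical. The running time is exponential in n, so this is only suitable for toy matrices.

```python
from fractions import Fraction
from functools import lru_cache


def in_span(vectors, target):
    """Return True if `target` is in the GF(2) span of `vectors`.

    Vectors are Python ints used as bit vectors (assumption: field is GF(2)).
    """
    basis = []
    for v in vectors:
        # Reduce v against the basis built so far (XOR-basis idiom).
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0


def expected_reads(G, i):
    """Expected number of uniform reads until information strand i is decodable.

    G is a k x n list of 0/1 rows; column j is the j-th encoded strand.
    """
    k, n = len(G), len(G[0])
    cols = [sum(G[r][j] << r for r in range(k)) for j in range(n)]
    target = 1 << i  # unit vector e_i

    @lru_cache(maxsize=None)
    def remaining(seen):
        # `seen` is a bitmask of the distinct columns read so far.
        chosen = [cols[j] for j in range(n) if (seen >> j) & 1]
        if in_span(chosen, target):
            return Fraction(0)
        unseen = [j for j in range(n) if not (seen >> j) & 1]
        if not unseen:
            raise ValueError("information strand is not recoverable from G")
        # Solve E = 1 + (|seen|/n) E + (1/n) * sum of E over each new column.
        total = sum(remaining(seen | (1 << j)) for j in unseen)
        return (n + total) / len(unseen)

    return remaining(0)


def t_max(G):
    """Maximum expected number of reads over all k information strands."""
    return max(expected_reads(G, i) for i in range(len(G)))
```

For example, under these assumptions `t_max([[1, 0], [0, 1]])` (two uncoded strands) and `t_max([[1, 0, 1], [0, 1, 1]])` (two information strands plus one parity strand) both evaluate to 2. Exact rational arithmetic via `Fraction` is used so that small cases can be compared with closed-form values without rounding error.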

Original language: English
Title of host publication: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Proceedings
Pages: 675-680
Number of pages: 6
ISBN (Electronic): 9798350382846
State: Published - 2024
Event: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Athens, Greece
Duration: 7 Jul 2024 to 12 Jul 2024

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings

Conference

Conference: 2024 IEEE International Symposium on Information Theory, ISIT 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 to 12/07/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modelling and Simulation
  • Applied Mathematics
