On the data persistency of replicated erasure codes in distributed storage systems

Roy Friedman, Rafał Kapelko, Karol Marchwicki

Research output: Contribution to journal › Article › peer-review

Abstract

This paper studies the fundamental problem of data persistency for a general family of redundancy schemes, called replicated erasure codes. In replicated erasure codes, each document is divided into p chunks and then encoded into p+q chunks. Each of the p+q chunks is then replicated into r replicas. We analyze two strategies for distributing replicated erasure codes: random (all chunks are spread randomly among storage nodes) and sequential (the chunks are placed sequentially into storage nodes). For both strategies, we derive closed-form expressions and asymptotic bounds for the expected data persistency of replicated erasure codes when storage nodes leave the storage system and erase their locally stored data. We observe that the maximal expected data persistency of replicated erasure codes for both placement strategies is attained for parameter p=1, and we give formulas in terms of the beta function for this case.
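The scheme described in the abstract can be illustrated with a small Monte Carlo sketch of the random placement strategy. It is not the paper's method, which is analytic, and it rests on several assumptions that the abstract does not state: data persistency is taken to be the number of node departures until fewer than p distinct chunks still have a surviving replica, all (p+q)·r replicas are placed on distinct nodes, and nodes leave one at a time in uniformly random order. The names simulate_once and expected_persistency are hypothetical.

```python
import random


def simulate_once(p, q, r, n_nodes, rng):
    """One trial: departures until the document can no longer be decoded.

    Assumptions (not from the paper): replicas sit on distinct nodes,
    nodes leave in uniformly random order, and the document is lost once
    fewer than p of the p+q chunks have at least one surviving replica.
    """
    total = (p + q) * r
    if total > n_nodes:
        raise ValueError("need at least (p+q)*r nodes for distinct placement")

    # Random placement: choose distinct nodes, r consecutive picks per chunk.
    nodes = rng.sample(range(n_nodes), total)
    chunk_of = {node: i // r for i, node in enumerate(nodes)}

    replicas_left = [r] * (p + q)   # surviving replicas per chunk
    alive_chunks = p + q            # chunks with at least one replica

    order = list(range(n_nodes))
    rng.shuffle(order)              # random departure order
    for departures, node in enumerate(order, start=1):
        chunk = chunk_of.get(node)
        if chunk is not None:
            replicas_left[chunk] -= 1
            if replicas_left[chunk] == 0:
                alive_chunks -= 1
        if alive_chunks < p:
            return departures
    return n_nodes  # unreachable for p >= 1; kept as a safe fallback


def expected_persistency(p, q, r, n_nodes, trials=2000, seed=0):
    """Average simulate_once over independent trials (empirical estimate)."""
    rng = random.Random(seed)
    total = sum(simulate_once(p, q, r, n_nodes, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Same total number of encoded chunks (p+q = 6) and replicas (r = 2).
    for p in (1, 2, 3):
        print(p, expected_persistency(p, 6 - p, 2, n_nodes=100))
```

With p+q and r held fixed, a run of this sketch can be compared against the abstract's observation that expected persistency is largest for p=1; the estimates are only empirical and depend on the assumptions listed above.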

Original language: English
Article number: 105297
Journal: Information and Computation
Volume: 304
State: Published - May 2025

Keywords

  • Asymptotic
  • Erasure codes
  • Replication
  • Storage system

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
