Optimal Rebuilding of Multiple Erasures in MDS Codes

Zhiying Wang, Itzhak Tamo, Jehoshua Bruck

Research output: Contribution to journal › Article › peer-review

Abstract

Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. In practice, however, e erasures are a more likely failure event, for some 1 ≤ e < r. Hence, a natural question is how much information needs to be accessed in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of the remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e = 1; however, all the information needs to be accessed to rebuild a parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework, in which any r erasures can be corrected by accessing some of the remaining information, and any e = 1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on the rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth, similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1 ≤ e ≤ r; 2) we construct systematic codes that achieve the optimal rebuilding ratio of 1/r for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.
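
As a rough illustration of the rebuilding ratio defined in the abstract, the following Python sketch computes the ratio from access counts and evaluates the optimal value e/r stated there for systematic nodes. The function names, the parameter k (number of systematic nodes), and the per-node symbol count are assumptions made for this example; they do not come from the paper, and the sketch does not implement zigzag codes themselves.

```python
from fractions import Fraction


def rebuilding_ratio(symbols_accessed: int, symbols_remaining: int) -> Fraction:
    """Fraction of the remaining information that is accessed during rebuilding."""
    if symbols_remaining <= 0 or not 0 <= symbols_accessed <= symbols_remaining:
        raise ValueError("need 0 <= symbols_accessed <= symbols_remaining, remaining > 0")
    return Fraction(symbols_accessed, symbols_remaining)


def optimal_systematic_ratio(e: int, r: int) -> Fraction:
    """Optimal rebuilding ratio e/r quoted in the abstract, for 1 <= e <= r."""
    if not 1 <= e <= r:
        raise ValueError("need 1 <= e <= r")
    return Fraction(e, r)


def symbols_accessed_at_optimum(k: int, r: int, e: int, symbols_per_node: int) -> Fraction:
    """Symbols read when e erased nodes are rebuilt at the optimal ratio.

    Assumption for illustration: k systematic nodes and r parity nodes,
    each storing symbols_per_node symbols, so k + r - e nodes survive.
    """
    symbols_remaining = (k + r - e) * symbols_per_node
    return optimal_systematic_ratio(e, r) * symbols_remaining


# Hypothetical parameters: 4 systematic nodes, 2 parity nodes, 8 symbols per node.
accessed = symbols_accessed_at_optimum(k=4, r=2, e=1, symbols_per_node=8)
remaining = (4 + 2 - 1) * 8
print(accessed, rebuilding_ratio(int(accessed), remaining))
```

With e = r the ratio equals 1, which matches the statement that correcting r erasures requires accessing all the remaining information.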

Original language: English
Article number: 7762203
Pages (from-to): 1084-1101
Number of pages: 18
Journal: IEEE Transactions on Information Theory
Volume: 63
Issue number: 2
State: Published - Feb 2017

Keywords

  • Distributed storage
  • correcting erasures and errors
  • multiple erasures
  • regenerating codes

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
