Abstract
Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1 ≤ e < r. Hence, a natural question is how much information do we need to access in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e=1 ; however, all the information needs to be accessed for the rebuilding of the parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework where any r erasures can be corrected by accessing some of the remaining information, and any e=1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1 ≤ e ≤ r ; 2) we construct systematic codes that achieve optimal rebuilding ratio of 1/r, for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.
Original language | English |
---|---|
Article number | 7762203 |
Pages (from-to) | 1084-1101 |
Number of pages | 18 |
Journal | IEEE Transactions on Information Theory |
Volume | 63 |
Issue number | 2 |
DOIs | |
State | Published - Feb 2017 |
Keywords
- Distributed storage
- correcting erasures and errors
- multiple erasures
- regenerating codes
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences