Fault tolerance requirements for near-term disk array storage systems are analyzed. The excellent reliability provided by RAID Level 5 data organization is seen to be insufficient for these systems. We consider various alternatives (improved MTBF and MTTR times, as well as smaller reliability groups and increased numbers of check disks per group) to obtain the necessary improved reliability. The paper begins by introducing two data organization schemes based on maximum distance separable error correcting codes. Several figures of merit are calculated using a standard Markov failure and repair model for these organizations. Based on these results, the multiple check disk approach to improved reliability is an excellent option.
Walter A. Burkhard, Jai Menon