Highly available storage systems use replication or other forms of redundancy to recover from component failures. If parity data calculated with an erasure-correcting code is not updated or is otherwise corrupted, recovery after a failure yields mostly garbled data instead of the correct data. This paper presents an algebraic signature scheme that can detect parity discrepancies for parity calculated with XOR, generalized Reed-Solomon codes, or convolutional array codes. Maintaining and checking the signatures of client and parity data allows us to ensure coherence in the storage system and thus to rebuild the data on lost devices accurately. Our scheme is combined with disk scrubbing, which is necessary to detect masked disk failures.
Thomas J. E. Schwarz
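
To make the idea of checking parity through signatures concrete, the following is a minimal sketch for the XOR-parity case only. It assumes a single-symbol signature of the form sig(B) = b_0 + b_1·α + b_2·α² + ... over GF(2^8); the field size, the primitive polynomial 0x11D, the element α = 2, and all function names are illustrative assumptions and not the paper's actual construction, which may use different parameters and multi-symbol signatures. Because such a signature is linear, the signature of an XOR parity block should equal the XOR of the data block signatures, so a stale parity block can be detected without comparing full blocks.

```python
from functools import reduce

PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)
ALPHA = 0x02       # primitive element for this polynomial (assumed)


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements by shift-and-add, reducing by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def signature(block: bytes) -> int:
    """Single-symbol signature: XOR-sum of block[i] * ALPHA**i in GF(2^8)."""
    sig = 0
    power = 1  # ALPHA**0
    for symbol in block:
        sig ^= gf_mul(symbol, power)
        power = gf_mul(power, ALPHA)
    return sig


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks (the XOR parity)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_is_consistent(data_blocks, parity: bytes) -> bool:
    """Compare the XOR of the data signatures with the parity signature."""
    expected = 0
    for block in data_blocks:
        expected ^= signature(block)
    return expected == signature(parity)


# Hypothetical example data: two client blocks and their XOR parity.
d0 = bytes(range(16))
d1 = bytes(range(100, 116))
parity = xor_blocks(d0, d1)

# A write that changes d0 but never reaches the parity block.
d0_updated = bytes([d0[0] ^ 0x5A]) + d0[1:]

consistent_before = parity_is_consistent([d0, d1], parity)
consistent_after = parity_is_consistent([d0_updated, d1], parity)
```

In this sketch a change to a single byte always alters the signature, since the difference is multiplied by a nonzero power of α. With a one-byte signature, larger changes can collide, which is one reason a practical scheme would likely use longer signatures.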