Despite the reliability of modern disks, recent studies have made it clear that a new class of faults, Undetected Disk Errors (UDEs), also known as silent data corruption events, b...
Eric Rozier, Wendy Belluomini, Veera Deenadhayalan...
As Internet applications become larger and more complex, the task of managing them becomes overwhelming. “Abnormal” events such as software updates, failures, attacks, and hots...
Peter Van Roy, Seif Haridi, Alexander Reinefeld, J...
Long-running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
The high demand for large-scale storage capacity calls for massive storage solutions with high-performance interconnects. Although cluster file systems are rap...
As the disks typically found in personal computers grow larger, protecting data by replicating it on a collection of “peer” systems rather than on dedicated high performance s...
Dmitry Brodsky, Michael J. Feeley, Norman C. Hutch...