The Reliable Server Pooling (RSerPool) protocol suite, currently under standardization by the IETF, is designed to build systems providing highly available services by prov...
Thomas Dreibholz, Erwin P. Rathgeb, Michael Tü...
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata manage...
Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Da...
Checkpointing is a widely used mechanism for supporting fault tolerance, but it is notorious for its high-cost disk access. The idea of memory-based checkpointing has been extensively stu...
In Fine-Grained Cycle Sharing (FGCS) systems, machine owners voluntarily share their unused CPU cycles with guest jobs, as long as the performance degradation is tolerable. For gu...
Tanzima Zerin Islam, Saurabh Bagchi, Rudolf Eigenm...
Large-scale distributed systems are hard to deploy, and distributed hash tables (DHTs) are no exception. To lower the barriers facing DHT-based applications, we have created a pub...
Sean C. Rhea, Brighten Godfrey, Brad Karp, John Ku...