Commercial cloud offerings, such as Amazon's EC2, let users allocate compute resources on demand, charging based on reserved time intervals. While this gives great flexibilit...
Despite great efforts on the design of ultra-reliable components, the increase of system size and complexity has outpaced the improvement of component reliability. As a result, fa...
Jiexing Gu, Ziming Zheng, Zhiling Lan, John White,...
Declustered data organizations in disk arrays (RAIDs) achieve less-intrusive reconstruction of data after a disk failure. We present PDDL, a new data layout for declustered disk a...
Thomas J. E. Schwarz, Jesse Steinberg, Walter A. B...
Clusters of workstations are increasingly being viewed as a cost-effective alternative to parallel supercomputers. However, resource management and scheduling on workstations clust...
Abdur Chowdhury, Lisa D. Nicklas, Sanjeev Setia, E...
Failure detectors represent a very important building block in distributed applications. The speed and the accuracy of the failure detectors are critical to the performance of the ...