A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Content replication and distribution is an effective technology to reduce the response time for web accesses and has been proven quite popular among large Internet cont...
We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...
Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...
This paper addresses the issue of fault tolerance in applications that make use of network storage. A network abstraction called the Network Storage Stack is presented, along with...
Scott Atchley, Stephen Soltesz, James S. Plank, Mi...
In this paper, we present DKS(N, k, f), a family of infrastructures for building Peer-To-Peer applications. Each instance of DKS(N, k, f) is a fully decentralized overlay network ...
Luc Onana Alima, Sameh El-Ansary, Per Brand, Seif ...