Content Delivery Networks (CDNs) provide efficient support for serving HTTP and streaming media content while minimizing the network impact of content delivery as well...
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
We consider reliable multicast in overlay networks where nodes have finite-size buffers and are subject to failures. We address issues of end-to-end reliability and th...
This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in orde...
As the desire of scientists to perform ever larger computations drives the size of today’s high performance computers from hundreds, to thousands, and even tens of thousands of ...
Communication and node failures degrade the ability of a service discovery protocol to ensure Users receive the correct service information when the service changes. We propose th...
Vasughi Sundramoorthy, Pieter H. Hartel, Hans Scho...
This paper describes the implementation of a processor-group membership protocol in an experimental real-time network. The protocol is appropriate for fault-tolerant distributed sy...
Delaunay triangulation (DT) is a useful geometric structure for networking applications. In this paper we investigate the design of join, leave, and maintenance protocols to const...
This paper presents a scalable, adaptive and time-bounded general approach to assure reliable, real-time Node-Failure Detection (NFD) for large-scale, high load networks comprised ...
Matthew Gillen, Kurt Rohloff, Prakash Manghwani, R...
The success of sensor-driven applications is reliant on whether a steady stream of data can be provided by the underlying system. This need, however, poses great challenges to sen...