This paper describes the architecture and implementation of a Java-based appliance for collaborative review of crashes involving injured children in order to determine mechanisms o...
Next generation applications and architectures (for example, Grids) are driving radical changes in the nature of traffic, service models, technology, and cost, creating opportunit...
Tal Lavian, Joe Mambretti, Doug Cutrell, Howard J....
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H...
As high performance clusters continue to grow in size, the mean time between failures shrinks. Thus, the issues of fault tolerance and reliability are becoming one of the challengi...
This paper presents three contributions to research on middleware load balancing. First, it describes the design of Cygnus, which is an extensible open-source middleware framework...
Jaiganesh Balasubramanian, Douglas C. Schmidt, Law...