In the recent past, several different methods for coordinating behavior in multi-robot teams have been proposed. Common to most of them is the use of communication to coordinate b...
Fault tolerance will be a fundamental imperative in the next decade as machines containing hundreds of thousands of cores will be installed at various locations. In this context, ...
Esteban Meneses, Celso L. Mendes, Laxmikant V. Kal...
This paper describes the issues confronted by the climateprediction.net project in creating a volunteer computing project using a large legacy climate model application. This appl...
In this paper, we present a mechanism to capture and reestablish the state of Java threads. We achieve this by extracting a thread's execution state from the application code ...
Eddy Truyen, Bert Robben, Bart Vanhaute, Tim Conin...
A hardware fault tolerance scheme for large multicomputers executing time-consuming non-interactive applications is described. Error detection and recovery are done mostly by so...