Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
In recent years, exciting technological advances have been made in the development of flexible electronics. These technologies offer the opportunity to weave computation, communicat...
Roozbeh Jafari, Foad Dabiri, Philip Brisk, Majid S...
Large clusters, high-availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
We present a fault-tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
The paradigm of mobile agents offers a powerful and flexible way to develop distributed applications at a high level of abstraction. One of the most interesting tasks for mobile ag...
Hartmut Vogler, Marie-Luise Moschgath, Thomas Kunk...