— The potential of truly large scale grids can only be realized with grid architectures and deployment strategies that lower the need for human administrative intervention, and t...
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
— Wireless sensor-actuator networks, or WSANs, refer to a group of sensors and actuators which collect data from the environment and perform application-specific actions in resp...
Edith C. H. Ngai, Yangfan Zhou, Michael R. Lyu, Ji...
Fault tolerance is essential to the development of reliable mobile agent system in order to guarantee continuous execution of mobile agents. For this purpose, some previous works ...
SungJin Choi, MaengSoon Baik, HongSoo Kim, JunWeon...
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, rollback all ...