Grid Resource Discovery Service is currently a very important focus of research. We propose a scheme that presents essential characteristics for efficient, self-configuring and fau...
Dynamic resource management is a crucial part of the infrastructure for emerging distributed real-time embedded systems, responsible for keeping mission-critical applications opera...
Paul Rubel, Joseph P. Loyall, Richard E. Schantz, ...
New application scenarios, such as Internet-scale computations, nomadic networks and mobile systems, require decentralized, scalable and open infrastructures. The peerto-peer (P2P...
To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...