Abstract. Current solutions for fault-tolerance in HPC systems focus on dealing with the result of a failure. However, most are unable to handle runtime system configuration change...
Using virtual machines as a resource provisioning mechanism offers multiple benefits, most recently exploited by “infrastructure-as-a-service” clouds, but also poses several...
A challenging issue facing Grid communities is that while Grids can provide access to many heterogeneous resources, the resources to which access is provided often do not match th...
Ian T. Foster, Timothy Freeman, Katarzyna Keahey, ...
Data access in HPC infrastructures is realized via user-level networking and OS-bypass techniques through which nodes can communicate with high bandwidth and low-latency. Virtualiz...
The Data Diffusion Machine is a scalable virtual shared memory architecture. A hierarchical network is used to ensure that all data can be located in a time bounded by O(logp), wh...
Henk L. Muller, Paul W. A. Stallard, David H. D. W...