— Fault tolerance in MPI becomes a main issue in the HPC community. Several approaches are envisioned from user or programmer controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
We design a self-stabilizing cluster routing algorithm based on the link-cluster architecture of wireless ad hoc networks. The network is divided into clusters. Each cluster has a...
Doina Bein, Ajoy Kumar Datta, Chakradhar R. Jaggan...
We investigate the behaviour of load-adaptive rerouting policies in the Wardrop model where decisions must be made on the basis of stale information. In this model, an infinite n...
Disk subsystem is known to be a major contributor to overall power consumption of high-end parallel systems. Past research proposed several architectural level techniques to reduc...
Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir, A...
We are developing Virtuoso, a system for distributed computing using virtual machines (VMs). Virtuoso must be able to mix batch and interactive VMs on the same physical hardware, ...