Recent studies show that, voltage scaling, which is an efficient energy management technique, has a direct and negative effect on system reliability because of the increased rate...
Thread migration moves a single call-stack to another machine to improve either load balancing or locality. Current approaches for checkpointing and thread migration are either no...
ct Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
PC grids represent massive computation capacity at a low cost, but are challenging to employ for parallel computing because of variable and unpredictable performance and availabili...
Nagarajan Kanna, Jaspal Subhlok, Edgar Gabriel, Es...
We generalize the notion of slice introduced in our earlier paper [6]. A slice of a distributed computation with respect to a global predicate is the smallest computation that cont...