The distributed telecommunications sector not only requires minimal time to market, but also software that is reliable, available, maintainable and scalable. High level programmin...
Tw o parallel programming models represented b y OpenMP and MPI are compared for PDE solvers based on regular sparse numerical operators. As a typical representative of such an app...
No cache based techniques for roll-forward fault recovery exist at present. A split-cache approach is proposed that provides e cient support for checkpointing and roll-forward fau...
Commercial Scale-Out is a new research project at IBM Research. Its main goal is to investigate and develop technologies for the use of large scale parallelism in commercial appli...
Validating distributed systems is particularly difficult, since failures may occur due to a correlated occurrence of faults in different parts of the system. This paper describes ...
Michel Cukier, Ramesh Chandra, David Henke, Jessic...