Abstract -- Detection of execution anomalies is very important for the maintenance, development, and performance refinement of large-scale distributed systems. Execution anomalies ...
We examine the task of concurrently computing alternative solutions to a problem. We restrict our interest to the case where only one of the solutions is needed; in this case we n...
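As a rough illustration of computing alternatives concurrently when only one solution is needed, the sketch below races several callables and returns whichever finishes first. It is not the method of the abstract above: the names `first_result`, `slow`, and `fast` are hypothetical, and the use of Python threads (with `cancel_futures`, which requires Python 3.9 or later) is an assumption made only for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def first_result(alternatives):
    """Run all alternatives concurrently; return the result of the first to finish.

    Hypothetical helper for illustration only. If the first alternative to
    finish raised an exception, that exception propagates to the caller.
    """
    pool = ThreadPoolExecutor(max_workers=len(alternatives))
    futures = [pool.submit(fn) for fn in alternatives]
    # Block until at least one alternative has completed.
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    # Do not wait for the losers. Futures that have not started are cancelled;
    # threads that are already running cannot be interrupted and run to completion.
    pool.shutdown(wait=False, cancel_futures=True)
    return next(iter(done)).result()


def slow():
    time.sleep(0.5)
    return "slow"


def fast():
    time.sleep(0.01)
    return "fast"


winner = first_result([slow, fast])
print(winner)
```

The work done by the losing alternatives is wasted here, which is the cost any such scheme has to weigh against the time saved by not guessing the fastest alternative in advance.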
Deploying Grid technologies by distributing an application over several machines has been widely used for scientific simulations, which have large requirements for computational r...
In this paper, we present ParaPART, a parallel version of a mesh partitioning tool, called PART, for distributed systems. PART takes into consideration the heterogeneities in proce...
We present S, the first system to provide transparent, low-overhead application record-replay and the ability to go live from replayed execution. S i...