During monitoring, instrumented long-running parallel applications generate huge amounts of instrumentation data. Processing and storing this data incurs overhead and perturbs the ...
Modern large-scale grid computing for processing advanced science and engineering applications relies on geographically distributed clusters. In such highly distributed environm...
Daniel M. Batista, Luciano Chaves, Nelson L. S. da...
Abstract. Performance modeling is important for implementing efficient parallel applications and runtime systems. The LogP model captures the relevant aspects of message passing i...
Detecting whether a finite execution trace (or a computation) of a distributed program satisfies a given predicate, called predicate detection, is a fundamental problem in distr...
Abstract. Detection of execution anomalies is very important for the maintenance, development, and performance refinement of large-scale distributed systems. Execution anomalies ...