Typical computational grid users target only a single cluster and have to estimate the runtime of their jobs. Job schedulers prefer short-running jobs to maintain a high system ut...
Michael Klemm, Matthias Bezold, Stefan Gabriel, Ro...
This paper argues for an alternative way of designing coordination models for parallel and distributed environments based on a complete symmetry between and decoupling of producers...
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
Computer systems are increasingly parallel and heterogeneous, while programs are still largely written in sequential languages. The obvious suggestion that the compiler should auto...
Monitoring is a widely-used technique to check assumptions about the real-time behavior of a system, debug the code, or enforce the system to react if certain deadlines are passed...
Daniel Mahrenholz, Olaf Spinczyk, Wolfgang Schr...