The trend towards many-core systems comes with various issues, among them their highly dynamic and non-predictable workloads. Hence, new paradigms for managing resources of many-c...
Sebastian Kobbe, Lars Bauer, Daniel Lohmann, Wolfg...
Performing experimental evaluation of fault-tolerant distributed systems is a complex and tedious task, and automating as much as possible of the execution and evaluation of exper...
Despite many efforts, the predominant practice of debugging a distributed system is still printf-based log mining, which is both tedious and error-prone. In this paper, we present...
This paper describes a novel scheme for job admission and resource allocation employed by the SODA scheduler in System S. Capable of processing enormous quantitie...
Joel L. Wolf, Nikhil Bansal, Kirsten Hildrum, Suja...
With the increasing functionality and complexity of distributed systems, resource failures are inevitable. While numerous models and algorithms for dealing with failures exist, th...
Derrick Kondo, Bahman Javadi, Alexandru Iosup, Dic...