Considerable research has focused on scheduling dynamically arriving, independent parallel jobs on a given set of resources. Some recent work has also addressed differentiated service for different classes of jobs, using statically or dynamically calculated job priorities. However, the potential and usability of a Quality of Service (QoS) based scheme have received little study. In this paper, we extend QoPS, a previously proposed scheme for providing QoS to submitted jobs, in three respects: (i) studying how user tolerance towards missed deadlines affects the overall profit attainable by the supercomputer center, (ii) providing artificial slack to some jobs to maximize the overall profit, and (iii) utilizing a Kill-and-Restart mechanism to further improve the attainable profit.
Mohammad Islam, Pavan Balaji, P. Sadayappan, Dhaba