Excessive power consumption is becoming a major barrier to extracting the maximum performance from high-performance parallel systems. Therefore, techniques oriented towards reduci...
Abstract-- Multicore microprocessors have been largely motivated by the diminishing returns in performance and the increased power consumption of single-threaded ILP microprocessor...
Matthew Curtis-Maury, Karan Singh, Sally A. McKee,...
In this paper, we introduce the new technique of HighConfidence Software Monitoring (HCSM), which allows one to perform software monitoring with bounded overhead and concomitantl...
Sean Callanan, David J. Dean, Michael Gorbovitski,...
—Coordinated Checkpoint/Restart (C/R) is a widely deployed strategy to achieve fault-tolerance. However, C/R by itself is not capable enough to meet the demands of upcoming exasc...
— Existing workflow systems attempt to achieve high performance by intelligently scheduling tasks on resources, sometimes even attempting to move the largest data files on the hi...