The growth in complexity of modern systems makes it increasingly difficult to extract high-performance. The software stacks for such systems typically consist of multiple layers a...
While lower supply voltage is effective for energy reduction, it suffers performance loss. To mitigate the loss, we propose to execute only the part, which does not have any influ...
Parallel dataflow programs generate enormous amounts of distributed data that are short-lived, yet are critical for completion of the job and for good run-time performance. We ca...
Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil G...
— Architectural resources and program recurrences are the main limitations to the amount of Instruction-Level Parallelism (ILP) exploitable from loops, the most time-consuming pa...
In this article we present the design choices and the evaluation of a batch scheduler for large clusters, named OAR. This batch scheduler is based upon an original design that emp...
Nicolas Capit, Georges Da Costa, Yiannis Georgiou,...