Process migration has been used to perform specialized tasks, such as load sharing and checkpointing/restarting long-running applications. Implementation typically consists of modifi...
This paper presents a new modulo scheduling algorithm for clustered microarchitectures. The main feature of the proposed scheme is that the assignment of instructions to clusters ...
Extracting performance from many-core architectures requires software engineers to create multi-threaded applications, which significantly complicates the already daunting task of...
A structured approach to parallel programming allows applications to be constructed by composing skeletons, i.e., recurring patterns of task- and data-parallelism. First academic and co...
The growing use of parallel distributed architectures to solve large-scale scientific problems has created a need for performance prediction for both determi...