Competitive parallel execution (CPE) is a simple yet attractive technique to improve the performance of sequential programs on multi-core and multi-processor systems. A sequential...
—One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executin...
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
Modern GPUs offer much computing power at a very modest cost. Even though CUDA and other related recent developments are accelerating the use of GPUs for general purpose applicati...