Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Exploiting locality at run time complements compiler-based approaches for applications with dynamic memory access patterns. This paper proposes a memory-layout ...
The Self Distributing Virtual Machine (SDVM) is a parallel computing machine that consists of a cluster of commodity computers. The participating machines may have different comp...
We formulate the problem of delay-constrained energy-efficient broadcast in cooperative multihop wireless networks. We show that this important problem is not only NP-complete, ...
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...