We propose strategies to efficiently execute a query workload, which consists of multiple related queries submitted against a scientific dataset, on a distributed-memory system in...
We present a new cache oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormou...
Robert Strzodka, Mohammed Shaheen, Dawid Pajak, Ha...
Since the advent of electronic computing, the processors’ clock speed has risen tremendously. Now that energy efficiency requirements have stopped that trend, the number of proc...
In this paper we propose and evaluate the Adaptive++ technique, a novel runtime-only data prefetching strategy for software-based distributed shared-memory systems (software DSMs)...
Ricardo Bianchini, Raquel Pinto, Claudio Luis de A...
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...