Barrier synchronizations can be very expensive in a multiprogramming environment because no process can go past a barrier until all the processes have arrived. If a process ...
Parallel computers are now commonly used for computational science and engineering, and many applications in these areas use random number generators. For some applications, such ...
In order to achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store instructions. This paper...
Bryan Black, Brian Mueller, Stephanie Postal, Ryan...
In this paper we propose and evaluate the Adaptive++ technique, a novel runtime-only data prefetching strategy for software-based distributed shared-memory systems (software DSMs)...
Ricardo Bianchini, Raquel Pinto, Claudio Luis de A...