The computational power and memory bandwidth of graphics processing units (GPUs) have turned them into attractive platforms for general-purpose applications. In this paper, we expl...
Antonio Ruiz, Manuel Ujaldon, Jose Antonio Andrade...
In System on Chip (SoC) design, growing design complexity has esigners to start designs at higher abstraction levels. This paper proposes an SoC design methodology that makes full...
Power consumption and DRAM latencies are serious concerns in modern chip-multiprocessor (CMP or multi-core) based compute systems. The management of the DRAM row buffer can signif...
Kshitij Sudan, Niladrish Chatterjee, David Nellans...
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
PM-PVM is a portable implementation of PVM designed to work on SMP architectures supporting multithreading. PM-PVM portability is achieved through the implementation of the PVM fu...