We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
—Accelerators are special purpose processors designed to speed up compute-intensive sections of applications. Two extreme endpoints in the spectrum of possible accelerators are F...
Shuai Che, Jie Li, Jeremy W. Sheaffer, Kevin Skadr...
We discuss the mapping of elementary ray tracing operations-acceleration structure traversal and primitive intersection--onto wide SIMD/SIMT machines. Our focus is on NVIDIA GPUs,...
With the increasing programmability of GPUs (graphics processing units), these units are emerging as an attractive computing platform not only for traditional graphics computation ...