Sciweavers

420 search results - page 4 / 84
» Scalable Parallel Programming with CUDA
Sort
View
PPOPP
2011
ACM
12 years 10 months ago
GRace: a low-overhead mechanism for detecting data races in GPU programs
In recent years, GPUs have emerged as an extremely cost-effective means for achieving high performance. Many application developers, including those with no prior parallel program...
Mai Zheng, Vignesh T. Ravi, Feng Qin, Gagan Agrawa...
IPPS
2010
IEEE
13 years 5 months ago
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
Zheng Wei, Joseph JáJá
CGO
2010
IEEE
13 years 11 months ago
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
In this paper we describe techniques for compiling finegrained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Pr...
John A. Stratton, Vinod Grover, Jaydeep Marathe, B...
CGF
2008
168views more  CGF 2008»
13 years 7 months ago
Interactive Global Illumination for Deformable Geometry in CUDA
Interactive global illumination for fully deformable scenes with dynamic relighting is currently a very elusive goal in the area of realistic rendering. In this work we propose a ...
Arne Schmitz, Markus Tavenrath, Leif Kobbelt
ISPASS
2009
IEEE
14 years 2 months ago
Analyzing CUDA workloads using a detailed GPU simulator
Modern Graphic Processing Units (GPUs) provide sufficiently flexible programming models that understanding their performance can provide insight in designing tomorrow’s manyco...
Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, He...