Sciweavers

1521 search results - page 211 / 305
» Kernelized map matching
Sort
View
EUROGRAPHICS
2010
Eurographics
14 years 4 months ago
Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing
We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose a standard ray tracing algorithm into several data-parallel stages tha...
Kirill Garanzha and Charles Loop
IPPS
2009
IEEE
14 years 2 months ago
Parallel solvers for dense linear systems for heterogeneous computational clusters
This paper describes the design and the implementation of parallel routines in the Heterogeneous ScaLAPACK library that solve a dense system of linear equations. This library is w...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
IPPS
2009
IEEE
14 years 2 months ago
Singular value decomposition on GPU using CUDA
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
Sheetal Lahabar, P. J. Narayanan
IPPS
2009
IEEE
14 years 2 months ago
Understanding the design trade-offs among current multicore systems for numerical computations
In this paper, we empirically evaluate fundamental design trade-offs among the most recent multicore processors and accelerator technologies. Our primary aim is to aid application...
Seunghwa Kang, David A. Bader, Richard W. Vuduc
CLUSTER
2008
IEEE
14 years 2 months ago
Context-aware address translation for high performance SMP cluster system
—User-level communication allows an application process to access the network interface directly. Bypassing the kernel requires that a user process accesses the network interface...
Moon-Sang Lee, Joonwon Lee, Seungryoul Maeng