Sciweavers

5 search results - page 1 / 1
» Optimizing data permutations for SIMD devices
Sort
View
PLDI
2006
ACM
14 years 4 months ago
Optimizing data permutations for SIMD devices
The widespread presence of SIMD devices in today’s microprocessors has made compiler techniques for these devices tremendously important. One of the most important and difficul...
Gang Ren, Peng Wu, David A. Padua
APPT
2009
Springer
14 years 5 months ago
Performance Improvement of Multimedia Kernels by Alleviating Overhead Instructions on SIMD Devices
SIMD extension is one of the most common and effective technique to exploit data-level parallelism in today’s processor designs. However, the performance of SIMD architectures i...
Asadollah Shahbahrami, Ben H. H. Juurlink
DATE
2003
IEEE
90views Hardware» more  DATE 2003»
14 years 4 months ago
Interactive Ray Tracing on Reconfigurable SIMD MorphoSys
MorphoSys is a reconfigurable SIMD architecture. In this paper, a BSP-based ray tracing is gracefully mapped onto MorphoSys. The mapping highly exploits ray-tracing parallelism. A...
Haitao Du, Marcos Sanchez-Elez, Nozar Tabrizi, Nad...
INTENSIVE
2009
IEEE
14 years 5 months ago
Accelerating K-Means on the Graphics Processor via CUDA
In this paper an optimized k-means implementation on the graphics processing unit (GPU) is presented. NVIDIA’s Compute Unified Device Architecture (CUDA), available from the G8...
Mario Zechner, Michael Granitzer
AIPR
2008
IEEE
14 years 26 days ago
Low-cost, high-speed computer vision using NVIDIA's CUDA architecture
In this paper, we introduce real time image processing techniques using modern programmable Graphic Processing Units (GPU). GPUs are SIMD (Single Instruction, Multiple Data) device...
Seung In Park, Sean P. Ponce, Jing Huang, Yong Cao...