Sciweavers

244 search results - page 22 / 49
ICPP
2009
IEEE
14 years 2 months ago
LeWI: A Runtime Balancing Algorithm for Nested Parallelism
We present LeWI, a novel load balancing algorithm that can balance applications with very different patterns of imbalance. Our algorithm can balance fine grain imbalan...
Marta Garcia, Julita Corbalán, Jesús...
PLDI
2000
ACM
13 years 12 months ago
Exploiting superword level parallelism with multimedia instruction sets
Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general purpose microprocessors. This added functionality comes pri...
Samuel Larsen, Saman P. Amarasinghe
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
ISPASS
2010
IEEE
13 years 5 months ago
Weak execution ordering - exploiting iterative methods on many-core GPUs
On NVIDIA's many-core GPUs, there is no synchronization function among parallel thread blocks. When fine granularity of data communication and synchronization is requ...
Jianmin Chen, Zhuo Huang, Feiqi Su, Jih-Kwon Peir,...
CVPR
2008
IEEE
14 years 9 months ago
A Parallel Decomposition Solver for SVM: Distributed dual ascend using Fenchel Duality
We introduce a distributed algorithm for solving large scale Support Vector Machines (SVM) problems. The algorithm divides the training set into a number of processing nodes each ...
Tamir Hazan, Amit Man, Amnon Shashua