Sciweavers

104 search results - page 10 / 21
» Optimization Schemas for Parallel Implementation of Nondeter...
CC
2012
Springer
System Software
13 years 11 months ago
Improving Performance of OpenCL on CPUs
Abstract. Data-parallel languages like OpenCL and CUDA are an important means to exploit the computational power of today’s computing devices. In this paper, we deal with two asp...
Ralf Karrenberg, Sebastian Hack
CONCURRENCY
2000
15 years 3 months ago
Wide-area parallel programming using the remote method invocation model
Java's support for parallel and distributed processing makes the language attractive for metacomputing applications, such as parallel applications that run on geographically ...
Rob van Nieuwpoort, Jason Maassen, Henri E. Bal, T...
LCTRTS
2005
Springer
15 years 9 months ago
Cache aware optimization of stream programs
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
ASPLOS
2009
ACM
16 years 4 months ago
DMP: deterministic shared memory multiprocessing
Current shared memory multicore and multiprocessor systems are nondeterministic. Each time these systems execute a multithreaded application, even if supplied with the same input,...
Joseph Devietti, Brandon Lucia, Luis Ceze, Mark Os...
ICFP
2007
ACM
16 years 3 months ago
Feedback directed implicit parallelism
In this paper we present an automated way of using spare CPU resources within a shared memory multi-processor or multi-core machine. Our approach is (i) to profile the execution o...
Tim Harris, Satnam Singh