Sciweavers

» Automatic data and computation decomposition on distributed ...
HPCA
2008
IEEE
14 years 8 months ago
Runahead Threads to improve SMT performance
In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded...
Tanausú Ramírez, Alex Pajuelo, Olive...
PDP
2010
IEEE
14 years 2 months ago
Lessons Learnt Porting Parallelisation Techniques for Irregular Codes to NUMA Systems
This work presents a study undertaken to characterise the behaviour of some parallelisation techniques for irregular codes, previously developed for SMP architectures, on a seve...
Juan Angel Lorenzo, Juan Carlos Pichel, David LaFr...
DCOSS
2005
Springer
14 years 1 month ago
A Local Facility Location Algorithm for Sensor Networks
In this paper we address a well-known facility location problem (FLP) in a sensor network environment. The problem deals with finding the optimal way to provide service to a (poss...
Denis Krivitski, Assaf Schuster, Ran Wolff
SCOPES
2004
Springer
14 years 27 days ago
An Integer Linear Programming Approach to Classify the Communication in Process Networks
New embedded signal processing architectures are emerging that are composed of loosely coupled heterogeneous components like CPUs or DSPs, specialized IP cores, reconfigurable uni...
Alexandru Turjan, Bart Kienhuis, Ed F. Deprettere
CCGRID
2011
IEEE
12 years 11 months ago
Small Discrete Fourier Transforms on GPUs
Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
S. Mitra, A. Srinivasan