Sciweavers

131 search results - page 8 / 27
» Automatic thread distribution for nested parallelism in Open...
IPPS
2005
IEEE
14 years 1 month ago
Automated Analysis of Memory Access Behavior
Abstract: We developed an automated environment to measure the memory access behavior of applications on high performance clusters. Code optimization for processor caches is cruc...
Michael Gerndt, Tianchao Li
ICCS
2004
Springer
14 years 24 days ago
Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers
On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality, geogra...
Henrik Löf, Markus Nordén, Sverker Hol...
HPCA
2002
IEEE
14 years 7 months ago
CableS: Thread Control and Memory Management Extensions for Shared Virtual Memory Clusters
Clusters of high-end workstations and PCs are currently used in many application domains to perform large-scale computations or as scalable servers for I/O bound tasks. Although c...
Peter Jamieson, Angelos Bilas
ISHPC
2003
Springer
14 years 18 days ago
Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor
We developed a multithreaded parallel implementation of a sequence alignment algorithm that is able to align whole genomes with reliable output and reasonable cost. This paper pres...
Juan del Cuvillo, Xinmin Tian, Guang R. Gao, Milin...
ICPP
2009
IEEE
14 years 2 months ago
LeWI: A Runtime Balancing Algorithm for Nested Parallelism
Abstract: We present LeWI, a novel load balancing algorithm that can balance applications with very different patterns of imbalance. Our algorithm can balance fine grain imbalan...
Marta Garcia, Julita Corbalán, Jesús...