Sciweavers

155 search results - page 23 / 31
» On the Automatic Parallelization of the Perfect Benchmarks
PPOPP
2006
ACM
14 years 1 month ago
Optimizing irregular shared-memory applications for distributed-memory systems
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
Ayon Basumallik, Rudolf Eigenmann
ISCA
1994
IEEE
13 years 12 months ago
Evaluating Stream Buffers as a Secondary Cache Replacement
Today's commodity microprocessors require a low latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast dat...
Subbarao Palacharla, Richard E. Kessler
MIDDLEWARE
2007
Springer
14 years 2 months ago
Garbage Collecting the Grid: A Complete DGC for Activities
Abstract. Grids are becoming more and more dynamic, running parallel applications on large scale and heterogeneous resources. Explicitly stopping a whole distributed application is...
Denis Caromel, Guillaume Chazarain, Ludovic Henrio
SC
2005
ACM
14 years 1 month ago
The MHETA Execution Model for Heterogeneous Clusters
The availability of inexpensive “off the shelf” machines increases the likelihood that parallel programs run on heterogeneous clusters of machines. These programs are increasi...
Mario Nakazawa, David K. Lowenthal, Wenduo Zhou
ICCS
2004
Springer
14 years 1 month ago
Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers
On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality, geogra...
Henrik Löf, Markus Nordén, Sverker Hol...