Sciweavers

131 search results - page 12 / 27
» Automatic thread distribution for nested parallelism in Open...
PAAPP
2002
13 years 7 months ago
Performance of PDE solvers on a self-optimizing NUMA architecture
Abstract. The performance of shared-memory (OpenMP) implementations of three different PDE solver kernels representing finite difference methods, finite volume methods, and spectra...
Sverker Holmgren, Markus Nordén, Jarmo Rant...
IPPS
2006
IEEE
14 years 1 months ago
Recent advances in checkpoint/recovery systems
Checkpoint and Recovery (CPR) systems have many uses in high-performance computing. Because of this, many developers have implemented CPR by hand in their applications. One of ...
Greg Bronevetsky, Rohit Fernandes, Daniel Marques,...
HPCA
1998
IEEE
13 years 11 months ago
The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization
As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that microprocessors will exploit having multiple parallel threads. To achieve t...
J. Gregory Steffan, Todd C. Mowry
ICS
2001
Tsinghua U.
13 years 12 months ago
Computer aided hand tuning (CAHT): "applying case-based reasoning to performance tuning"
For most parallel and high performance systems, tuning guides provide users with advice on optimizing the execution time of their programs. Execution time may be very sensitive...
Antoine Monsifrot, François Bodin
IPPS
2002
IEEE
14 years 10 days ago
Implementing the NAS Benchmark MG in SAC
SAC is a purely functional array processing language designed with numerical applications in mind. It supports generic, high-level program specifications in the style of APL. How...
Clemens Grelck