The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the...
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this ...
Giuseppe Romanazzi, Peter K. Jimack, Christopher E...
A design pattern is a mechanism for encapsulating the knowledge of experienced designers into a re-usable artifact. Parallel design patterns reflect commonly occurring parallel co...
Kai Tan, Duane Szafron, Jonathan Schaeffer, John A...
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
The use of multi-core, multi-processor machines is opening new opportunities for software speculation, where program code is speculatively executed to improve performance at the a...