Sciweavers

» Performance Modeling and Measurement of Parallelized Code fo...
HPCA
2007
IEEE
14 years 1 month ago
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors
The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the...
Haakon Dybdahl, Per Stenström
AES
2011
Springer
12 years 7 months ago
Reliable performance prediction for multigrid software on distributed memory systems
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this ...
Giuseppe Romanazzi, Peter K. Jimack, Christopher E...
PPOPP
2003
ACM
14 years 24 days ago
Using generative design patterns to generate parallel code for a distributed memory environment
A design pattern is a mechanism for encapsulating the knowledge of experienced designers into a re-usable artifact. Parallel design patterns reflect commonly occurring parallel co...
Kai Tan, Duane Szafron, Jonathan Schaeffer, John A...
PODC
1994
ACM
13 years 11 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes
IEEEPACT
2007
IEEE
14 years 1 month ago
Fast Track: Supporting Unsafe Optimizations with Software Speculation
The use of multi-core, multi-processor machines is opening new opportunities for software speculation, where program code is speculatively executed to improve performance at the a...
Kirk Kelsey, Chengliang Zhang, Chen Ding