Sciweavers

481 search results - page 23 / 97
» Performance Modeling and Measurement of Parallelized Code fo...
Sort
View
129
Voted
HPCA
2007
IEEE
15 years 8 months ago
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors
The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the...
Haakon Dybdahl, Per Stenström
204
Voted
AES
2011
Springer
232views Cryptology» more  AES 2011»
14 years 2 months ago
Reliable performance prediction for multigrid software on distributed memory systems
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this ...
Giuseppe Romanazzi, Peter K. Jimack, Christopher E...
154
Voted
PPOPP
2003
ACM
15 years 7 months ago
Using generative design patterns to generate parallel code for a distributed memory environment
A design pattern is a mechanism for encapsulating the knowledge of experienced designers into a re-usable artifact. Parallel design patterns reflect commonly occurring parallel co...
Kai Tan, Duane Szafron, Jonathan Schaeffer, John A...
140
Voted
PODC
1994
ACM
15 years 6 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes
144
Voted
IEEEPACT
2007
IEEE
15 years 8 months ago
Fast Track: Supporting Unsafe Optimizations with Software Speculation
The use of multi-core, multi-processor machines is opening new opportunities for software speculation, where program code is speculatively executed to improve performance at the a...
Kirk Kelsey, Chengliang Zhang, Chen Ding