Sciweavers

481 search results - page 60 / 97
» Performance Modeling and Measurement of Parallelized Code fo...
ICA3PP
2005
Springer
14 years 2 months ago
A Practical Comparison of Cluster Operating Systems Implementing Sequential and Transactional Consistency
Shared memory is an interesting communication paradigm for SMP machines and clusters. Weak consistency models have been proposed to improve the efficiency of shared memory applications...
Stefan Frenz, Renaud Lottiaux, Michael Schött...
IPPS
2006
IEEE
14 years 2 months ago
Reducing the associativity and size of step caches in CRCW operation
Step caches are caches in which data entered into a cache array is kept valid only until the end of the ongoing step of execution. Together with an advanced pipelined multithreaded arc...
M. Forsell
ICPP
1994
IEEE
14 years 28 days ago
Cachier: A Tool for Automatically Inserting CICO Annotations
Shared memory in a parallel computer provides programmers with the valuable abstraction of a shared address space, through which any part of a computation can access any datum. Although un...
Trishul M. Chilimbi, James R. Larus
HPCA
2009
IEEE
14 years 9 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura
IPPS
2006
IEEE
14 years 2 months ago
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
Due to shared cache contention and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...