Sciweavers

2020 search results - page 301 / 404
» PAPER - Accelerating parallel evaluations of ROCS
Sort
View
PPOPP
2009
ACM
16 years 4 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
PPOPP
2009
ACM
16 years 4 months ago
MPIWiz: subgroup reproducible replay of mpi applications
Message Passing Interface (MPI) is a widely used standard for managing coarse-grained concurrency on distributed computers. Debugging parallel MPI applications, however, has alway...
Ruini Xue, Xuezheng Liu, Ming Wu, Zhenyu Guo, Weng...
PPOPP
2010
ACM
16 years 1 months ago
Is transactional programming actually easier?
Chip multi-processors (CMPs) have become ubiquitous, while tools that ease concurrent programming have not. The promise of increased performance for all applications through ever ...
Christopher J. Rossbach, Owen S. Hofmann, Emmett W...
PPOPP
2010
ACM
16 years 1 months ago
Using data structure knowledge for efficient lock generation and strong atomicity
To achieve high-performance on multicore systems, sharedmemory parallel languages must efficiently implement atomic operations. The commonly used and studied paradigms for atomici...
Gautam Upadhyaya, Samuel P. Midkiff, Vijay S. Pai
CLUSTER
2007
IEEE
15 years 10 months ago
Balancing productivity and performance on the cell broadband engine
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...