Sciweavers

201 search results - page 25 / 41
» Estimating the Parallel Start-Up Overhead for Parallelizing ...
Sort
View
PPOPP
2010
ACM
14 years 4 months ago
Using data structure knowledge for efficient lock generation and strong atomicity
To achieve high-performance on multicore systems, sharedmemory parallel languages must efficiently implement atomic operations. The commonly used and studied paradigms for atomici...
Gautam Upadhyaya, Samuel P. Midkiff, Vijay S. Pai
SPAA
2009
ACM
14 years 4 months ago
Optimizing transactions for captured memory
In this paper, we identify transaction-local memory as a major source of overhead from compiler instrumentation in software transactional memory (STM). Transaction-local memory is...
Aleksandar Dragojevic, Yang Ni, Ali-Reza Adl-Tabat...
PLDI
2004
ACM
14 years 29 days ago
Min-cut program decomposition for thread-level speculation
With billion-transistor chips on the horizon, single-chip multiprocessors (CMPs) are likely to become commodity components. Speculative CMPs use hardware to enforce dependence, al...
Troy A. Johnson, Rudolf Eigenmann, T. N. Vijaykuma...
PLDI
1998
ACM
13 years 11 months ago
The Implementation of the Cilk-5 Multithreaded Language
The fth release of the multithreaded language Cilk uses a provably good \work-stealing" scheduling algorithm similar to the rst system, but the language has been completely r...
Matteo Frigo, Charles E. Leiserson, Keith H. Randa...
PPOPP
2010
ACM
14 years 4 months ago
Lazy binary-splitting: a run-time adaptive work-stealing scheduler
We present Lazy Binary Splitting (LBS), a user-level scheduler of nested parallelism for shared-memory multiprocessors that builds on existing Eager Binary Splitting work-stealing...
Alexandros Tzannes, George C. Caragea, Rajeev Baru...