To achieve high-performance on multicore systems, sharedmemory parallel languages must efficiently implement atomic operations. The commonly used and studied paradigms for atomici...
In this paper, we identify transaction-local memory as a major source of overhead from compiler instrumentation in software transactional memory (STM). Transaction-local memory is...
Aleksandar Dragojevic, Yang Ni, Ali-Reza Adl-Tabat...
With billion-transistor chips on the horizon, single-chip multiprocessors (CMPs) are likely to become commodity components. Speculative CMPs use hardware to enforce dependence, al...
Troy A. Johnson, Rudolf Eigenmann, T. N. Vijaykuma...
The fth release of the multithreaded language Cilk uses a provably good \work-stealing" scheduling algorithm similar to the rst system, but the language has been completely r...
Matteo Frigo, Charles E. Leiserson, Keith H. Randa...
We present Lazy Binary Splitting (LBS), a user-level scheduler of nested parallelism for shared-memory multiprocessors that builds on existing Eager Binary Splitting work-stealing...
Alexandros Tzannes, George C. Caragea, Rajeev Baru...