This paper explores area/parallelism tradeo s in the design of distributed shared-memory (DSM) multiprocessors built out of large single-chip computing nodes. In this context, are...
Heterogeneous multi-core processors, such as the IBM Cell processor, can deliver high performance. However, these processors are notoriously difficult to program: different cores...
In this paper, we identify transaction-local memory as a major source of overhead from compiler instrumentation in software transactional memory (STM). Transaction-local memory is...
Aleksandar Dragojevic, Yang Ni, Ali-Reza Adl-Tabat...
Most modern Chip Multiprocessors (CMP) feature shared cache on chip. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
In general-purpose microprocessors, recent trends have pushed towards 64-bit word widths, primarily to accommodate the large addressing needs of some programs. Many integer proble...