—Remote atomic memory operations are critical for achieving high-performance synchronization in tightly-coupled systems. Previous approaches to implementing atomic memory operati...
Keith D. Underwood, Michael Levenhagen, K. Scott H...
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
Parabix (parallel bit streams for XML) is an open-source XML parser that employs the SIMD (single-instruction multiple-data) capabilities of modern-day commodity processors to del...
Tracing algorithms visit reachable nodes in a graph and are central to activities such as garbage collection, marshalling etc. Traditional sequential algorithms use a worklist, re...
Power consumption and DRAM latencies are serious concerns in modern chip-multiprocessor (CMP or multi-core) based compute systems. The management of the DRAM row buffer can signif...
Kshitij Sudan, Niladrish Chatterjee, David Nellans...