At the core of contemporary high performance computer systems is the communication infrastructure. For this reason, there has been a lot of work on providing low-latency, high-ban...
Sven Karlsson, Stavros Passas, George Kotsis, Ange...
Synchronization operations, such as fence and locking, are used in many parallel operations accessing shared memory. However, a process which is blocked waiting for a fence operat...
Darius Buntinas, Amina Saify, Dhabaleswar K. Panda...
Abstract—The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...
Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instructio...
Abstract. Shared counters are the key to solving a variety of coordination problems on multiprocessor machines, such as barrier synchronization and index distribution. It is desire...