The efforts of an expert to parallelize and optimize a dense linear algebra algorithm for distributed-memory targets are largely mechanical and repetitive. We demonstrate that the...
Bryan Marker, Andy Terrel, Jack Poulson, Don S. Ba...
CPHASH is a concurrent hash table for multicore processors. CPHASH partitions its table across the caches of cores and uses message passing to transfer lookups/inserts to a partit...
Zviad Metreveli, Nickolai Zeldovich, M. Frans Kaas...
The linked-list data structure is fundamental and ubiquitous. Lock-free versions of the linked-list are well known. However, the existence of a practical wait-free linked-list has ...
Shahar Timnat, Anastasia Braginsky, Alex Kogan, Er...
Lock-freedom is a progress guarantee that ensures overall program progress. Wait-freedom is a stronger progress guarantee that ensures the progress of each thread in the program. ...
In program debugging, reproducibility of bugs is a key requirement. Unfortunately, bugs in concurrent programs are notoriously difficult to reproduce because bugs due to concurre...