OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is ...
Ramachandra C. Nanjegowda, Oscar Hernandez, Barbar...
This paper presents the initial design of the Cyclops-64 (C64) system software infrastructure and tools under development as a joint effort between IBM T.J. Watson Research Center...
Juan del Cuvillo, Weirong Zhu, Ziang Hu, Guang R. ...
We present a study of multithreaded implementations of Thorup’s algorithm for solving the Single Source Shortest Path (SSSP) problem for undirected graphs. Our implementations l...
Joseph R. Crobak, Jonathan W. Berry, Kamesh Maddur...
Search-based graph queries, such as finding short paths and isomorphic subgraphs, are dominated by memory latency. If input graphs can be partitioned appropriately, large cluster...
Jonathan W. Berry, Bruce Hendrickson, Simon Kahan,...
Simultaneous multithreading (SMT) represents a fundamental shift in processor capability. SMT's ability to execute multiple threads simultaneously within a single CPU offers ...