Global addressing of shared data simplifies parallel programming and complements message passing models commonly found in distributed memory machines. A number of programming sys...
Beng-Hong Lim, Chi-Chao Chang, Grzegorz Czajkowski...
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
This paper describes several algorithms to perform all-to-all communication on a two-dimensional mesh-connected computer with wormhole routing. We discuss both direct algorithms, ...
In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we show that using sampling techniques, on some networks, sele...
We examine the ability of CMPs, due to their lower on-chip communication latencies, to exploit data parallelism at inner-loop granularities similar to that commonly targeted by vec...