Emerging microprocessors offer unprecedented parallel computing capabilities and deeper memory hierarchies, increasing the importance of loop transformations in optimizing compile...
Selection of spaceborne computing platforms requires balance among several competing factors. Traditional performance analysis techniques are ill-suited for this purpose due to the...
We present a novel distributed algorithm for the maximal independent set (MIS) problem. On bounded-independence graphs (BIG) our deterministic algorithm finishes in O(log n) time,...
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
Improving memory performance at the software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...