ir increasing use of abstraction, modularity, delayed binding, polymorphism, and source reuse, especially when these attributes are used in combination. Modern processor architectu...
Hemant G. Rotithor, Kevin W. Harris, Mark W. Davis
Effective use of communication networks is critical to the performance and scalability of parallel applications. Partitioned Global Address Space languages like UPC bring the pro...
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
The advantages of pattern-based programming have been well-documented in the sequential literature. However patterns have yet to make their way into mainstream parallel computing,...
Steven Bromling, Steve MacDonald, John Anvik, Jona...
Improving memory performance at software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...