Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...
In the future, if we are to continue to expect improved application performance we will have to achieve it by exploiting course-grained hardware parallelism rather then simply rel...
With the increasing availability of high-performance massively parallel computer systems, the prevalence of sophisticated scientific simulation has grown rapidly. The complexity ...
Felipe Bertrand, Randall Bramley, Alan Sussman, Da...
We present an analysis which takes as its input a sequential program, augmented with annotations indicating potential parallelization opportunities, and a sequential proof, writte...
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performan...