A central problem in executing performance critical parallel and distributed applications on shared networks is the selection of computation nodes and communication paths for exec...
Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...
The use of threads is becoming commonplace in both sequential and parallel programs. This paper describes our design and initial experience with non-trace based performance instru...
Writing parallel applications for computational grids is a challenging task. To achieve good performance, algorithms designed for local area networks must be adapted to the differ...
Thilo Kielmann, Rutger F. H. Hofman, Henri E. Bal,...