This paper presents a new compiler optimization algorithm that parallelizes applications for symmetric, sharedmemory multiprocessors. The algorithm considers data locality, parall...
In this paper, we present an efficient algorithm, called CASS-II, for task clustering without task duplication. Unlike the DSC algorithm, which is empirically the best known algor...
We present a parallel code generation algorithm for complete applications and a new experimental methodology that tests the efficacy of our approach. The algorithm optimizes for d...
In this article we introduce a grid runtime system called TGrid which is designed to run hierarchically structured task-parallel programs on heterogenous environments and can also...
As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that microprocessors will exploit having multiple parallel threads. To achieve t...