This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Commodity accelerator technologies including reconfigurable devices provide an order of magnitude performance improvement compared to mainstream microprocessor systems. A number o...
Sadaf R. Alam, Jeffrey S. Vetter, Melissa C. Smith
The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the applica...
Clusters of high-end workstations and PCs are currently used in many application domains to perform large-scale computations or as scalable servers for I/O bound tasks. Although cl...
Current research in parallel programming is focused on closing the gap between globally indexed algorithms and the separate address spaces of processors on distributed memory mult...