This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Over the past decade, system architectures have started on a clear trend towards increased parallelism and heterogeneity, often resulting in speedups of 10x to 100x. Despite numer...
The increasing availability of commodity multicore processors is making parallel computing available to the masses. Traditional parallel languages are largely intended for large-s...
Matthew Fluet, Mike Rainey, John H. Reppy, Adam Sh...
A fully automatic, compiler-driven approach to parallelisation can result in unpredictable time and space costs for compiled code. On the other hand, a fully manual approach to pa...
We present the architecture of nreduce, a distributed virtual machine which uses parallel graph reduction to run programs across a set of computers. It executes code written in a ...
Peter M. Kelly, Paul D. Coddington, Andrew L. Wend...