Abstract. Nested data-parallel programs often have large memory requirements due to their high degree of parallelism. Piecewise execution is an implementation technique used to min...
This paper considers the issue of dynamic task control in the context of a parallel Haskell implementation on the GRIP multiprocessor. For the first time, we report the effect of our ...
Kevin Hammond, James S. Mattson Jr., Simon L. Peyt...
Abstract. Embedded Runge-Kutta methods are among the most popular methods for the solution of non-stiff initial value problems of ordinary differential equations (ODEs). We investi...
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications modeling elect...
Data-parallel primitives for performing operations on the PM1 quadtree and the bucket PMR quadtree are presented using the scan model. Algorithms are described for building these ...