The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications modeling elect...
We describe the parallelization of an efficient algorithm for balanced truncation that makes it possible to reduce models with state-space dimension up to O(10^5). The major computational tas...
This paper describes the development of the PALS system, an implementation of Prolog that efficiently exploits or-parallelism on share-nothing platforms. PALS makes use of a novel ...
In this paper we present an efficient algorithm for compile-time scheduling and clustering of parallel programs onto parallel processing systems with distributed memory, which is ...
In this work we introduce and analyze algorithms for fractal image compression on massively parallel SIMD arrays. The different algorithms discussed differ significantly in terms of ...