We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. We consider supern...
In previous work we presented an algorithm for cloning parallel simulations that enables multiple simulated execution paths to be explored simultaneously. The method is targeted f...
The emergence of highly parallel computing platforms is enabling new trade-offs in algorithm design for automatic speech recognition. It naturally motivates the following investig...
Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keut...
We present a parallel priority data structure that improves the running time of certain algorithms for problems that lack a fast and work-efficient parallel solution. As a main a...
Bulk Synchronous Parallel ML (BSML) is a functional data-parallel language for programming bulk synchronous parallel (BSP) algorithms. The execution time can be estimated...