Speedups demonstrated for finding the biconnected components of a graph: 9x to 33x on the Explicit Multi-Threading (XMT) many-core computing platform relative to the best serial ...
Much of dense linear algebra has been successfully blocked to concentrate the majority of its time in the Level 3 BLAS, which are not only efficient for serial computation, but...
We develop a serial algorithm for separable median filtering that requires only two comparisons per element when the window size is three. In addition, fast parallel CREW PRAM al...
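As a point of reference for the window-size-three case, the following is a minimal sketch of a plain separable median filter: a 1-D median-of-three pass along the rows, followed by the same pass along the columns. It is not the two-comparison algorithm from the abstract; it uses the straightforward min/max formulation of the median of three. The function names (`median3`, `median_filter_1d`, `separable_median_filter`) and the choice to leave border elements unchanged are assumptions made for illustration.

```python
def median3(a, b, c):
    """Return the median of three values using min/max operations."""
    return max(min(a, b), min(max(a, b), c))


def median_filter_1d(values):
    """Apply a window-3 median filter to a sequence.

    Assumption: the first and last elements are copied unchanged,
    since they do not have a full window of three neighbors.
    """
    n = len(values)
    if n < 3:
        return list(values)
    out = [values[0]]
    for i in range(1, n - 1):
        out.append(median3(values[i - 1], values[i], values[i + 1]))
    out.append(values[-1])
    return out


def separable_median_filter(image):
    """Filter each row, then each column, of a 2-D list of numbers."""
    # Pass 1: filter along rows.
    rows = [median_filter_1d(row) for row in image]
    # Pass 2: transpose, filter along what were the columns, transpose back.
    cols = [median_filter_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    noisy = [
        [1, 2, 3],
        [4, 100, 6],
        [7, 8, 9],
    ]
    print(separable_median_filter(noisy))
```

In this sketch the outlier `100` in the center of the small test image is replaced during the row pass, because the median of `4, 100, 6` is `6`. Note that a separable median filter is generally not identical to a true 2-D median over the 3x3 neighborhood; it is a cheaper approximation built from 1-D passes.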