In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page fau...
Robert D. Blumofe, Matteo Frigo, Christopher F. Jo...
Computing the solution to a system of linear equations is a fundamental problem in scientific computing, and its acceleration has drawn wide interest in the FPGA community [1–3]...
—Much of dense linear algebra has been successfully blocked to concentrate the majority of its time in the Level 3 BLAS, which are not only efficient for serial computation, but...
To meet growing terabit link rates, highly parallel and scalable architectures are needed for IP lookup engines in next generation routers. This paper proposes an SRAM-based multi...
- We present a parallel conjugate gradient solver for the Poisson problem optimized for multi-GPU platforms. Our approach includes a novel heuristic Poisson preconditioner well sui...