We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU: cyclic reduction (CR), parallel cyclic reduction (...
The shift from single to multiple core architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunate...
Emery D. Berger, Ting Yang, Tongping Liu, Gene Nov...
In this paper, we propose an approach to automatic compiler parallelization based on language extensions that is applicable to a broader range of program structures and a...
This paper presents a parallel external-memory algorithm for performing a breadth-first traversal of an implicit graph on a cluster of workstations. The algorithm is a paralle...
The particle filter is a powerful visual tracking tool based on the sequential Monte Carlo framework, and it needs large numbers of samples to properly approximate the posterior density...
Guangyu Zhu, Dawei Liang, Yang Liu, Qingming Huang...