
CLUSTER
2011
IEEE

Achieving Scalable Parallelization for the Hessenberg Factorization

Much of dense linear algebra has been successfully blocked to concentrate the majority of its time in the Level 3 BLAS, which are not only efficient for serial computation but also scale well for parallelism. However, for the Hessenberg factorization, a critical step in computing eigenvalues and eigenvectors, performance of the best known algorithm is still strongly limited by memory speed, which does not tend to scale well at all. In this paper we present an adaptation of our Parallel Cache Assignment (PCA) technique to the Hessenberg factorization, and show that it achieves superlinear speedup over the corresponding serial algorithm and a more than fourfold speedup over the best known algorithm for small and medium-sized problems.
Anthony M. Castaldo, R. Clint Whaley
Added 18 Dec 2011
Updated 18 Dec 2011
Type Journal
Year 2011
Where CLUSTER
Authors Anthony M. Castaldo, R. Clint Whaley