Sciweavers

27 search results - page 5 / 6
» Parallel Cholesky Factorization of a Block Tridiagonal Matri...
EUROPAR
2005
Springer
14 years 1 month ago
Automatic Tuning of PDGEMM Towards Optimal Performance
Sophisticated parallel matrix multiplication algorithms like PDGEMM exhibit a complex structure and can be controlled by a large set of parameters including blocking factors and bl...
Sascha Hunold, Thomas Rauber
HPCC
2007
Springer
14 years 1 month ago
A Block JRS Algorithm for Highly Parallel Computation of SVDs
This paper presents a new algorithm for computing the singular value decomposition (SVD) on multilevel memory hierarchy architectures. This algorithm is based on one-sided JRS iter...
Mostafa I. Soliman, Sanguthevar Rajasekaran, Reda ...
ICPPW
2002
IEEE
14 years 13 days ago
A Programming Methodology for Designing Block Recursive Algorithms on Various Computer Networks
In this paper, we use the tensor product notation as the framework of a programming methodology for designing block recursive algorithms on various computer networks. In our previ...
Min-Hsuan Fan, Chua-Huang Huang, Yeh-Ching Chung
PPOPP
2010
ACM
14 years 4 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
ICPP
2002
IEEE
14 years 13 days ago
Analysis of Memory Hierarchy Performance of Block Data Layout
Recently, several experimental studies have been conducted on block data layout as a data transformation technique used in conjunction with tiling to improve cache performance. In...
Neungsoo Park, Bo Hong, Viktor K. Prasanna