Sciweavers

224 search results - page 6 / 45
» A Flexible Class of Parallel Matrix Multiplication Algorithm...
CLUSTER
2002
IEEE
14 years 3 months ago
Mixed Mode Matrix Multiplication
In modern clustering environments where the memory hierarchy has many layers (distributed memory, shared memory layer, cache, ...), an important question is how to fully u...
Meng-Shiou Wu, Srinivas Aluru, Ricky A. Kendall
EUROPAR
2006
Springer
14 years 2 months ago
Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences
This paper presents a study of performance optimization of dense matrix multiplication on IBM Cyclops-64 (C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
EUROPAR
2009
Springer
14 years 5 months ago
High Performance Matrix Multiplication on Many Cores
Moore’s Law suggests that the number of processing cores on a single chip increases exponentially. The future performance increases will be mainly extracted from thread-level par...
Nan Yuan, Yongbin Zhou, Guangming Tan, Junchao Zha...
PPSC
1997
14 years 5 days ago
Parallel Extensions to the Matrix Template Library
We present the preliminary design for a C++ template library to enable the compositional construction of matrix classes suitable for high performance numerical linear algebra comp...
Andrew Lumsdaine, Brian C. McCandless
PC
2002
13 years 10 months ago
On parallel block algorithms for exact triangularizations
We present a new parallel algorithm to compute an exact triangularization of large square or rectangular and dense or sparse matrices in any field. Using fast matrix multiplicatio...
Jean-Guillaume Dumas, Jean-Louis Roch