Sciweavers

656 search results - page 22 / 132
» Scalable Parallel Matrix Multiplication on Distributed Memor...
CLUSTER
2006
IEEE
14 years 1 month ago
Matrix Multiplication on Two Interconnected Processors
This paper presents a new partitioning algorithm to perform matrix multiplication on two interconnected heterogeneous processors. Data is partitioned in a way which minimizes the ...
Brett A. Becker, Alexey L. Lastovetsky
IPPS
2006
IEEE
14 years 1 month ago
Architecture of a multi-context FPGA using a hybrid multiple-valued/binary context switching signal
Multi-context FPGAs have multiple memory bits per configuration bit forming configuration planes for fast switching between contexts. Large amount of memory causes significant ove...
Yoshihiro Nakatani, Masanori Hariyama, Michitaka K...
PC
2002
13 years 7 months ago
On parallel block algorithms for exact triangularizations
We present a new parallel algorithm to compute an exact triangularization of large square or rectangular and dense or sparse matrices in any field. Using fast matrix multiplicatio...
Jean-Guillaume Dumas, Jean-Louis Roch
ICS
1992
Tsinghua U.
13 years 11 months ago
Optimizing for parallelism and data locality
Previous research has used program transformation to introduce parallelism and to exploit data locality. Unfortunately, these two objectives have usually been considered independentl...
Ken Kennedy, Kathryn S. McKinley
PARLE
1994
13 years 11 months ago
Run-Time Optimization of Sparse Matrix-Vector Multiplication on SIMD Machines
Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-v...
Louis H. Ziantz, Can C. Özturan, Boleslaw K. ...