Sciweavers

» Optimizing Memory System Performance for Communication in Pa...
DAGSTUHL
2008
13 years 8 months ago
Improving the Performance of a Verified Linear System Solver Using Optimized Libraries and Parallel Computation
Abstract. A parallel version of the self-verified method for solving linear systems was presented in [19, 18]. In this research we propose improvements aiming at a better performan...
Mariana Luderitz Kolberg, Gerd Bohlender, Dalcidio...
IPPS
2000
IEEE
13 years 11 months ago
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity $O(N^\alpha)$, where $2 < \alpha \le 3$. We show that such an algorithm can be parallelize...
Keqin Li
IPPS
2003
IEEE
14 years 8 days ago
Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions involving large multi-dimensional arrays. The effi...
Daniel Cociorva, Xiaoyang Gao, Sandhya Krishnan, G...
HPCA
1999
IEEE
13 years 11 months ago
Limits to the Performance of Software Shared Memory: A Layered Approach
Much research has been done in fast communication on clusters and in protocols for supporting software shared memory across them. However, the end performance of applications that...
Angelos Bilas, Dongming Jiang, Yuanyuan Zhou, Jasw...
CLUSTER
2006
IEEE
14 years 1 months ago
Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters
As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performanc...
Lei Chai, Albert Hartono, Dhabaleswar K. Panda