Sciweavers

DAGSTUHL
2008
15 years 3 months ago
Improving the Performance of a Verified Linear System Solver Using Optimized Libraries and Parallel Computation
Abstract. A parallel version of the self-verified method for solving linear systems was presented in [19, 18]. In this research we propose improvements aiming at a better performan...
Mariana Luderitz Kolberg, Gerd Bohlender, Dalcidio...
IPPS
2000
IEEE
15 years 6 months ago
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelize...
Keqin Li
IPPS
2003
IEEE
15 years 7 months ago
Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions involving large multi-dimensional arrays. The effi...
Daniel Cociorva, Xiaoyang Gao, Sandhya Krishnan, G...
HPCA
1999
IEEE
15 years 6 months ago
Limits to the Performance of Software Shared Memory: A Layered Approach
Much research has been done in fast communication on clusters and in protocols for supporting software shared memory across them. However, the end performance of applications that...
Angelos Bilas, Dongming Jiang, Yuanyuan Zhou, Jasw...
CLUSTER
2006
IEEE
15 years 8 months ago
Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters
As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performanc...
Lei Chai, Albert Hartono, Dhabaleswar K. Panda