Abstract. A parallel version of the self-verified method for solving linear systems was presented in [19, 18]. In this research we propose improvements aiming at a better performan...
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity ON , where 2 3. We show that such an algorithm can be parallelize...
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions involving large multi-dimensional arrays. The effi...
Daniel Cociorva, Xiaoyang Gao, Sandhya Krishnan, G...
Much research has been done in fast communication on clusters and in protocols for supporting software shared memory across them. However, the end performance of applications that...
As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performanc...