Abstract. This paper presents a study of performance optimization of dense matrix multiplication on IBM Cyclops-64(C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
Abstract. Definitions for the uniform representation of d-dimensional matrices serially in Morton-order (or Z-order) support both their use with cartesian indices, and their divide...
The paper analyzes the performance of parallel global optimization algorithm, which is used to optimize grillage-type foundations. The parallel algorithm is obtained by using the a...
Milda Baravykaite, Rimantas Belevicius, Raimondas ...
In this paper, we use the tensor product notation as the framework of a programming methodology for designing block recursive algorithms. We first express a computational problem ...
This paper presents a new partitioning algorithm to perform matrix multiplication on two interconnected heterogeneous processors. Data is partitioned in a way which minimizes the ...