Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity ON , where 2 3. We show that such an algorithm can be parallelize...
During the last half-decade, a number of research efforts have centered around developing software for generating automatically tuned matrix multiplication kernels. These include ...
John A. Gunnels, Fred G. Gustavson, Greg Henry, Ro...
The FFLAS project has established that exact matrix multiplication over finite fields can be performed at the speed of the highly optimized numerical BLAS routines. Since many a...
Abstract. In this paper we consider the problem of computing a minimum cycle basis in a graph G with m edges and n vertices. The edges of G have non-negative weights on them. The p...
Telikepalli Kavitha, Kurt Mehlhorn, Dimitrios Mich...
We reduce the problem of computing the rank and a nullspace basis of a univariate polynomial matrix to polynomial matrix multiplication. For an input n×n matrix of degree d over ...
We introduce a 64-bit ANSI/IEEE Std 754-1985 floating point design of a hardware matrix multiplier optimized for FPGA implementations. A general block matrix multiplication algor...
Yong Dou, Stamatis Vassiliadis, Georgi Kuzmanov, G...
We further develop the group-theoretic approach to fast matrix multiplication introduced by Cohn and Umans, and for the first time use it to derive algorithms asymptotically fast...
In a Batched Colored Intersection Searching Problem (CI), one is given a set of n geometric objects (of a certain class). Each object is colored by one of c colors, and the goal i...
This paper presents a new partitioning algorithm to perform matrix multiplication on two interconnected heterogeneous processors. Data is partitioned in a way which minimizes the ...
Moore’s Law suggests that the number of processing cores on a single chip increases exponentially. The future performance increases will be mainly extracted from thread-level par...
Nan Yuan, Yongbin Zhou, Guangming Tan, Junchao Zha...