We present a fast and scalable matrix multiplication algorithm on distributed memory concurrent computers, whose performance is independent of data distribution on processors, and...
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelize...
We present a new fast and scalable matrix multiplication algorithm, called DIMMA (Distribution-Independent Matrix Multiplication Algorithm), for block cyclic data distribution on ...
The known fast sequential algorithms for multiplying two N × N matrices (over an arbitrary ring) have time complexity O(N^α), where 2 < α ≤ 3. The current best value of α is less than 2.3755....
Simplified order-16 Integer Cosine Transform (ICT) has been proved to be an efficient coding tool especially for High-Definition (HD) video coding and is much simpler than ICT a...
Jie Dong, King Ngi Ngan, Chi-Keung Fong, Wai-kuen ...