Sciweavers

39 search results - page 4 / 8
» Optimized Dense Matrix Multiplication on a Many-Core Archite...
ICCS
2009
Springer
14 years 2 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise; therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
ICCS
2009
Springer
14 years 2 months ago
A Note on Auto-tuning GEMM for GPUs
The development of high performance dense linear algebra (DLA) critically depends on highly optimized BLAS, and especially on the matrix multiplication routine (GEMM). This is espe...
Yinan Li, Jack Dongarra, Stanimire Tomov
DAC
2004
ACM
14 years 8 months ago
Sparse transformations and preconditioners for hierarchical 3-D capacitance extraction with multiple dielectrics
Capacitance extraction is an important problem that has been extensively studied. This paper presents a significant improvement for the fast multipole accelerated boundary element...
Shu Yan, Vivek Sarin, Weiping Shi
CSE
2009
IEEE
13 years 11 months ago
A Comparative Study of Blocking Storage Methods for Sparse Matrices on Multicore Architectures
Sparse Matrix-Vector multiplication (SpMV) is a very challenging computational kernel, since its performance depends greatly on both the input matrix and the underlying architectur...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
CORR
2010
Springer
13 years 7 months ago
Integer-Forcing Linear Receivers
Linear receivers are often used to reduce the implementation complexity of multiple antenna systems. In a traditional linear receiver architecture, the receive antennas a...
Jiening Zhan, Bobak Nazer, Uri Erez, Michael Gastp...