Basic Linear Algebra Subprograms

SPAA
2010
ACM

Managing the complexity of lookahead for LU factorization with pivoting

15 years 5 months ago

We describe parallel implementations of LU factorization with pivoting for multicore architectures. Implementations that differ along two dimensions are discussed: (1) usin...

Ernie Chan, Robert A. van de Geijn, Andrew Chapman

ERSA
2007

High-Precision BLAS on FPGA-enhanced Computers

15 years 8 months ago

Download www.umac.mo

The emergence of high-density reconfigurable hardware devices gives scientists and engineers an option for accelerating their numerical computing applications on low-cost but power...

Chuan He, Guan Qin, Richard E. Ewing, Wei Zhao

ARCS
2008
Springer

An Optimized ZGEMM Implementation for the Cell BE

15 years 8 months ago

Download www.unixer.de

The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...

Timo Schneider, Torsten Hoefler, Simon Wunderlich,...

PARA
1995
Springer

A Proposal for a Set of Parallel Basic Linear Algebra Subprograms

15 years 10 months ago

Download phase.hpcc.jp

This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix...

Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...

ISPDC
2008
IEEE

Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters

16 years 29 days ago

Download hcl.ucd.ie

This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for hetero...

Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...

IPPS
2009
IEEE

Generation of Synthetic Floating-Point benchmark circuits

16 years 1 months ago

Download www.cse.cuhk.edu.hk

Synthetic Floating-Point (SFP), a synthetic benchmark generator program for floating-point circuits, is presented. SFP consists of two independent modules for characterisation and...

T. Chun Pong Chau, S. Man Ho Ho, Philip H. W. Leon...

SC
2009
ACM

Automating the generation of composed linear algebra kernels

16 years 1 months ago

Download ecee.colorado.edu

Memory bandwidth limits the performance of important kernels in many scientific applications. Such applications often use sequences of Basic Linear Algebra Subprograms (BLAS), an...

Geoffrey Belter, Elizabeth R. Jessup, Ian Karlin, ...
