Sciweavers

36 search results - page 7 / 8
» LAPACK: a portable linear algebra library for high-performan...
Sort
View
ICS
1993
Tsinghua U.
13 years 11 months ago
Static and Dynamic Evaluation of Data Dependence Analysis
—Data dependence analysis techniques are the main component of today’s strategies for automatic detection of parallelism. Parallelism detection strategies are being incorporate...
Paul Petersen, David A. Padua
IPPS
2010
IEEE
13 years 5 months ago
Tile QR factorization with parallel panel processing for multicore architectures
To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of...
Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack D...
ARCS
2008
Springer
13 years 9 months ago
An Optimized ZGEMM Implementation for the Cell BE
: The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
PVM
2010
Springer
13 years 5 months ago
Massively Parallel Finite Element Programming
Abstract. Today’s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data struc...
Timo Heister, Martin Kronbichler, Wolfgang Bangert...
ICCS
2009
Springer
14 years 2 months ago
A Note on Auto-tuning GEMM for GPUs
The development of high performance dense linear algebra (DLA) critically depends on highly optimized BLAS, and especially on the matrix multiplication routine (GEMM). This is espe...
Yinan Li, Jack Dongarra, Stanimire Tomov