Sciweavers

14 search results - page 3 / 3
» High-performance implementation of the level-3 BLAS
IPPS
2009
IEEE
14 years 3 months ago
Singular value decomposition on GPU using CUDA
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
Sheetal Lahabar, P. J. Narayanan
IPPS
2006
IEEE
14 years 2 months ago
Parallel ICA methods for EEG neuroimaging
HiPerSAT, a C++ library and tools, processes EEG data sets with ICA (Independent Component Analysis) methods. HiPerSAT uses BLAS, LAPACK, MPI and OpenMP to achieve a high performa...
D. B. Keith, C. C. Hoge, Robert M. Frank, Allen D....
CORR
2007
Springer
13 years 8 months ago
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in or...
Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack...
PC
2002
13 years 8 months ago
Optimizing noncontiguous accesses in MPI-IO
The I/O access patterns of many parallel applications consist of accesses to a large number of small, noncontiguous pieces of data. If an application's I/O needs are met by m...
Rajeev Thakur, William Gropp, Ewing L. Lusk