Sciweavers

624 search results - page 30 / 125
» High Performance Matrix Multiplication on Many Cores
IPPS
2006
IEEE
14 years 3 months ago
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
The designs of high-performance processor architectures are moving toward the integration of a large number of multiple processing cores on a single chip. The IBM Cyclops-64 (C64)...
Yingping Zhang, Taikyeong Jeong, Fei Chen, Haiping...
PPOPP
2010
ACM
14 years 6 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
DATE
2002
IEEE
14 years 2 months ago
High-Speed Non-Linear Asynchronous Pipelines
Many approaches recently proposed for high-speed asynchronous pipelines are applicable only to linear datapaths. However, real systems typically have non-linearities in their data...
Recep O. Ozdag, Peter A. Beerel, Montek Singh, Ste...
WIRN
2005
Springer
14 years 2 months ago
Ensembles Based on Random Projections to Improve the Accuracy of Clustering Algorithms
We present an algorithmic scheme for unsupervised cluster ensembles, based on randomized projections between metric spaces, by which a substantial dimensionality reduction is obtai...
Alberto Bertoni, Giorgio Valentini
SIAMSC
2010
13 years 7 months ago
Weighted Matrix Ordering and Parallel Banded Preconditioners for Iterative Linear System Solvers
The emergence of multicore architectures and highly scalable platforms motivates the development of novel algorithms and techniques that emphasize concurrency and are tolerant of ...
Murat Manguoglu, Mehmet Koyutürk, Ahmed H. Sa...