Sciweavers

624 search results - page 30 / 125
» High Performance Matrix Multiplication on Many Cores
IPPS
2006
IEEE
15 years 10 months ago
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
The designs of high-performance processor architectures are moving toward the integration of a large number of multiple processing cores on a single chip. The IBM Cyclops-64 (C64)...
Yingping Zhang, Taikyeong Jeong, Fei Chen, Haiping...
PPOPP
2010
ACM
16 years 1 month ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
DATE
2002
IEEE
15 years 9 months ago
High-Speed Non-Linear Asynchronous Pipelines
Many approaches recently proposed for high-speed asynchronous pipelines are applicable only to linear datapaths. However, real systems typically have non-linearities in their data...
Recep O. Ozdag, Peter A. Beerel, Montek Singh, Ste...
WIRN
2005
Springer
15 years 9 months ago
Ensembles Based on Random Projections to Improve the Accuracy of Clustering Algorithms
We present an algorithmic scheme for unsupervised cluster ensembles, based on randomized projections between metric spaces, by which a substantial dimensionality reduction is obtai...
Alberto Bertoni, Giorgio Valentini
SIAMSC
2010
15 years 2 months ago
Weighted Matrix Ordering and Parallel Banded Preconditioners for Iterative Linear System Solvers
The emergence of multicore architectures and highly scalable platforms motivates the development of novel algorithms and techniques that emphasize concurrency and are tolerant of ...
Murat Manguoglu, Mehmet Koyutürk, Ahmed H. Sa...