Sciweavers

» High Performance Matrix Multiplication on Many Cores
ISCA
2012
IEEE
11 years 10 months ago
Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems
When multiple processor (CPU) cores and a GPU integrated together on the same chip share the off-chip main memory, requests from the GPU can heavily interfere with requests from t...
Rachata Ausavarungnirun, Kevin Kai-Wei Chang, Lava...
ICS
2007
Tsinghua U.
14 years 1 months ago
Adaptive Strassen's matrix multiplication
Strassen’s matrix multiplication (MM) has benefits with respect to any (highly tuned) implementations of MM because Strassen’s reduces the total number of operations. Strasse...
Paolo D'Alberto, Alexandru Nicolau
EUROPAR
2009
Springer
14 years 11 days ago
Two-Dimensional Matrix Partitioning for Parallel Computing on Heterogeneous Processors Based on Their Functional Performance Mod
The functional performance model (FPM) of heterogeneous processors has proven to be more realistic than the traditional models because it integrates many important featur...
Alexey L. Lastovetsky, Ravi Reddy
ACIVS
2006
Springer
14 years 1 months ago
Dedicated Hardware for Real-Time Computation of Second-Order Statistical Features for High Resolution Images
We present a novel dedicated hardware system for the extraction of second-order statistical features from high-resolution images. The selected features are based on gray level co-o...
Dimitris G. Bariamis, Dimitrios K. Iakovidis, Dimi...
WCE
2007
13 years 9 months ago
Sparse Matrix Multiplication Using UPC
Partitioned global address space (PGAS) languages, such as Unified Parallel C (UPC), have the promise of being productive. Due to the shared address space view that they provide,...
Hoda El-Sayed, Eric Wright