Sciweavers

624 search results - page 19 / 125
» High Performance Matrix Multiplication on Many Cores
Sort
View
CLUSTER
2007
IEEE
14 years 2 months ago
Balancing productivity and performance on the cell broadband engine
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
CSE
2008
IEEE
13 years 8 months ago
A High-Throughput Multi-cluster NoC Architecture
During the last years a large number of research works has focused on problems related to multi-core processors. Due to the possibilities of many cores, the number of opportunitie...
Henrique C. Freitas, Philippe Olivier Alexandre Na...
ISCA
1996
IEEE
99views Hardware» more  ISCA 1996»
13 years 12 months ago
High-Bandwidth Address Translation for Multiple-Issue Processors
In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing ...
Todd M. Austin, Gurindar S. Sohi
ISCOPE
1999
Springer
14 years 14 hour ago
Generic Graph Algorithms for Sparse Matrix Ordering
Fill-reducing sparse matrix orderings have been a topic of active research for many years. Although most such algorithms are developed and analyzed within a graph-theoretical frame...
Lie-Quan Lee, Jeremy G. Siek, Andrew Lumsdaine
IPPS
2009
IEEE
14 years 2 months ago
Singular value decomposition on GPU using CUDA
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
Sheetal Lahabar, P. J. Narayanan