Sciweavers

624 search results - page 17 / 125
» High Performance Matrix Multiplication on Many Cores
Sort
View
ARCS
2008
Springer
13 years 9 months ago
An Optimized ZGEMM Implementation for the Cell BE
: The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
HPCA
2006
IEEE
14 years 8 months ago
Last level cache (LLC) performance of data mining workloads on a CMP - a case study of parallel bioinformatics workloads
With the continuing growth in the amount of genetic data, members of the bioinformatics community are developing a variety of data-mining applications to understand the data and d...
Aamer Jaleel, Matthew Mattina, Bruce L. Jacob
MICRO
2009
IEEE
120views Hardware» more  MICRO 2009»
14 years 2 months ago
Reducing peak power with a table-driven adaptive processor core
The increasing power dissipation of current processors and processor cores constrains design options, increases packaging and cooling costs, increases power delivery costs, and de...
Vasileios Kontorinis, Amirali Shayan, Dean M. Tull...
CASCON
2008
164views Education» more  CASCON 2008»
13 years 9 months ago
High performance XML parsing using parallel bit stream technology
Parabix (parallel bit streams for XML) is an open-source XML parser that employs the SIMD (single-instruction multiple-data) capabilities of modern-day commodity processors to del...
Robert D. Cameron, Kenneth S. Herdy, Dan Lin
PPOPP
2006
ACM
14 years 1 months ago
McRT-STM: a high performance software transactional memory system for a multi-core runtime
Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed t...
Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hu...