Sciweavers

657 search results - page 54 / 132
» Analysis of Multithreaded Architectures for Parallel Computi...
Sort
View
DAC
2008
ACM
14 years 11 months ago
Federation: repurposing scalar cores for out-of-order instruction issue
Future SoCs will contain multiple cores. For workloads with significant parallelism, prior work has shown the benefit of many small, multi-threaded, scalar cores. For workloads th...
David Tarjan, Michael Boyer, Kevin Skadron
EUROPAR
2007
Springer
14 years 4 months ago
Search Strategies for Automatic Performance Analysis Tools
Periscope is a distributed automatic online performance analysis system for large scale parallel systems. It consists of a set of analysis agents distributed on the parallel machin...
Michael Gerndt, Edmond Kereku
IPPS
2010
IEEE
13 years 8 months ago
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures
This work presents the first extensive study of singlenode performance optimization, tuning, and analysis of the fast multipole method (FMM) on modern multicore systems. We consid...
Aparna Chandramowlishwaran, Samuel Williams, Leoni...
IPPS
2009
IEEE
14 years 5 months ago
Exploring the effect of block shapes on the performance of sparse kernels
In this paper we explore the impact of the block shape on blocked and vectorized versions of the Sparse Matrix-Vector Multiplication (SpMV) kernel and build upon previous work by ...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
HPCA
2005
IEEE
14 years 10 months ago
Performance, Energy, and Thermal Considerations for SMT and CMP Architectures
Simultaneous multithreading (SMT) and chip multiprocessing (CMP) both allow a chip to achieve greater throughput, but their relative energy-efficiency and thermal properties are s...
Yingmin Li, David Brooks, Zhigang Hu, Kevin Skadro...