Sciweavers

82 search results - page 9 / 17
» How much parallelism is there in irregular applications
Sort
View
IPPS
2003
IEEE
14 years 28 days ago
ECO: An Empirical-Based Compilation and Optimization System
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in h...
Nastaran Baradaran, Jacqueline Chame, Chun Chen, P...
EUROPAR
2008
Springer
13 years 9 months ago
Empirical Analysis of a Large-Scale Hierarchical Storage System
To prepare for future peta- or exa-scale computing, it is important to gain a good understanding on what impacts a hierarchical storage system would have on the performance of data...
Weikuan Yu, Sarp Oral, Shane Canon, Jeffrey S. Vet...
VLSISP
2008
173views more  VLSISP 2008»
13 years 7 months ago
Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence...
Yedidya Hilewitz, Ruby B. Lee
EUROPAR
2006
Springer
13 years 11 months ago
Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences
Abstract. This paper presents a study of performance optimization of dense matrix multiplication on IBM Cyclops-64(C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
SOSP
1989
ACM
13 years 8 months ago
Performance of Firefly RPC
In this paper, we report on the performance of the remote procedure call implementation for the Firefly multiprocessor and analyze the implementation to account precisely for all ...
Michael D. Schroeder, Michael Burrows