Sciweavers

127 search results - page 25 / 26
» Parallel Performance Prediction for Multigrid Codes on Distr...
Sort
View
HPCA
2006
IEEE
14 years 2 months ago
Increasing the cache efficiency by eliminating noise
Caches are very inefficiently utilized because not all the excess data fetched into the cache, to exploit spatial locality, is utilized. We define cache utilization as the percent...
Prateek Pujara, Aneesh Aggarwal
SPAA
1993
ACM
14 years 19 days ago
Supporting Sets of Arbitrary Connections on iWarp Through Communication Context Switches
In this paper we introduce the ConSet communication model for distributed memory parallel computers. The communication needs of an application program can be satisfied by some ar...
Anja Feldmann, Thomas Stricker, Thomas E. Warfel
WMPI
2004
ACM
14 years 2 months ago
The Opie compiler from row-major source to Morton-ordered matrices
The Opie Project aims to develop a compiler to transform C codes written for row-major matrix representation into equivalent codes for Morton-order matrix representation, and to a...
Steven T. Gabriel, David S. Wise
PPOPP
2009
ACM
14 years 9 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
DAC
2003
ACM
14 years 9 months ago
Using estimates from behavioral synthesis tools in compiler-directed design space exploration
This paper considers the role of performance and area estimates from behavioral synthesis in design space exploration. We have developed a compilation system that automatically ma...
Byoungro So, Pedro C. Diniz, Mary W. Hall