Sciweavers

155 search results - page 8 / 31
» On the Automatic Parallelization of the Perfect Benchmarks
Sort
View
HPCA
2001
IEEE
14 years 8 months ago
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
In this papel; we address the severe performance gap caused by high processor clock rates and slow DRAM accesses. We show that even with an aggressive, next-generation memory syst...
Wei-Fen Lin, Steven K. Reinhardt, Doug Burger
ICS
2005
Tsinghua U.
14 years 1 months ago
Towards automatic translation of OpenMP to MPI
We present compiler techniques for translating OpenMP shared-memory parallel applications into MPI messagepassing programs for execution on distributed memory systems. This transl...
Ayon Basumallik, Rudolf Eigenmann
DATE
2010
IEEE
153views Hardware» more  DATE 2010»
14 years 29 days ago
Recursion-driven parallel code generation for multi-core platforms
—We present Huckleberry, a tool for automatically generating parallel implementations for multi-core platforms from sequential recursive divide-and-conquer programs. The recursiv...
Rebecca L. Collins, Bharadwaj Vellore, Luca P. Car...
MIDDLEWARE
2010
Springer
13 years 6 months ago
Automatically Generating Symbolic Prefetches for Distributed Transactional Memories
Abstract. Developing efficient distributed applications while managing complexity can be challenging. Managing network latency is a key challenge for distributed applications. We ...
Alokika Dash, Brian Demsky
ICS
1993
Tsinghua U.
13 years 12 months ago
The EM-4 Under Implicit Parallelism
: The EM-4 is a supercomputer that offers very fast inter processor communication and support for multi threading. In this paper we demonstrate that the EM-4, Together with an auto...
Lubomir Bic, Mayez A. Al-Mouhamed