Many parallel scientific applications need high-performance I/O. Unfortunately, end-to-end parallel-I/O performance has not been able to keep up with substantial improvements in p...
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
The inability to perform optimizations based on application-specific information presents a hurdle to the deployment of pervasive LAN file systems across WAN environments. This pa...
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
Stencil computation (SC) is of critical importance for broad scientific and engineering applications. However, it is a challenge to optimize complex, highorder SC on emerging clus...
Liu Peng, Richard Seymour, Ken-ichi Nomura, Rajiv ...