We describe the Java runtime parallelizing machine (Jrpm), a complete system for parallelizing sequential programs automatically. Jrpm is based on a chip multiprocessor (CMP) with...
Compile-time code transformations which expose instruction-level parallelism (ILP) typically take into account the constraints imposed by all execution scenarios in the program. H...
Richard E. Hank, Scott A. Mahlke, Roger A. Bringma...
Abstract. Dynamic program optimization is the only recourse for optimizing compilers when machine and program parameters necessary for applying an optimization technique are unknow...
—The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...
This paper discusses some of the issues involved in implementing a shared-address space programming model on large-scale, distributed-memory multiprocessors. While such a programm...
David A. Kranz, Kirk L. Johnson, Anant Agarwal, Jo...