State-of-the-art run-time systems are a poor match to diverse, dynamic distributed applications because they are designed to provide support to a wide variety of applications, with...
Lawrence Rauchwerger, Nancy M. Amato, Josep Torrel...
This paper presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce c...
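As a rough illustration of the inlining idea this abstract mentions, the sketch below applies one unrolling step by hand to a divide-and-conquer sum. It is only an assumed example, not the paper's algorithm or implementation: the function names `sum_range` and `sum_range_unrolled` are hypothetical, and the transformation shown is ordinary manual inlining of the two recursive calls.

```python
def sum_range(a, lo, hi):
    """Plain divide-and-conquer sum of a[lo:hi] (hypothetical example)."""
    if hi - lo == 0:
        return 0
    if hi - lo == 1:
        return a[lo]
    mid = (lo + hi) // 2
    return sum_range(a, lo, mid) + sum_range(a, mid, hi)


def sum_range_unrolled(a, lo, hi):
    """Same sum after inlining each recursive call once by hand.

    Each invocation now covers two levels of the original recursion
    tree, so fewer calls are made for the same input.
    """
    if hi - lo == 0:
        return 0
    if hi - lo == 1:
        return a[lo]
    mid = (lo + hi) // 2

    # Inlined body of the left call sum_range(a, lo, mid).
    # The left half has at least one element here, so no empty case.
    if mid - lo == 1:
        left = a[lo]
    else:
        left_mid = (lo + mid) // 2
        left = (sum_range_unrolled(a, lo, left_mid)
                + sum_range_unrolled(a, left_mid, mid))

    # Inlined body of the right call sum_range(a, mid, hi).
    if hi - mid == 1:
        right = a[mid]
    else:
        right_mid = (mid + hi) // 2
        right = (sum_range_unrolled(a, mid, right_mid)
                 + sum_range_unrolled(a, right_mid, hi))

    return left + right
```

Both functions return the same result for any slice; the unrolled version simply trades a larger function body for fewer calls.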
The internal mechanism used for a dependence test constrains its accuracy and determines its speed. The internal mechanism used for our Access Region Test (ART) is fundamentally d...
Abstract. This paper describes how the use of software libraries, which is prevalent in high performance computing, can benefit from compiler optimizations in much the same way tha...
Abstract. Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We are developing a compiler which processes da...
Renato Ferreira, Gagan Agrawal, Ruoming Jin, Joel ...
This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on an SMP machine. OSCAR multigrain parallelizing c...
Embedded systems consisting of the application program ROM, RAM, the embedded processor core, and any custom hardware on a single wafer are becoming increasingly common in applicat...