Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far. In this paper, we present a set of language extensions and a prototype compiler for supporting high-level object-oriented programming of data-intensive reduction operations over multidimensional data. We have chosen a dialect of Java with data-parallel extensions for specifying collections of objects, a parallel for loop, and reduction variables as our high-level source language. Our compiler analyzes parallel loops and optimizes the processing of datasets through the use of an existing run-time system called the Active Data Repository (ADR). We show how loop fission followed by interprocedural static program slicing can be used by the compiler to extract the information required by the run-time system. We present the design of a compi...
Renato Ferreira, Gagan Agrawal, Joel H. Saltz