There is a rapidly growing set of applications, referred to as data-driven applications, in which the analysis of large amounts of data drives the next steps taken by the scientist, e.g., running new simulations, taking additional measurements, or extending the analysis to larger data collections. Two critical steps in data analysis are extracting the data of interest from large and potentially distributed datasets and moving it from storage clusters to compute clusters for processing. We have developed a middleware framework, called GridDB-Lite, that is designed to support these two steps efficiently. In this paper, we describe the application of GridDB-Lite in large-scale oil reservoir simulation studies and experimentally evaluate several optimizations that can be employed in the GridDB-Lite runtime system.
Sivaramakrishnan Narayanan, Ümit V. Ça