A Grid Information Service (GIS) stores information about the resources of a distributed computing environment and answers questions about them. We are developing RGIS, a GIS based on the relational data model. RGIS users can write SQL queries that search for complex compositions of resources that meet collective requirements. Executing these queries can be very expensive, however. In response, we introduce the nondeterministic query, an extension to the SELECT statement that allows the user (and RGIS) to trade off the query’s running time against the number of results returned. The results are a random sample of the deterministic results, which we argue is sufficient and appropriate. Herein we describe RGIS, the nondeterministic query extension, and its implementation. Our evaluation shows that a meaningful tradeoff between query time and results returned is achievable, and that the tradeoff can be used to keep query time largely independent of query complexity.
Peter A. Dinda, Dong Lu
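The tradeoff described in the abstract can be illustrated with a small, self-contained sketch. It is not RGIS code and does not use the RGIS query syntax. The host table, the pairing requirement, and the names `HOSTS`, `deterministic_pairs`, `nondeterministic_pairs`, and `keep_prob` are all assumptions made for illustration. The sketch uses one possible sampling strategy: keep each input row with some probability before searching, so that less work is done and fewer results come back.

```python
import itertools
import random

# Hypothetical resource table: (host name, memory in MB). Not the RGIS schema.
HOSTS = [(f"host{i}", 256 * (1 + i % 8)) for i in range(200)]


def deterministic_pairs(hosts, min_total_mb):
    """Return every pair of distinct hosts whose combined memory meets the requirement."""
    return [
        (a[0], b[0])
        for a, b in itertools.combinations(hosts, 2)
        if a[1] + b[1] >= min_total_mb
    ]


def nondeterministic_pairs(hosts, min_total_mb, keep_prob, rng):
    """Run the same search over a random subset of the hosts.

    Each host is kept independently with probability keep_prob (an assumed
    tuning knob), so both the work done and the expected number of results
    shrink as keep_prob decreases.
    """
    sampled = [h for h in hosts if rng.random() < keep_prob]
    return deterministic_pairs(sampled, min_total_mb)


if __name__ == "__main__":
    full = deterministic_pairs(HOSTS, 3000)
    some = nondeterministic_pairs(HOSTS, 3000, 0.25, random.Random(0))
    print(len(full), "deterministic results;", len(some), "sampled results")
```

In this sketch, every pair returned by the sampled search is also a result of the full search, and lowering `keep_prob` reduces the number of candidate pairs examined. A caller who needs only a few acceptable compositions can therefore stop paying for the complete enumeration.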