This paper describes query processing in the DBO database system. Like other database systems designed for ad-hoc analytic processing, DBO computes the exact answer to queries over a large relational database in a scalable fashion. Unlike any other system designed for analytic processing, DBO maintains a running estimate of the final answer to an aggregate query throughout execution, along with statistically meaningful bounds on the estimate's accuracy. As DBO processes more data, the estimate becomes more accurate, and it is exact when the query completes. Users can therefore stop execution as soon as they are satisfied with the accuracy, which encourages exploratory data analysis.

Categories and Subject Descriptors: G.3 [Probability and Statistics]: Probabilistic Algorithms; H.2.4 [Database Management - Systems]: Query Processing
General Terms: Algorithms, Performance
Keywords: Sampling, Online Aggregation, Randomized Algorithms...
Christopher M. Jermaine, Subramanian Arumugam, Abh
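
The idea in the abstract of a running estimate with accuracy bounds that tighten until the answer is exact can be illustrated with a minimal sketch. This is not DBO's algorithm; it is a textbook single-table estimator for SUM under stated assumptions: rows are scanned in uniformly random order, the bound is a normal-approximation interval with a finite population correction, and the names `online_sum`, `z`, and `seed` are hypothetical.

```python
import math
import random


def online_sum(values, z=1.96, seed=0):
    """Yield (rows_seen, estimate, half_width) for SUM(values) during a random-order scan.

    Assumption: the scan order is a uniform random permutation, so every
    prefix of the scan is a simple random sample without replacement.
    """
    rows = list(values)
    random.Random(seed).shuffle(rows)
    total_rows = len(rows)

    mean = 0.0  # running mean of the rows seen so far
    m2 = 0.0    # running sum of squared deviations (Welford's method)

    for n, x in enumerate(rows, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)

        # Scale the sample mean up to the whole table.
        estimate = total_rows * mean

        if n == total_rows:
            # Every row has been seen, so the answer is exact.
            half_width = 0.0
        elif n > 1:
            sample_var = m2 / (n - 1)
            # Finite population correction: shrinks to 0 as the scan completes.
            fpc = 1.0 - n / total_rows
            half_width = z * total_rows * math.sqrt(sample_var / n * fpc)
        else:
            # One row gives no variance information yet.
            half_width = math.inf

        yield n, estimate, half_width


if __name__ == "__main__":
    data = [float(i % 100) for i in range(10_000)]
    for n, est, hw in online_sum(data):
        if n in (100, 1_000, 10_000):
            print(f"{n:>6} rows: SUM ~ {est:,.0f} +/- {hw:,.0f}")
```

A caller can stop consuming the generator as soon as `half_width` is small enough for the task at hand, which mirrors the early-stopping behavior the abstract describes. The default `z=1.96` corresponds to an approximate 95% interval under the normal approximation.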