Many applications today need to manage large data sets with uncertainties. In this paper we describe the foundations of managing data where the uncertainties are quantified as probabilities. We review the basic definitions of the probabilistic data model, present some fundamental theoretical results for query evaluation on probabilistic databases, and discuss several challenges, open problems, and research directions.

Categories and Subject Descriptors: F.4.1 [Mathematical Logic]; G.3 [Probability and statistics]; H.2.5 [Heterogeneous databases]

General Terms: Algorithms, Management, Theory

Keywords: probabilistic databases, query processing

Nilesh N. Dalvi, Dan Suciu