Maintaining statistics on multidimensional data distributions is crucial for predicting the run-time and result size of queries and data analysis tasks with acceptable accuracy. To...
Graphs are canonical examples of high-dimensional non-Euclidean data sets, and are emerging as a common data structure in many fields. While there are many algorithms to analyze ...
Benjamin A. Miller, Nadya T. Bliss, Patrick J. Wol...
Software systems are designed and engineered to process data. However, software is data too. The size and variety of today's software artifacts and the multitude of stakehold...
Testing of database applications is of great importance. A significant issue in database application testing consists in the availability of representative data. In this paper we...
Abstract. This paper describes a full data-driven system for question answering. The system uses pattern matching and statistical techniques to identify the relevant passages as we...