Virtually all histograms store for each bucket the number of distinct values it contains and their average frequency. In this paper, we question this paradigm. We start out by inv...
The huge amount of data available from Internet information sources has focused much attention on the sharing of distributed information through Peer Data Management Systems (PDMS...
Probabilistic data have recently become popular in applications such as scientific and geospatial databases. For images and other spatial datasets, probabilistic values can capture...
Accurate and efficient integration of geospatial data is an important problem with applications in areas such as emergency response and urban planning. Some of the key challenges ...
Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are g...