As more data (especially scientific data) is digitized and put on the Web, the importance of tracking and sharing its provenance metadata grows. Besides capturing the annotation pr...
Li Ding, Jie Bao, James Michaelis, Jun Zhao, Debor...
Inductive learning of first-order theory based on examples has serious bottleneck in the enormous hypothesis search space needed, making existing learning approaches perform poorl...
Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are g...
The reverse k-nearest neighbor (RkNN) problem, i.e. finding all objects in a data set the k-nearest neighbors of which include a specified query object, is a generalization of the...