This paper addresses the challenging problem of learning from multiple annotators whose labeling accuracy (reliability) differs and varies over time. We propose a framework based ...
With the advance of the Semantic Web, varied RDF data are increasingly generated, published, queried, and reused via the Web. For example, DBpedia, a community effort to extr...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
Bayesian networks are graphical representations of probability distributions. In virtually all of the work on learning these networks, the assumption is that we are presented with...
We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...