— One of the most prominent data quality problems is the existence of duplicate records. Current data cleaning systems usually produce one clean instance (repair) of the input da...
George Beskales, Mohamed A. Soliman, Ihab F. Ilyas...
Decentralized and unstructured peer-to-peer (P2P) networks such as Gnutella are attractive for Internet-scale information retrieval and search systems because they require neither...
In this paper, we investigate the problem of improving the relevance of a Web search engine by adapting it to the dynamic needs of the user. We examine a representative case of su...
The mode of a multiset of labels, is a label that occurs at least as often as any other label. The input to the range mode problem is an array A of size n. A range query [i, j] mus...
Prediction is emerging as an essential ingredient for real-time monitoring, planning and decision support applications such as intrusion detection, e-commerce pricing and automate...