In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...
In this paper we explore private computation built on vector addition and its applications in privacypreserving data mining. Vector addition is a surprisingly general tool for imp...
CiteSeer is currently a very large source of meta-data information on the World Wide Web (WWW). This meta-data is the key material for the Semantic Web. Still, CiteSeer is not yet...
Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Prade...
Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-dri...
Kuang Chen, Joseph M. Hellerstein, Tapan S. Parikh