The availability and the accuracy of the data dictate the success of a data mining application. Increasingly, there is a need to resort to on-line data collection to address the p...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
Non-negative matrix factorization (NMF) is a recently developed method for finding parts-based representation of non-negative data such as face images. Although it has successfully...
Estimating the number of distinct values is a wellstudied problem, due to its frequent occurrence in queries and its importance in selecting good query plans. Previous work has sh...
In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries with a few examples. The other involves learning models for seman...