We propose and evaluate QuWi (Quality in Wikipedia), a framework for quality control in Wikipedia. We build upon earlier work by Mizzaro [11], who proposed a method for sub...
Alberto Cusinato, Vincenzo Della Mea, Francesco Di...
This paper considers the problem of identifying compound documents (cDocs) on the Web, that is, groups of web pages that in aggregate constitute semantically coherent information entities...
Contextual advertising on web pages has become very popular recently and it poses its own set of unique text mining challenges. Often advertisers wish to either target (or avoid) ...
Yi Zhang, Arun C. Surendran, John C. Platt, Mukund...
This report is part of the seminar Digital Information Curation held by Prof. Dr. Marc H. Scholl and Dr. André Seifert during the winter term 2005/06. Its intention is to summari...
Peter Buneman, Sanjeev Khanna, Keishi Tajima, Wang...
There has been much recent interest in on-line data mining. Existing mining algorithms designed for stored data are either not applicable or not effective on data streams, where r...