Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the conte...
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Saha...
When automatically extracting information from the world wide web, most established methods focus on spotting single HTMLdocuments. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we...
We introduce a new stacking-like approach for multi-value classification. We apply this classification scheme using Naive Bayes, Rocchio and kNN classifiers on the well-known Reute...
Extracting and processing information from web pages is an important task in many areas like constructing search engines, information retrieval, and data mining from the Web. Comm...
Milos Kovacevic, Michelangelo Diligenti, Marco Gor...