The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
We introduce a new method for automatically constructing concept hierarchies in which the concept nodes follow a generalization/specialization relation. Starting from a set of conc...
Typically, searching for information in a document collection amounts to refining a query and then scanning a large number of documents to determine their relevance. Active Summar...
We present initial results from an international and multi-disciplinary research collaboration that aims at the construction of a reference corpus of web genres. The primary appli...
Georg Rehm, Marina Santini, Alexander Mehler, Pave...
The authors of topic map-based learning resources face major difficulties in constructing the underlying ontologies. In this paper we propose two approaches to address this problem...