As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data that is embedded in natural-language text. In particular...
The World-Wide Web consists not only of a huge number of unstructured texts, but also a vast amount of valuable structured data. Web tables [2] are a typical type of structured in...
Cindy Xide Lin, Bo Zhao, Tim Weninger, Jiawei Han,...
Jedi (Java based Extraction and Dissemination of Information) is a lightweight tool for the creation of wrappers and mediators to extract, combine, and reconcile information from ...
Gerald Huck, Peter Fankhauser, Karl Aberer, Erich ...
In this paper we propose an attribute retrieval approach which extracts and ranks attributes from Web tables. We use simple heuristics to filter out improbable attributes and we ...