As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
This paper reports the partial results of an exploratory study that aims to develop a methodology for a Web feed-based content aggregation service for electronic journals in In...
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Nowadays, images have become widely available on the World Wide Web (WWW). It’s essential to develop effective ways for managing and retrieving such abundant images. Advantageou...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs, rather than document collections, as self-contained sources o...