Tools for mining information from data can create added value for the Internet. As the majority of electronic documents available over the network are in unstructured textual form...
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
We present a system to automatically generate RSS feeds from HTML documents that consist of time-series items with date expressions, e.g., archives of weblogs, BBSs, chats, mailin...
Recently, the publishing of structured, semantic information as linked data has gained quite some momentum. For ordinary users on the Internet, however, this information is not yet...