In order to let software programs gain full benefit from semi-structured web sources, wrapper programs must be built to provide a “machine-readable” view over them. A signific...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Watson is a gateway to the Semantic Web: it collects, analyzes and gives access to ontologies and semantic data available online. Its objective is to support the development of ne...
Filtering the immense amount of data available electronically over the World Wide Web is an important task of search engines in data mining applications. Users when performing se...
A multilingual Internet-based employment advertisement system is described. Job ads are submitted as e-mail texts, analysed by an example-based pattern matcher and stored in langua...
Harold L. Somers, Bill Black, Joakim Nivre, Torbj...