This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
A web site should be easy for visitors to browse. However, the reality is sometimes quite different. Situations such as several unrelated topics on a single web page may lead to confus...
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
This study stems from a suggestion in the literature (Lohse and Spiller, 1999) that for some products an increase in the amount of information presented on a web site has a negati...