Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...
A web site should be easy to browse by visitors. However, sometimes the reality is quite different. Situations like several unrelated topics in a single web page may lead to confus...
In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant overlapping Web sites prov...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...