Recovering semantic relations between different parts of web pages is of great importance for multi-platform web interface development, as they make it possible to re-distribute ...
In this paper, an approach for the implementation of a quality-based Web search engine is proposed. Quality retrieval is introduced and an overview of previous efforts to implement...
Association rules, one of the basic methods of web usage mining, indicate relationships among web pages that are commonly used together. Positive and confined negative association rules are ...
We present Opal, a light-weight framework for interactively locating missing web pages (HTTP status code 404). Opal is an example of “in vivo” preservation: harnessing the col...
Web information extraction is a fundamental issue for web information management and integration. A common approach is to use wrappers to extract data from web pages or documents...
Many web pages and resources are primarily relevant to certain geographic locations. For example, in many queries, web pages on restaurants, hotels, or movie theaters are only rele...
Search engine quality is impacted by two factors: the quality of the ranking/matching algorithm used and the freshness of the search engine’s index, which maintains a “snapsho...
Jie Xu, Qinglan Li, Huiming Qu, Alexandros Labrini...
This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize information-extraction techniques to identify entity candidat...
Julia Stoyanovich, Srikanta J. Bedathur, Klaus Ber...
In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...