In this paper, we present the results of our work that seek to negotiate the gap between low-level features and high-level concepts in the domain of web document retrieval. This wo...
A vast amount of documents in the Web have duplicates, which is a challenge for developing efficient methods that would compute clusters of similar documents. In this paper we use ...
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Recommender systems evaluate and filter the great amount of information available on the Web to assist people in their search processes. A fuzzy linguistic evaluation method of We...
Enrique Herrera-Viedma, Eduardo Peis, Jesus Canelo...
: Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based...