This paper presents an extensive study about the evolution of textual content on the Web, which shows how some new pages are created from scratch while others are created using al...
This paper proposes a novel approach to integrate heterogeneous data in P2P networks. The approach includes a tool for building P2P semantic link networks, mechanisms for peer sch...
Abstract. Weighting models use lexical statistics, such as term frequencies, to derive term weights, which are used to estimate the relevance of a document to a query. Apart from t...
Organising large-scale Web information retrieval systems into hierarchies of topic-specific search resources can improve both the quality of results and the efficient use of com...
The Web is a valuable source of language speci c resources but the process of collecting, organizing and utilizing these resources is di cult. We describe CorpusBuilder, an approa...