Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Geographic information retrieval encompasses important tasks including finding the location of a user, and locations relevant to their search queries. Web-based search engines rec...
This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web, in a domain-independent and universal manner. B...
It has been observed that anchor text in web documents is very useful in improving the quality of web text search for some classes of queries. By examining properties of anchor te...
Distributed heterogeneous search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers d...