Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leverage high-quality information available in online databases. Alt...
In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
The enormity and rapid growth of the web-graph forces quantities such as its pagerank to be computed under missing information consisting of outlinks of pages that have not yet be...
In this paper, we describe the design and initial implementation of a geographic search engine prototype for Germany, based on a large crawl of the de domain. Geographic search en...
Alexander Markowetz, Yen-Yu Chen, Torsten Suel, Xi...