Web search engines crawl the web to fetch the data that they index. In this paper we re-examine that need, and evaluate the network costs associated with data acquisition, and alt...
Nick Craswell, Francis Crimmins, David Hawking, Al...
I show that the World Wide Web is a small world, in the sense that sites are highly clustered yet the path length between them is small. I also demonstrate the advantages of a sear...
In this paper we present our technique for finding semantically similar clusters within web documents obtained from a set of queries retrieved from the Google search engine. This ...
We introduce a new, powerful class of text proximity queries: find an instance of a given "answer type" (person, place, distance) near "selector" tokens matchi...
In this paper, we propose to mine query hierarchies from clickthrough data, which is within the larger area of automatic acquisition of knowledge from the Web. When a user submits...
Dou Shen, Min Qin, Weizhu Chen, Qiang Yang, Zheng ...