With the page explosion of WWW, how to cover more useful information with limited storage and computation resources becomes more and more important in web IR research. Using web p...
Hypothesis generation is a crucial initial step for making scientific discoveries. This paper addresses the problem of automatically discovering interesting hypotheses from the we...
We present a gateway between the WWW and the Gnutella peer-topeer network that permits searchers on one side to be able to search and retrieve files on the other side of the gatew...
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
This article compares search effectiveness when using query-based Internet search (via the Google search engine), directory-based search (via Yahoo) and phrasebased query reformul...