The deep web contains an order of magnitude more information than the surface web, but that information is hidden behind the web forms of a large number of web sites. Metasearch e...
Jeffrey P. Bigham, Ryan S. Kaminsky, Jeffrey Nicho...
Search computing is a novel discipline whose goal is to answer complex, multi-domain queries. Such queries typically require combining in their results domain knowledge extracted ...
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
Anchor text has been considered as a useful resource to complement the representation of target pages and is broadly used in web search. However, previous research only uses anchor...
The increasingly prominent new subset of Web pages, called `blogs' differs from traditional Web pages both in characteristics and potential to applications. We explore three ...