Abstract. Modern document collections often contain groups of documents with overlapping or shared content. However, most information retrieval systems process each document separa...
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Mi...
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
We performed a set of experiments to investigate the utility of morphological analysis for improving retrieval of documents written in languages with relatively large morph...
A unique type of Web service, called a Social Network Service (SNS), first appeared in 2003. Some researchers have suggested a method for extracting meaningful information from SNSs. Such ...
Content on the Internet is always changing. We explore the value of biasing search result snippets towards new webpage content. We present results from a user study comparing trad...
Krysta Marie Svore, Jaime Teevan, Susan T. Dumais,...