Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
The traditional crawlers used by search engines to build their collection of Web pages frequently gather unmodified pages that already exist in their collection. This creates unne...
DNS is one of the most actively used distributed databases on earth, accessed by millions of people every day to transparently convert host names into IP addresses and vice versa....
We track a large set of "rapidly" changing web pages and examine the assumption that the arrival of content changes follows a Poisson process on a microscale. We demonst...
We present an approach for answering Entity Retrieval queries using click-through information in query log data from a commercial Web search engine. We compare results using click...
Bodo Billerbeck, Gianluca Demartini, Claudiu S. Fi...