Recent experiments and analysis suggest that there are about 800 million publicly indexable Web pages. However, unlike books in a traditional library, Web pages continue to change after their authors publish them and search engines index them. This paper presents preliminary data on, and a statistical analysis of, the frequency and nature of Web page modifications. Using empirical models and a novel analytic metric of 'up-to-dateness', we estimate the rate at which Web search engines must re-index the Web to remain current.
Brian E. Brewington, George Cybenko