Freshness has been increasingly realized by commercial search engines as an important criteria for measuring the quality of search results. However, most information retrieval methods focus on the relevance of page content to given queries without considering the recency issue. In this work, we mine page freshness from web user maintenance activities and incorporate this feature into web search. We first quantify how fresh the web is over time from two distinct perspectives—the page itself and its in-linked pages—and then exploit a temporal correlation between two types of freshness measures to quantify the confidence of page freshness. Results demonstrate page freshness can be better quantified when combining with temporal freshness correlation. Experiments on a realworld archival web corpus show that incorporating the combined page freshness into the searching process can improve ranking performance significantly on both relevance and freshness. Categories and Subject Descri...
Na Dai, Brian D. Davison