In this paper, we propose a novel method to infer the web user’s Information Content (IC), which is the information that the user must examine to complete her task. In particular, our method learns to predict which words (called IC-words) will appear in these essential web pages (IC-pages). We first collected training data through an empirical study in which users explicitly identified which pages were IC-pages. We then examined the page-content information in these clickstreams to determine the “browsing properties” of each individual word w: for example, how often w appeared in the title of a page in each session, in the anchor to a page that was followed, or in a link that was skipped. The training data also labels each word as an IC-word or not. We used this data to train a classifier that identifies the browsing properties associated with IC-words. Note that this classifier can predict which words are IC-words given any page sequence, even if those pages are in web sites that have not been v...
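
To make the idea of per-word browsing properties concrete, the following is a minimal sketch rather than the implementation used in the study. It assumes a session is a list of pages, each with a title, the anchor texts of followed links, and the anchor texts of skipped links. The names `Page`, `FEATURES`, `browsing_properties`, and `to_training_rows` are hypothetical, and the three counts are only the examples mentioned above, not the full feature set.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field

# Hypothetical feature names; the real feature set is not specified here.
FEATURES = ["in_title", "in_followed_anchor", "in_skipped_anchor"]


@dataclass
class Page:
    """One page of a browsing session (assumed representation)."""
    title: str
    followed_anchors: list = field(default_factory=list)
    skipped_anchors: list = field(default_factory=list)


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def browsing_properties(session):
    """Map each word to a Counter of how often it occurred in each role."""
    props = defaultdict(Counter)
    for page in session:
        # Count a word at most once per title or anchor text.
        for w in set(tokenize(page.title)):
            props[w]["in_title"] += 1
        for anchor in page.followed_anchors:
            for w in set(tokenize(anchor)):
                props[w]["in_followed_anchor"] += 1
        for anchor in page.skipped_anchors:
            for w in set(tokenize(anchor)):
                props[w]["in_skipped_anchor"] += 1
    return props


def to_training_rows(session, ic_words):
    """Build (word, feature_vector, is_ic_word) rows for one labeled session."""
    props = browsing_properties(session)
    return [
        (w, [props[w][name] for name in FEATURES], w in ic_words)
        for w in sorted(props)
    ]
```

Rows of this shape could be passed to any standard classifier. Because the features describe how a word was encountered rather than which word or site it is, a model trained on them is not tied to particular pages.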