Sciweavers

IJCAI
2003

Predicting Web Information Content

In this paper, we propose a novel method to infer the web user’s Information Content (IC), which is the information that the user must examine to complete her task. In particular, our method learns to predict which words (called IC-words) will be in these essential web pages (IC-pages). We first collected relevant training data using an empirical study, where users explicitly identified which pages were IC-pages. We then examined page-content information from these clickstreams to determine “browsing properties” of each individual word w, i.e., how often w appeared in the title of a page in each session, in the anchor of a link that was followed, in a link that was skipped, etc. This training data also labeled each word as an IC-word or not. We used this to train a classifier to identify the browsing properties associated with IC-words. Note that this classifier can predict which words are IC given any page sequence, even if those pages are in web-sites that have not been v...
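The "browsing properties" idea can be illustrated with a minimal sketch. This is not the authors' implementation: the `PageVisit` structure, the three property names, and the toy session are all assumptions made for illustration. It only counts, for each word, how often it appears in a page title, in a followed anchor, or in a skipped link within one session.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PageVisit:
    """One visited page in a session (hypothetical structure, not from the paper)."""
    title: List[str]
    followed_anchor: List[str] = field(default_factory=list)      # anchor text the user clicked
    skipped_links: List[List[str]] = field(default_factory=list)  # anchor texts the user ignored


def browsing_properties(session: List[PageVisit]) -> Dict[str, Dict[str, int]]:
    """Count, per word, how many titles, followed anchors, and skipped links contain it."""
    counts = {
        "in_title": Counter(),
        "in_followed_anchor": Counter(),
        "in_skipped_link": Counter(),
    }
    for page in session:
        # set(...) so a word counts at most once per title or anchor
        counts["in_title"].update(set(page.title))
        counts["in_followed_anchor"].update(set(page.followed_anchor))
        for link in page.skipped_links:
            counts["in_skipped_link"].update(set(link))
    vocab = set().union(*counts.values())
    return {w: {name: c[w] for name, c in counts.items()} for w in sorted(vocab)}


# Toy session (made-up data)
session = [
    PageVisit(
        title=["digital", "camera", "reviews"],
        followed_anchor=["camera", "prices"],
        skipped_links=[["printer", "deals"], ["camera", "bags"]],
    ),
    PageVisit(title=["camera", "prices"], skipped_links=[["contact", "us"]]),
]
props = browsing_properties(session)
print(props["camera"])
```

Each per-word dictionary could then serve as a feature vector, paired with an IC-word label, for whatever classifier is trained; the abstract as shown here does not say which classifier was used.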
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Where IJCAI
Authors Tingshao Zhu, Russell Greiner, Gerald Häubl, Robert Price