For intelligent interfaces that attempt to learn a user’s interests, the cost of obtaining labeled training instances is prohibitive: the user must label each training instance directly, and few users are willing to do so. We present an approach that circumvents the need for human-labeled pages. Instead, we learn “surrogate” tasks whose desired output is easily measured, such as the number of hyperlinks clicked on a page or the amount of scrolling performed. Our assumption is that these outputs correlate strongly with the user’s interests. In other words, by unobtrusively “observing” the user’s behavior, we can learn functions of value. For example, an intelligent browser could silently observe the user’s browsing behavior during the day, use these observations as training examples to learn such functions, and then gather, in the middle of the night, pages that are likely to interest the user. Previous work has focused on learning a user profile by passi...
Jeremy Goecks, Jude W. Shavlik
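To make the surrogate-task idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `PageVisit` record that stores a page's text together with two passively observed outputs (hyperlinks clicked and scroll events), and it swaps in a deliberately simple learner: the predicted output for a page is the mean, over its known words, of the average output observed on pages containing that word. All names, the featurizer, and the sample data are illustrative assumptions.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class PageVisit:
    """One silently observed page view (hypothetical record layout)."""
    text: str
    links_clicked: int   # surrogate output: hyperlinks clicked on the page
    scroll_events: int   # surrogate output: amount of scrolling performed


def words(text):
    """Lowercased set of whitespace-separated tokens (a crude featurizer)."""
    return set(text.lower().split())


def fit_word_averages(visits, target):
    """Average the chosen surrogate output over all pages containing each word."""
    totals = defaultdict(float)
    counts = Counter()
    for visit in visits:
        y = target(visit)
        for w in words(visit.text):
            totals[w] += y
            counts[w] += 1
    return {w: totals[w] / counts[w] for w in counts}


def predict(model, text, default=0.0):
    """Predict an unseen page's output as the mean of its known words' averages."""
    known = [model[w] for w in words(text) if w in model]
    return sum(known) / len(known) if known else default


# Illustrative observations gathered "during the day" (made-up data).
visits = [
    PageVisit("neural network training tips", links_clicked=4, scroll_events=9),
    PageVisit("celebrity gossip roundup", links_clicked=0, scroll_events=1),
    PageVisit("training a decision tree", links_clicked=3, scroll_events=6),
]

# Learn the "hyperlinks clicked" surrogate task; no human labels are involved.
model = fit_word_averages(visits, target=lambda v: v.links_clicked)

# Rank candidate pages "overnight" by predicted surrogate output.
candidates = ["gossip about celebrity chefs", "network training benchmarks"]
ranked = sorted(candidates, key=lambda t: predict(model, t), reverse=True)
```

Passing `target=lambda v: v.scroll_events` instead would learn the scrolling surrogate from the same observations. Any regression method could replace the word-average learner; the point of the sketch is only that the training targets come from observed behavior, not from explicit user ratings.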