Many text documents naturally have two kinds of labels. For example, we may label web pages from universities according to their categories, such as "student" or "fa...
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
In this paper we present a new document representation model based on implicit user feedback obtained from search engine queries. The main objective of this model is to achieve be...
There are many clustering tasks which are closely related in the real world, e.g. clustering the web pages of different universities. However, existing clustering approaches neglec...
Prototyping and evaluation combine to explore ways that an effective, integrative organization of project-related information might emerge as a by-product of a person's effor...
William Jones, Predrag V. Klasnja, Andrea Civan, M...