As the web expands exponentially, the need to put some order to its content becomes apparent. Hypertext categorization, that is the automatic classification of web documents into ...
Work on evaluating and improving the relevance of web search engines typically use human relevance judgments or clickthrough data. Both these methods look at the problem of learni...
Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek ...
We address the problem of publishing parliamentary proceedings in a digital sustainable manner. We give an extensive requirements analysis, and based on that propose a uniform XML...
Ambiguous person names are a problem in many forms of written text, including that which is found on the Web. In this paper we explore the use of unsupervised clustering techniques...
: As more information is made available online, users collect information in personal information spaces like bookmarks and emails. While most users feel that organizing these coll...