Abstract. For document-centric work, meta-information in the form of annotations has proven useful for enhancing search and other retrieval tasks. Since creating annotations manually is a...
Malte Kiesel, Sven Schwarz, Ludger van Elst, Georg...
Hypertext categorization is the automatic classification of web documents into predefined classes. It poses new challenges for automatic categorization because of the rich inform...
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for ...
Feature selection is a critical component of many pattern recognition applications. There are two distinct mechanisms for feature selection, namely the wrapper method and the filt...