Inverted files are widely used to index documents in large-scale information retrieval systems. An inverted file consists of posting lists, which can be stored in either a documen...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Background: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold ...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for ...