Traditionally, text classifiers are built from labeled training examples. Labeling is usually done manually by human experts (or the users), which is a labor intensive and time co...
We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. ...
In this paper we propose PARTfs which adopts a supervised machine learning algorithm, namely partial decision trees, as a method for feature subset selection. In particular, it is...
This paper describes a new method for the classification of a HTML document into a hierarchy of categories. The hierarchy of categories is involved in all phases of automated docum...
The problem of determining the appropriate number of components is important in finite mixture modeling for pattern classification. This paper considers the application of an unsu...