Some discourse structures, such as enumerative structures, have typographical, punctuation, and layout characteristics that (1) make them easily identifiable and (2) convey hi...
The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the unlabeled data is only weakly related to the target classes, it becomes quest...
A technique is presented that uses visual relationships between word images in a document to improve the recognition of the text it contains. This technique takes advantage of the...
Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
In this paper, we present the main features of a text-mining-based search engine for the UK Educational Evidence Portal available at the UK National Centre for Text Mining (NaCTeM...
Sophia Ananiadou, John McNaught, James Thomas, Mar...