In this paper we present a method for detecting the text genre quickly and easily following an approach originally proposed in authorship attribution studies which uses as style m...
Efstathios Stamatatos, Nikos Fakotakis, George K. ...
In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The pro...
Diana McCarthy, Rob Koeling, Julie Weeds, John A. ...
This work addresses the problem of classifying the genre of text, which is useful for a variety of language processing problems. We propose statistics of POS histograms as classiï...
Sergey Feldman, Marius A. Marin, Mari Ostendorf, M...
A desired property of a measure of connective strength in bigrams is that the measure should be insensitive to corpus size. This paper investigates the stability of three differen...
Stop word detection is attempted in this work in the context of retrieval of document images in the compressed domain. Algorithms are presented to identify text lines and words an...