Traditional bag-of-words model and recent wordsequence kernel are two well-known techniques in the field of text categorization. Bag-of-words representation neglects the word orde...
Lei Zhang, Debbie Zhang, Simeon J. Simoff, John K....
Traditional supervised classification algorithms require a large number of labelled examples to perform accurately. Semi-supervised classification algorithms attempt to overcome t...
This work addresses the problem of classifying the genre of text, which is useful for a variety of language processing problems. We propose statistics of POS histograms as classiï...
Sergey Feldman, Marius A. Marin, Mari Ostendorf, M...
Various strategies are proposed to identify and classify three types of proper nouns in Chinese texts. Clues from character, sentence and paragraph levels are employed to resolve ...
We consider a parsed text corpus as an instance of a labelled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between the...