We present a tool that facilitates the efficient extension of morphological lexica. The tool exploits information from a morphological lexicon, a morphological grammar an...
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
The availability of versioned text collections such as the Internet Archive opens up opportunities for time-aware exploration of their contents. In this paper, we propose time-tr...
Klaus Berberich, Srikanta J. Bedathur, Gerhard Wei...
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
We introduce a multi-stage ensemble framework, Error-Driven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a ...