In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
Distributions of the senses of words are often highly skewed. This fact is exploited by word sense disambiguation (WSD) systems which back off to the predominant (most frequent) s...
Leximancer is a software system for performing conceptual analysis of text data in a largely language independent manner. The system is modelled on Content Analysis and provides u...
—Reading text from photographs is a challenging problem that has received a signicant amount of attention. Two key components of most systems are (i) text detection from images a...
Adam Coates, Blake Carpenter, Carl Case, Sanjeev S...
Due to the globalization on the Web, many companies and institutions need to efficiently organize and search repositories containing multilingual documents. The management of the...