Text classification poses some specific challenges. One such challenge is its high dimensionality where each document (data point) contains only a small subset of them. In this pap...
Mining retrospective events from text streams has been an important research topic. Classic text representation model (i.e., vector space model) cannot model temporal aspects of d...
Xin Zhao, Rishan Chen, Kai Fan, Hongfei Yan, Xiaom...
Many applications require analyzing vast amounts of textual data, but the size and inherent noise of such data can make processing very challenging. One approach to these issues i...
David G. Underhill, Luke McDowell, David J. Marche...
In this paper, we study the use of support vector machine in text categorization. Unlike other machine learning techniques, it allows easy incorporation of new documents into an e...
A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. The present work, in contrast to most until now, successfully faces the pro...