In order to reduce the rejection rate of our automatic reading system, we propose to pre-classify the business documents by introducing an Automatic Recognition of Documents stage...
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
Abstract. This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
K-Means is a clustering algorithm that is widely applied in many fields, including pattern classification and multimedia analysis. Due to real-time requirements and computational-c...
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language telev...