The word error rate of any optical character recognition (OCR) system is usually substantially higher than its component or character error rate. This is especially true of Indic langua...
Venkat Rasagna, Anand Kumar 0002, C. V. Jawahar, R...
The processing and management of XML data are popular research issues. However, operations based on the structure of XML data have not received strong attention. These operations ...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...
Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang 0002, Sarah M. Taylor, Jonathan L. Smit...
Nonnegative Matrix Factorization (NMF) has been proven to be effective in text mining. However, since NMF is a well-known unsupervised components analysis technique, the existing ...
In order to organize huge document collections, labeled hierarchical structures are used frequently. Users are most efficient in navigating such hierarchies if they refl...