A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank no...
Farial Shahnaz, Michael W. Berry, V. Paul Pauca, R...
Automatic semantic classification of video databases is very useful for users searching and browsing but it is a very challenging research problem as well. Combination of visual an...
—This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is based on geometric configuration which uses the interline sp...
Simple word matching between the user query and document is common, as are mis-matches of meaning that occur as a consequence, and errors in recall. These defects in the "bag...
In recent years several models have been proposed for text categorization. Within this, one of the widely applied models is the vector space model (VSM), where independence betwee...