We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
Computer-based annotation is increasing in popularity as a mechanism for revising documents and sharing comments over the Internet. One reason behind this surge is that viewpoints...
David R. Karger, Boris Katz, Jimmy J. Lin, Dennis ...
This paper investigates query translation in cross-lingual information retrieval, especially the challenges caused by ambiguity and polysemy. We base our ideas on feature vectors a...
Document image matching is the key technique for document registration and retrieval. In this paper, a new matching algorithm based on a document component block list and component ...
Document analysis and text mining techniques are used to preprocess documents in information retrieval systems, to extract concepts in ontology construction processes, an...