We present a new method for blind document bleed through removal based on separate Markov Random Field (MRF) regularization for the recto and for the verso side, where separate pri...
Two dimensional plots (2-D) in digital documents on the web are an important source of information that is largely under-utilized. In this paper, we outline how data and text can ...
Saurabh Kataria, William Browuer, Prasenjit Mitra,...
There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...
Background: Traditional genome alignment methods consider sequence alignment as a variation of the string edit distance problem, and perform alignment by matching characters of th...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...