In traditional text classification, a classifier is built using labeled training documents of every class. This paper studies a different problem. Given a set P of documents of a ...
Mathematical texts can be computerized in many ways that capture differing amounts of the mathematical meaning. At one end, there is document imaging, which captures the arrangeme...
Most copied documents consist mainly of text, so copy quality depends chiefly on how well the text is reproduced. A new technique to enhance dark text ...
For privacy reasons, sensitive content may be revised before it is released. The revision often consists of redaction, that is, the “blacking out” of sensitive words and phras...
Reading frequently involves not just looking at words on a page, but also underlining, highlighting and commenting, either on the text or in a separate notebook. This combination ...
Bill N. Schilit, Gene Golovchinsky, Morgan N. Pric...