This paper presents a lightweight method for unsupervised extraction of paraphrases from arbitrary textual Web documents. The method differs from previous approaches to paraphrase...
In this paper, we study query evaluation on Active XML documents (AXML for short), a new generation of XML documents that has recently gained popularity. AXML documents are XML do...
—Most Web and legacy paper-based documents are available in human comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an ...
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
We explore connections between digital libraries and interactive document image analysis. Digital libraries can provide useful data and metadata for research in automated document...