: Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsi...
It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search....
Donald Metzler, Jasmine Novak, Hang Cui, Srihari R...
Developing better systems for document image analysis requires understanding errors, their sources, and their effects. The interactions between various processing steps are comple...
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...
—Time is a universal and essential aspect of data in any investigative analysis. It helps analysts establish causality, build storylines from evidence, and reject infeasible hypo...