In this study, we present experiences of parallelizing XPath queries using the Xalan XPath engine on shared-address space multi-core systems. For our evaluation, we consider a sce...
With the widespread adoption of XML as the message format to disseminate content over distributed systems including Web Services and Publish-Subscribe systems, different methods ha...
Decoding noisy document images is commonly needed in applications such as enterprise content management. Available OCR solutions are still not satisfactory especially on noisy ima...
This paper describes ongoing research into the use of a domain-retargetable reverse engineering environment to aid the structural understanding of large information spaces. In par...
Structured documents contain elements defined by the author(s) and annotations assigned by other people or processes. Structured documents pose challenges for probabilistic retrie...