We present a framework to analyze color documents of complex layout. No assumption is made about the layout. Our framework combines in a content-driven bottom-up approac...
With the new interest in historical documents, the insight grew that electronic access to these texts causes many specific problems. In the first part of the paper, we survey the presen...
Andreas Hauser, Markus Heller, Elisabeth Leiss, Kl...
For a very long time, it has been considered that the only way of automatically extracting similar groups of words from a text collection for which no semantic information exists ...
We propose a new integrated approach based on Markov logic networks (MLNs), an effective combination of probabilistic graphical models and first-order logic for statistical relatio...
Jedi (Java based Extraction and Dissemination of Information) is a lightweight tool for the creation of wrappers and mediators to extract, combine, and reconcile information from ...
Gerald Huck, Peter Fankhauser, Karl Aberer, Erich ...