In this article, we introduce a new problem: the construction of multi-structured documents. We first offer an overview of existing solutions to the representation of such docum...
Large amounts of low- to medium-quality English texts are now being produced by machine translation (MT) systems, optical character readers (OCR), and non-native speakers of Engli...
We present in this paper a combination of Machine Learning based Information Retrieval (IR) techniques and stochastic language modelling in a hierarchical system that extracts sur...
: Building the Semantic Web requires the use of powerful tools to create, manage and extend domain ontologies represented with Semantic Web languages. Though many tools have been a...
Elena Paslaru Bontas, Sebastian Tietz, Thomas Schr...
We address the issue of detecting automatically occurrences of high level patterns in audiovisual documents. These patterns correspond to recurring sequences of shots, which are co...