This paper is about the reproduction of ancient texts with vectorised fonts. While only recognition rates count for OCR, a reproduction process does not necessarily require the re...
A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depend...
In a data warehousing process, the data preparation phase is crucial. Mastering this phase allows substantial gains in terms of time and performance when performing a multidimensio...
Robustness, the ability to analyze any input regardless of its grammaticality, is a desirable property for any system dealing with unrestricted natural language text. Error-repair...
In this paper we present a set of experiments using large digitized collections of books to show that logical structures can be extracted with good quality when working at ...