Abstract. We present a model for complex documents possibly consisting of a hierarchically structured set of images or texts. Documents are represented both at the form level (as s...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
To summarize is to reduce in complexity, and hence in length, while retaining some of the essential qualities of the original. This paper focusses on document extracts, a particular...
Information Retrieval (IR) systems try to identify documents relevant to user queries, which are representations of user information needs. Interaction, context, and document struc...
A new system is presented for general symbol segmentation, which is applicable for segmentation of any connected string of symbols, including characters and line diagrams. Using a...
Sumo is a formalism for universal segmentation of text. Its purpose is to provide a framework for the creation of segmentation applications. It is called universal as the formalis...