Abstract. We present a model for complex documents possibly consisting of a hierarchically structured set of images or texts. Documents are represented both at the form level (as s...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
In this paper, we propose a method of text retrieval from document images using a similarity measure based on an N-Gram algorithm. We directly extract image features instead of us...
Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper is c...
: We are presenting a set of multilingual text analysis tools that can help analysts in any field to explore large document collections quickly in order to determine whether the do...
Camelia Ignat, Bruno Pouliquen, Ralf Steinberger, ...
This paper reports on the participation of ITC-irst in the Cross Language Evaluation Forum 2003; in particular, in the monolingual, bilingual, small multilingual, and spoken docum...