Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical ...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
This paper describes the Online Arabic handwriting recognition competition held at ICDAR 2009. This first competition uses the ADAB database of online handwritten Arabic words. ...
Double-sided manuscripts are often degraded by bleedthrough interference. Such degradation must be corrected to facilitate human perception and machine recognition. Most approache...
We are interested in retrieving information from conversational speech corpora, such as call-center data. This data comprises spontaneous speech conversations with low recording q...