A prototype system has been designed to automate the extraction of bibliographic data (e.g., article title, authors, affiliation and others) from online biomedical journals to p...
This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a realtime version of a hybrid connectionist/HMM large...
A new segmentation algorithm for multifont Farsi/Arabic texts based on conditional labeling of up and down contours was presented in [1]. A preprocessing technique was used to adju...
Mona Omidyeganeh, Reza Azmi, Kambiz Nayebi, Abbas ...
The thematic text segmentation task consists in identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. We propose in t...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
We present a semi-Markov model for recognizing scene text that integrates character and word segmentation with recognition. Using wavelet features, it requires only approximate lo...
Allen R. Hanson, Erik G. Learned-Miller, Jerod J. ...