This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a realtime version of a hybrid connectionist/HMM large...
A new segmentation algorithm for multifont Farsi/Arabic texts based on conditional labeling of up and down contours was presented in [1]. A preprocessing technique was used to adju...
Mona Omidyeganeh, Reza Azmi, Kambiz Nayebi, Abbas ...
Abstract: The thematic text segmentation task consists in identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. We propose in t...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
We present a semi-Markov model for recognizing scene text that integrates character and word segmentation with recognition. Using wavelet features, it requires only approximate lo...
Allen R. Hanson, Erik G. Learned-Miller, Jerod J. ...
While the public has increasingly access to medical information, specialized medical language is often difficult for lay people to understand and there is a need to bridge the gap ...