In this paper, we evaluate how the text-independent writer identification methods that we developed and tested on Western script perform on Arabic handwriting in recent yea...
This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a real-time version of a hybrid connectionist/HMM large...
We present a new method for discovering a segmental discourse structure of a document while categorizing each segment's function and importance. Segments are determined by a ...
In this paper, we present the AutoCat system for product classification. AutoCat uses a vector space model, modified to consider product attributes unavailable in traditional docu...
In this paper, we study the characteristics relevant to analyzing subjective content in documents. To that end, we present and evaluate a novel method based on abstraction...