Separating machine printed text and handwriting from overlapping text is a challenging problem in the document analysis field and no reliable algorithms have been developed thus f...
This paper explores the use of the homotopy method for training a semi-supervised Hidden Markov Model (HMM) used for sequence labeling. We provide a novel polynomial-time algorith...
We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the...
Krysta Marie Svore, Lucy Vanderwende, Christopher ...
We study dimensionality reduction or feature selection in text document categorization problem. We focus on the first step in building text categorization systems, that is the cho...
This paper presents the evaluation of the dictionary look-up component of Mayo Clinic's Information Extraction system. The component was tested on a corpus of 160 free-text c...
Karin Schuler, Vinod Kaggal, James J. Masanz, Phil...