Mathematical texts can be computerized in many ways that capture differing amounts of the mathematical meaning. At one end, there is document imaging, which captures the arrangeme...
We describe a novel method for using Genetic Programming to create compact classification rules based on combinations of N-Grams (character strings). Genetic programs acquire fitne...
As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
A major obstacle that decreases the performance of text classifiers is the extremely high dimensionality of text data. To reduce the dimension, a number of approaches based on rou...
We present a class of richly structured, undirected hidden variable models suitable for simultaneously modeling text along with other attributes encoded in different modalities. O...