Developing automated agents that intelligently perform complex real world tasks is time consuming and expensive. The most expensive part of developing these intelligent task perfo...
We describe an approach to simultaneous tokenization and part-of-speech tagging that is based on separating the closed and open-class items, and focusing on the likelihood of the ...
Interaction with large unstructured datasets is difficult because existing approaches, such as keyword search, are not always suited to describing concepts corresponding to the di...
Saleema Amershi, James Fogarty, Ashish Kapoor, Des...
We propose a method for the classification of matrices. We use a linear classifier with a novel regularization scheme based on the spectral 1-norm of its coefficient matrix. The s...
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...