We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
We present a perceptron-style discriminative approach to machine translation in which large feature sets can be exploited. Unlike discriminative reranking approaches, our system c...
Biomedical literature is an important source of information for chemical compounds. However, different representations and nomenclatures for chemical entities exist, which makes th...
Tiago Grego, Piotr Pezik, Francisco M. Couto, Diet...
XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining ...
End-user interactive concept learning is a technique for interacting with large unstructured datasets, requiring insights from both human-computer interaction and machine learning...
Saleema Amershi, James Fogarty, Ashish Kapoor, Des...