Spreadsheets applications allow data to be stored with low development overheads, but also with low data quality. Reporting on data from such sources is difficult using traditiona...
Using a lexicon can often improve character recognition under challenging conditions, such as poor image quality or unusual fonts. We propose a flexible probabilistic model for c...
Jerod J. Weinman, Erik G. Learned-Miller, Allen R....
In adding syntax to statistical MT, there is a tradeoff between taking advantage of linguistic analysis, versus allowing the model to exploit linguistically unmotivated mappings l...
Annotated corpora are valuable resources for NLP which are often costly to create. We introduce a method for transferring annotation from a morphologically annotated corpus of a so...
This paper discusses the problem of marrying structural similarity with semantic relatedness for Information Extraction from text. Aiming at accurate recognition of relations, we ...
Sophia Katrenko, Pieter W. Adriaans, Maarten van S...