For privacy reasons, sensitive content may be revised before it is released. The revision often consists of redaction, that is, the “blacking out” of sensitive words and phras...
We present a quantitative evaluation of one well-known word alignment algorithm, as well as an analysis of frequent errors in terms of this model's underlying assumptions. De...
Today's digital libraries increasingly include not only printed text but also scanned handwritten pages and other multimedia material. There are, however, few tools available...
We present here a method for automatically projecting structural information across translations, including canonical citation structure (such as chapters and sections), speaker i...
Statistical Machine Translation (MT) systems have achieved impressive results in recent years, due in large part to the increasing availability of parallel text for system trainin...
Zhiyi Song, Stephanie Strassel, Gary Krug, Kazuaki...