We address the problem of extracting bilingual chunk pairs from parallel text to create training sets for statistical machine translation. We formulate the problem in terms of a s...
Geometric layout analysis plays an important role in document image understanding. Many algorithms known in the literature work well on standard document images, achieving high text l...
Faisal Shafait, Joost van Beusekom, Daniel Keysers...
Phrasal segmentation models define a mapping from the words of a sentence to sequences of translatable phrases. We discuss the estimation of these models from large quantities of ...
Words in Chinese text are not naturally separated by delimiters, which poses a challenge to standard machine translation (MT) systems. In MT, the widely used approach is to apply ...
Jia Xu, Jianfeng Gao, Kristina Toutanova, Hermann ...
Identifying discourse relations in a text is essential for various tasks in Natural Language Processing, such as automatic text summarization, question-answering, and dialogue gene...
Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuk...