A method is presented for segmenting documents into conceptually related areas. Determining the equivalence of text is often based on the number of word repetitions. This approach...
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...
Currently there are several approaches to machine translation (MT) based on different paradigms; e.g., phrasal, hierarchical and syntax-based. These three approaches yield similar...
Antti-Veikko I. Rosti, Necip Fazil Ayan, Bing Xian...
Current system combination methods usually use confusion networks to find consensus translations among different systems. Requiring one-to-one mappings between the words in candid...
Yang Feng, Yang Liu, Haitao Mi, Qun Liu, Yajuan L...
We address the task of unsupervised topic segmentation of speech data operating over raw acoustic information. In contrast to existing algorithms for topic segmentation of speech,...
Igor Malioutov, Alex Park, Regina Barzilay, James ...