We demonstrate that the bidirectionality of deep grammars, allowing them to generate as well as parse sentences, can be used to automatically and effectively identify errors in th...
A collection of 3208 reported errors of Chinese words were analyzed. Among which, 7.2% involved rarely used character, and 98.4% were assigned common classifications of their caus...
We introduce the novel task of determining whether a newswire article is "true" or satirical. We experiment with SVMs, feature scaling, and a number of lexical and seman...
Combining word alignments trained in two translation directions has mostly relied on heuristics that are not directly motivated by intended applications. We propose a novel method...
We study the global topology of the syntactic and semantic distributional similarity networks for English through the technique of spectral analysis. We observe that while the syn...
Chris Biemann, Monojit Choudhury, Animesh Mukherje...
The GIVE Challenge is a recent shared task in which NLG systems are evaluated over the Internet. In this paper, we validate this novel NLG evaluation methodology by comparing the ...
Alexander Koller, Kristina Striegnitz, Donna Byron...
Efficient processing of tera-scale text data is an important research topic. This paper proposes lossless compression of Ngram language models based on LOUDS, a succinct data stru...
The abundance of homophones in Chinese significantly increases the number of similarly acceptable candidates in English-to-Chinese transliteration (E2C). The dialectal factor also...
Often, Statistical Machine Translation (SMT) between English and Korean suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessar...