The state-of-the-art system combination method for machine translation (MT) is the word-based combination using confusion networks. One of the crucial steps in confusion network d...
Most work on unsupervised entailment rule acquisition has focused on rules between templates with two variables, ignoring unary rules - entailment rules between templates with a singl...
We present a sub-sentential alignment system that links linguistically motivated phrases in parallel texts based on lexical correspondences and syntactic similarity. We compare th...
This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (...
In this paper we present a taxonomy of dialogue moves which describe the actions that students and tutors perform in tutorial dialogue. We are motivated by the need for a categori...
This paper presents a probabilistic model for resolution of non-pronominal anaphora in biomedical texts. The model seeks to find the antecedents of anaphoric expressions, both cor...
Psycholinguistic studies suggest a model of human language processing that 1) performs incremental interpretation of spoken utterances or written text, 2) preserves ambiguity by m...
William Schuler, Samir AbdelRahman, Tim Miller, La...
This paper proposes an approach using large scale case structures, which are automatically constructed from both a small tagged corpus and a large raw corpus, to improve Chinese d...
This paper presents recent advances in an established treebank annotation framework comprising an abstract XML-based data format, a fully customizable editor of tree-based annotat...
We address corpus-building situations where completely annotating the whole corpus is time-consuming and unrealistic. Thus, annotation is done only on crucial parts of sentences...
Yuta Tsuboi, Hisashi Kashima, Shinsuke Mori, Hirok...