Most current sentence alignment approaches adopt sentence length and cognates as the alignment features, and they are mostly trained and tested on documents of the same style...
We propose a formal characterization of variation in the syntactic realization of semantic arguments, using hierarchies of syntactic relations and thematic roles, and a mechanism ...
I present a novel approach to the determination of recurrent sound correspondences in bilingual wordlists. The idea is to relate correspondences between sounds in wordlists to tra...
Automatic text categorization is the problem of assigning text documents to predefined categories. To classify text documents, we must extract good features f...
We describe a parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can...
This paper describes a dialog-based QA system, Dialog Navigator, which can answer questions based on a large text knowledge base. In real-world QA systems, vagueness of questions is...
We describe a language-independent, flexible, and accurate method for the detection of abbreviations in text corpora. It is based on the idea that an abbreviation can be viewed as...
This paper proposes an unsupervised learning model for classifying named entities. This model uses a training set, built automatically by means of a small-scale named entity dicti...
We present a comparative evaluation of two data-driven models used for translation selection in English-Korean machine translation. Latent semantic analysis (LSA) and probabilistic ...
We present a novel disambiguation method for unification-based grammars (UBGs). In contrast to other methods, our approach obviates the need for probability models on the UBG side...