This paper presents a new approach to the bitext correspondence problem (BCP) for noisy bilingual corpora, based on image processing (IP) techniques. By using one of several ways of est...
A new account of parameter setting during grammatical acquisition is presented in terms of Generalized Categorial Grammar embedded in a default inheritance hierarchy, providing a ...
In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragme...
This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-...
Strong word associations exist in natural language. Based on mutual information, this paper proposes a new MI-Trigger-based modeling approach to capture the preferred relati...
Machine Translation (MT) need not be confined to inter-language activities. In this paper, we discuss inter-dialect MT in general and Cantonese-Mandarin MT in particular. Mandarin...
In this paper, we present a chunk-based partial parsing system for spontaneous, conversational speech in unrestricted domains. We show that the chunk parses produced by this parsi...
This paper presents a system which automatically generates shallow semantic frame structures for conversational speech in unrestricted domains. We argue that such shallow semantic...