In order to extract rigid expressions with a high frequency of use, a new algorithm that can efficiently extract both uninterrupted and interrupted collocations from very large corpora ha...
The paper describes problems in disambiguating the morphological analysis of Bantu languages by using Swahili as a test language. The main factors of ambiguity in this language gr...
This paper proposes a segmentation standard for Chinese natural language processing. The standard is proposed to achieve linguistic felicity, computational feasibility, and data u...
Since treebanks have become available to researchers, a wide variety of techniques has been used to build broad-coverage parsing systems. This makes quantitative evaluation very imp...
In this paper, I discuss machine translation of English text into a relatively "free" word order language, specifically Turkish. I present algorithms that use contextual...
This paper aims to analyze word dependency structure in compound nouns appearing in Japanese newspaper articles. The analysis is a difficult problem because such compound n...