We report in this paper the observation of one tokenization per source. That is, the same critical fragment in different sentences from the same source almost always realizes one a...
This paper discusses the treatment of fixed word expressions developed for our ITS-2 French-English translation system. This treatment makes a clear distinction between compounds -...
A method of determining the similarity of nouns on the basis of a metric derived from the distribution of subject, verb and object in a large text corpus is described. The resulti...
We describe a new framework for dependency grammar, with a modular decomposition of immediate dependency and linear precedence. Our approach distinguishes two orthogonal yet mutua...
Traditional vector-based models use word co-occurrence counts from large corpora to represent lexical meaning. In this paper we present a novel approach for constructing semantic ...