We present an algorithm for simultaneously constructing both the syntax and semantics of a sentence using a Lexicalized Tree Adjoining Grammar (LTAG). This approach captures natur...
String transformation systems were introduced in (Brill, 1995) and have several applications in natural language processing. In this work we consider the computational proble...
In this paper, we describe a method for automatically retrieving collocations from large text corpora. This method retrieves collocations in the following stages: 1) extracting str...
Concerning different approaches to automatic PoS tagging: EngCG-2, a constraint-based morphological tagger, is compared in a double-blind test with a state-of-the-art statistical t...
We describe a simple variant of the interpolated Markov model with nonemitting state transitions and prove that it is strictly more powerful than any Markov model. More importantl...
This paper presents a method to combine a set of unsupervised algorithms that can accurately disambiguate word senses in a large, completely untagged corpus. Although most of the ...
Several recent efforts in statistical natural language understanding (NLU) have focused on generating clumps of English words from semantic meaning concepts (Miller et al., 1995; ...
Stephen Della Pietra, Mark Epstein, Salim Roukos, ...
This paper presents a trainable rule-based algorithm for performing word segmentation. The algorithm provides a simple, language-independent alternative to large-scale lexical-bas...