We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting...
Slav Petrov, Leon Barrett, Romain Thibaux, Dan Kle...
Unknown words are a hindrance to the performance of hand-crafted computational grammars of natural language. However, words with incomplete and incorrect lexical entries pose an e...
Kanazawa has shown that several non-trivial classes of categorial grammars are learnable in Gold’s model. We propose in this article to adapt this kind of symbolic lear...
This paper describes the winning entry to the Omphalos context-free grammar learning competition. Our approach integrates an information-theoretic constituent likelihood measure to...
The use of XML has become pervasive. It is used in a range of data storage and data exchange applications. In many cases such XML data is captured from users via forms or transform...