
ACL
1994

An Automatic Treebank Conversion Algorithm for Corpus Sharing

An automatic treebank conversion method is proposed in this paper to convert one treebank into another. A new treebank associated with a different grammar can be generated automatically from the old one, so that the information in the original treebank is carried over to the new one and can be shared among different research communities. The simple algorithm achieves a conversion accuracy of 96.4% when tested on 8,867 sentences between two major grammar revisions of a large MT system.

Motivation

Corpus-based research is now a major branch of language processing. One major resource for corpus-based research is the treebanks available in many research organizations [Marcus et al. 1993], which carry skeletal syntactic structures or 'brackets' that have been manually verified. Unfortunately, such resources may be based on the different tag sets and grammar systems of the respective research organizations. As a result, reusability of such resources across research laboratories i...
Jong-Nae Wang, Jing-Shin Chang, Keh-Yih Su
Type Conference
Year 1994
Where ACL