We present an adaptive technique that enables users to produce a high-quality dictionary parsed into its lexicographic components (headwords, pronunciations, parts of speech, translations, etc.) using an extremely small amount of user-provided training data. We use transformation-based learning (TBL) as a postprocessor at two points in our system to improve performance. Results on two dictionaries show that tagging accuracy increases from 83% and 91% to 93% and 94%, respectively, for individual words or "tokens", and from 64% and 83% to 90% and 93%, respectively, for contiguous "phrases" such as definitions or examples of usage.
Burcu Karagol-Ayan, David S. Doermann, Amy Weinber