A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion


A finite-state method, based on leftmost longest-match replacement, is presented for segmenting words into graphemes and for converting graphemes into phonemes. A small set of hand-crafted conversion rules for Dutch achieves a phoneme accuracy of over 93%. The accuracy of the system is further improved by using transformation-based learning. The phoneme accuracy of the best system (using a large set of rule templates and a 'lazy' variant of Brill's algorithm), trained on only 40K words, reaches 99%.
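
To make the segmentation step concrete, here is a minimal sketch of leftmost longest-match segmentation of a word into graphemes. It is a plain procedural analogue written for illustration, not the finite-state transducer implementation the paper describes. The grapheme inventory `MULTI_LETTER_GRAPHEMES` and the function name `segment` are assumptions made up for this example and are not taken from the paper.

```python
# Hypothetical inventory of multi-letter Dutch graphemes (illustrative only;
# this is NOT the inventory used in the paper).
MULTI_LETTER_GRAPHEMES = {
    "sch", "ch", "ng", "ij", "ei", "oe", "eu", "ui",
    "aa", "ee", "oo", "uu", "ie", "ou", "au",
}
MAX_LEN = max(len(g) for g in MULTI_LETTER_GRAPHEMES)


def segment(word):
    """Split a word into graphemes by leftmost longest match.

    The word is scanned from left to right. At each position the longest
    inventory entry that matches is taken; if none matches, the single
    letter at that position becomes a grapheme on its own.
    """
    graphemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, down to length 2.
        for length in range(min(MAX_LEN, len(word) - i), 1, -1):
            candidate = word[i:i + length]
            if candidate in MULTI_LETTER_GRAPHEMES:
                graphemes.append(candidate)
                i += length
                break
        else:
            # No multi-letter grapheme starts here: emit one letter.
            graphemes.append(word[i])
            i += 1
    return graphemes


print(segment("school"))  # ['sch', 'oo', 'l']
print(segment("lang"))    # ['l', 'a', 'ng']
```

Trying the longest candidate first is what makes "sch" win over "ch" in "school". A grapheme-to-phoneme step would then map each segment to a phoneme, possibly depending on its neighbours.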

Gosse Bouma


ANLP 2000 | Hand-crafted Conversion Rules | Leftmost Longest-match Replacement | Phoneme Accuracy


Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	2000
Where	ANLP
Authors	Gosse Bouma

